Prevent indexing of non-stable docs versions and publish stable-only crawl policy #14800

Open
Copilot wants to merge 4 commits into master from copilot/add-meta-robots-noindex

Conversation

Contributor

Copilot AI commented May 7, 2026

☑️ Resolves

This came out of a discussion in another ticket, where users were seeing docs for old versions among the top results of some search engines. Google usually picks the right version, but not always.
I think we should prevent indexing of old docs and let only stable through?

Search engines are indexing multiple version paths (/latest, /stable, /N/), creating duplicate/outdated results when canonical tags are ignored. This change makes non-stable pages explicitly non-indexable and adds a root crawl policy that keeps only server/stable crawlable.

  • Shared HTML meta policy

    • Added _shared_assets/templates/layout.html to inject:
    <meta name="robots" content="noindex, follow" />

    on generated Sphinx pages across manuals.

  • Deploy-time stable exception + robots policy

    • In .github/workflows/sphinxbuild.yml (deploy job), after staged artifacts are applied:
      • strip noindex from server/stable/**/*.html (stable remains indexable),
      • write root robots.txt:
          User-agent: *
          Allow: /server/stable/
          Disallow: /server/

  • Link-check behavior

    • Keep the existing canonical-link stripping before lychee runs.
    • Removed the additional noindex stripping in link-check prep because it is unnecessary for lychee.
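
The deploy-time stable exception described above can be sketched in Python. This is only an illustration of the technique, not the workflow's actual step (the workflow reviewed below uses perl inside a shell step). The function name `strip_noindex` and the directory passed to it are assumptions made for the example.

```python
import re
from pathlib import Path

# The exact tag injected by the shared layout template,
# optionally followed by a newline.
NOINDEX_TAG = re.compile(r'<meta name="robots" content="noindex, follow" />\n?')


def strip_noindex(root: Path) -> int:
    """Remove the noindex meta tag from every .html file under root.

    Hypothetical helper: returns the number of files that were modified.
    """
    changed = 0
    for html_file in sorted(root.rglob("*.html")):
        text = html_file.read_text(encoding="utf-8")
        cleaned = NOINDEX_TAG.sub("", text)
        if cleaned != text:
            html_file.write_text(cleaned, encoding="utf-8")
            changed += 1
    return changed
```

Calling `strip_noindex(Path("server/stable"))` on the deployed tree would leave every other version directory untouched, so only stable pages lose the tag.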

🖼️ Screenshots

N/A (workflow + template-only changes; no rendered page layout/content changes)

✅ Checklist

  • I have built the documentation locally and reviewed the output
  • Screenshots are included for visual changes
  • I have not moved or renamed pages (or added a redirect if I did)
  • I have run codespell or similar and addressed any spelling issues

github-actions Bot added the github_actions label (Pull requests that update GitHub Actions code) May 7, 2026
Copilot AI changed the title from "[WIP] Add meta robots tag to suppress indexing of non-stable builds" to "Prevent indexing of non-stable docs versions and publish crawl policy for stable-only SEO" May 7, 2026
Copilot AI requested a review from skjnldsv May 7, 2026 14:39
Comment thread .github/workflows/sphinxbuild.yml Outdated
# Only the stable version should be indexed
Allow: /server/stable/
Disallow: /server/
Sitemap: https://docs.nextcloud.com/sitemap.xml
Member

@copilot we do not have a sitemap

Contributor Author

Updated in 92b0367: I removed the Sitemap: line from the deploy-generated robots.txt in .github/workflows/sphinxbuild.yml.
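
The resulting policy (without the Sitemap line) can be sanity-checked with Python's standard `urllib.robotparser`. This is a sketch, not part of the PR: the helper `is_crawlable` is hypothetical, and the example paths are illustrative. Note that Python's parser applies rules in file order, whereas RFC 9309 crawlers pick the longest matching rule; for these two rules both approaches give the same answer.

```python
from urllib.robotparser import RobotFileParser

# The policy from the PR description, without the Sitemap line.
ROBOTS_TXT = """\
User-agent: *
Allow: /server/stable/
Disallow: /server/
"""


def is_crawlable(path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Hypothetical helper: report whether a generic crawler may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", "https://docs.nextcloud.com" + path)


if __name__ == "__main__":
    # Illustrative paths only; real page names may differ.
    for path in ("/server/stable/index.html", "/server/latest/index.html"):
        print(path, is_crawlable(path))
```

Under this policy a path below /server/stable/ is reported as crawlable, while other paths below /server/ are not.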

Copilot AI changed the title from "Prevent indexing of non-stable docs versions and publish crawl policy for stable-only SEO" to "Prevent indexing of non-stable docs versions and publish stable-only crawl policy" May 7, 2026
Copilot AI requested a review from skjnldsv May 7, 2026 16:33
Copilot stopped work on behalf of skjnldsv due to an error May 7, 2026 16:50
Comment thread .github/workflows/sphinxbuild.yml Outdated
run: |
find "stage/${{ needs.stage-and-check.outputs.branch_name }}" -name '*.html' -print0 | while IFS= read -r -d '' f; do
perl -0pi -e 's{^\s*<link rel="canonical" href="https://docs\.nextcloud\.com/server/[^"]*" />\n}{}m' "$f"
perl -pi -e 's{<meta name="robots" content="noindex, follow" />\n?}{}g' "$f"
Member

@copilot explain why this is needed

Contributor Author

Good catch: the noindex removal in the link-check step was not needed for lychee, which only checks links; the meta tag does not affect link extraction. I removed it in 4b73b25 and kept only the canonical stripping there. (No UI changes; workflow-only.)
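
To illustrate why the robots meta tag is irrelevant to link checking, here is a minimal Python sketch of link extraction. It is not how lychee is implemented (lychee is a separate tool); the class `LinkCollector` and the function `extract_links` are hypothetical names used only for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags; all other tags are ignored."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html: str) -> list[str]:
    """Return the link targets found in an HTML document, in order."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links
```

Feeding the same page with and without `<meta name="robots" content="noindex, follow" />` yields the same list of links, which is why stripping the tag before the link check adds nothing.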

Contributor

github-actions Bot commented May 7, 2026

📖 Documentation Preview

🔍 Open preview →

No RST documentation pages changed in this PR.

Last updated: Thu, 07 May 2026 17:24:10 GMT

Copilot AI requested a review from skjnldsv May 7, 2026 17:09
@skjnldsv skjnldsv marked this pull request as ready for review May 7, 2026 17:11

Labels

github_actions Pull requests that update GitHub Actions code

2 participants