docs: Use static sitemap and enhance SEO for data professionals #2531

Open
tswast wants to merge 1 commit into main from static-sitemap-and-seo-doc-updates-5578325803453195932

tswast commented Mar 25, 2026

Replaces the dynamic sphinx-sitemap plugin with a static sitemap.xml that only includes key top-level URLs. This improves SEO by directly controlling the URLs exposed to Google Search.

Additionally, key documentation pages and module docstrings now include keywords aimed at data professionals (data scientists, data engineers, and data analysts). They describe specific use cases such as Generative AI, MLOps, scalable data pipelines, and big data manipulation, to reduce the risk of search engines flagging the pages as "thin content".


PR created automatically by Jules for task 5578325803453195932 started by @tswast

Removes the sphinx_sitemap extension and its configuration in docs/conf.py.
Adds a static docs/sitemap.xml with the core URLs requested for Google Search
indexing, and copies it to the root using html_extra_path.

Also enriches index.rst, user_guide/index.rst, reference/index.rst,
bigframes/pandas/__init__.py, bigframes/bigquery/__init__.py, and
bigframes/bigquery/ai.py with targeted keywords emphasizing use-cases for
data scientists, data engineers, and data analysts. This addresses potential
"thin" content concerns by making the pages more informative and relevant.

Co-authored-by: tswast <247555+tswast@users.noreply.github.com>
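
For illustration only, here is a minimal sketch of how a static sitemap like the one described in the commit message could be produced and wired into Sphinx. The `build_sitemap` helper and the `example.com` URLs are hypothetical and are not taken from this PR; `html_extra_path` is the Sphinx option named in the commit message, and the value shown in the comment assumes the file sits next to `conf.py`.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML text listing only the given URLs (hypothetical helper)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


# Placeholder URLs; the real list of top-level pages is not shown in this PR text.
EXAMPLE_URLS = [
    "https://example.com/docs/",
    "https://example.com/docs/user_guide/",
    "https://example.com/docs/reference/",
]

if __name__ == "__main__":
    print(build_sitemap(EXAMPLE_URLS))

# In docs/conf.py, a file placed beside conf.py can then be copied to the
# root of the HTML output (assumed layout):
#
#     html_extra_path = ["sitemap.xml"]
```

Because the file is written once and checked in, the set of URLs exposed to search engines changes only when someone edits it, which is the control the description is after.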
tswast requested review from a team as code owners March 25, 2026 15:08
@google-labs-jules

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

tswast requested a review from TrevorBergeron March 25, 2026 15:08
product-auto-label bot added labels on Mar 25, 2026: "size: m" (Pull request size is medium.) and "api: bigquery" (Issues related to the googleapis/python-bigquery-dataframes API.)

tswast commented Mar 25, 2026

Doctest failures don't appear related to this change. I suspect they are due to a pandas package upgrade:

__ [doctest] third_party.bigframes_vendored.pandas.io.gbq.GBQIOMixin.read_gbq __
[gw0] linux -- Python 3.12.12 /tmpfs/src/github/python-bigquery-dataframes/.nox/doctest/bin/python
080             ...         AS rowindex,
081             ...
082             ...       pitcherFirstName,
083             ...       pitcherLastName,
084             ...       AVG(pitchSpeed) AS averagePitchSpeed
085             ...     FROM `bigquery-public-data.baseball.games_wide`
086             ...     WHERE year = 2016
087             ...     GROUP BY pitcherFirstName, pitcherLastName
088             ... ''', index_col="rowindex")
089             >>> df.head(2)
Differences (unified diff with -expected +actual):
    @@ -1,4 +1,10 @@
    +Query started. Job bigframes-testing:US.5c66e70e-4ac2-4c6c-a6e6-41f26dff2dc8 details: https://console.cloud.google.com/bigquery?project=bigframes-testing&j=bq:US:5c66e70e-4ac2-4c6c-a6e6-41f26dff2dc8&page=queryresults
    +Query finished. 0 Bytes processed. Slot time: a moment. Job bigframes-testing:US.5c66e70e-4ac2-4c6c-a6e6-41f26dff2dc8 details: https://console.cloud.google.com/bigquery?project=bigframes-testing&j=bq:US:5c66e70e-4ac2-4c6c-a6e6-41f26dff2dc8&page=queryresults
    +Load job 577b6aa7-4a00-42b2-9db6-e6a87a5d6d71 is RUNNING. 
    +https://console.cloud.google.com/bigquery?project=577b6aa7-4a00-42b2-9db6-e6a87a5d6d71&j=bq:US:577b6aa7-4a00-42b2-9db6-e6a87a5d6d71&page=queryresults
    +Load job 577b6aa7-4a00-42b2-9db6-e6a87a5d6d71 is DONE. 
    +https://console.cloud.google.com/bigquery?project=577b6aa7-4a00-42b2-9db6-e6a87a5d6d71&j=bq:US:577b6aa7-4a00-42b2-9db6-e6a87a5d6d71&page=queryresults
              pitcherFirstName pitcherLastName  averagePitchSpeed
    -rowindex
    +rowindex                                                    
     1                Albertin         Chapman          96.514113
     2                 Zachary         Britton          94.591039

/tmpfs/src/github/python-bigquery-dataframes/third_party/bigframes_vendored/pandas/io/gbq.py:89: DocTestFailure
____________ [doctest] bigframes.operations.ai.AIAccessor.sim_join _____________
[gw4] linux -- Python 3.12.12 /tmpfs/src/github/python-bigquery-dataframes/.nox/doctest/bin/python
606             >>> bpd.options.experiments.ai_operators = True
607             >>> bpd.options.compute.ai_ops_confirmation_threshold = 25
608 
609             >>> import bigframes.ml.llm as llm
610             >>> model = llm.TextEmbeddingGenerator(model_name="text-embedding-005")
611 
612             >>> df1 = bpd.DataFrame({'animal': ['monkey', 'spider']})
613             >>> df2 = bpd.DataFrame({'animal': ['scorpion', 'baboon']})
614 
615             >>> df1.ai.sim_join(df2, left_on='animal', right_on='animal', model=model, top_k=1)
Differences (unified diff with -expected +actual):
    @@ -1,3 +1,9 @@
    -animal  animal_1
    +Query started. Job bigframes-testing:US.5cf0f2ee-b8b9-467c-aee7-aa35fd82ee1d details: https://console.cloud.google.com/bigquery?project=bigframes-testing&j=bq:US:5cf0f2ee-b8b9-467c-aee7-aa35fd82ee1d&page=queryresults
    +Query finished. 0 Bytes processed. Slot time: a moment. Job bigframes-testing:US.5cf0f2ee-b8b9-467c-aee7-aa35fd82ee1d details: https://console.cloud.google.com/bigquery?project=bigframes-testing&j=bq:US:5cf0f2ee-b8b9-467c-aee7-aa35fd82ee1d&page=queryresults
    +Load job 4485b062-759c-44ce-8e44-ee3413a47556 is RUNNING. 
    +https://console.cloud.google.com/bigquery?project=4485b062-759c-44ce-8e44-ee3413a47556&j=bq:US:4485b062-759c-44ce-8e44-ee3413a47556&page=queryresults
    +Load job 4485b062-759c-44ce-8e44-ee3413a47556 is DONE. 
    +https://console.cloud.google.com/bigquery?project=4485b062-759c-44ce-8e44-ee3413a47556&j=bq:US:4485b062-759c-44ce-8e44-ee3413a47556&page=queryresults
    +   animal  animal_1
     0  monkey    baboon
     1  spider  scorpion

/tmpfs/src/github/python-bigquery-dataframes/bigframes/operations/ai.py:615: DocTestFailure
__________________ [doctest] bigframes.pandas.io.api.read_gbq __________________
[gw2] linux -- Python 3.12.12 /tmpfs/src/github/python-bigquery-dataframes/.nox/doctest/bin/python
302     ...         AS rowindex,
303     ...
304     ...       pitcherFirstName,
305     ...       pitcherLastName,
306     ...       AVG(pitchSpeed) AS averagePitchSpeed
307     ...     FROM `bigquery-public-data.baseball.games_wide`
308     ...     WHERE year = 2016
309     ...     GROUP BY pitcherFirstName, pitcherLastName
310     ... ''', index_col="rowindex")
311     >>> df.head(2)
Differences (unified diff with -expected +actual):
    @@ -1,4 +1,10 @@
    +Query started. Job bigframes-testing:US.67fe8f92-377a-4d91-9b5d-64dfe8f32645 details: https://console.cloud.google.com/bigquery?project=bigframes-testing&j=bq:US:67fe8f92-377a-4d91-9b5d-64dfe8f32645&page=queryresults
    +Query finished. 0 Bytes processed. Slot time: a moment. Job bigframes-testing:US.67fe8f92-377a-4d91-9b5d-64dfe8f32645 details: https://console.cloud.google.com/bigquery?project=bigframes-testing&j=bq:US:67fe8f92-377a-4d91-9b5d-64dfe8f32645&page=queryresults
    +Load job 62ec7ca9-21b3-40f1-a112-99631b44a2a4 is RUNNING. 
    +https://console.cloud.google.com/bigquery?project=62ec7ca9-21b3-40f1-a112-99631b44a2a4&j=bq:US:62ec7ca9-21b3-40f1-a112-99631b44a2a4&page=queryresults
    +Load job 62ec7ca9-21b3-40f1-a112-99631b44a2a4 is DONE. 
    +https://console.cloud.google.com/bigquery?project=62ec7ca9-21b3-40f1-a112-99631b44a2a4&j=bq:US:62ec7ca9-21b3-40f1-a112-99631b44a2a4&page=queryresults
              pitcherFirstName pitcherLastName  averagePitchSpeed
    -rowindex
    +rowindex                                                    
     1                Albertin         Chapman          96.514113
     2                 Zachary         Britton          94.591039

/tmpfs/src/github/python-bigquery-dataframes/bigframes/pandas/io/api.py:311: DocTestFailure
_______________ [doctest] bigframes.pandas.io.api.read_gbq_query _______________
[gw2] linux -- Python 3.12.12 /tmpfs/src/github/python-bigquery-dataframes/.nox/doctest/bin/python
EXAMPLE LOCATION UNKNOWN, not showing all tests of that example
??? >>> df.head(2)
Differences (unified diff with -expected +actual):
    @@ -1,4 +1,10 @@
    +Query started. Job bigframes-testing:US.28acbd24-e987-46ce-9865-fc977f417989 details: https://console.cloud.google.com/bigquery?project=bigframes-testing&j=bq:US:28acbd24-e987-46ce-9865-fc977f417989&page=queryresults
    +Query finished. 0 Bytes processed. Slot time: a moment. Job bigframes-testing:US.28acbd24-e987-46ce-9865-fc977f417989 details: https://console.cloud.google.com/bigquery?project=bigframes-testing&j=bq:US:28acbd24-e987-46ce-9865-fc977f417989&page=queryresults
    +Load job b80c3185-0a70-47df-940c-8de0e5e81075 is RUNNING. 
    +https://console.cloud.google.com/bigquery?project=b80c3185-0a70-47df-940c-8de0e5e81075&j=bq:US:b80c3185-0a70-47df-940c-8de0e5e81075&page=queryresults
    +Load job b80c3185-0a70-47df-940c-8de0e5e81075 is DONE. 
    +https://console.cloud.google.com/bigquery?project=b80c3185-0a70-47df-940c-8de0e5e81075&j=bq:US:b80c3185-0a70-47df-940c-8de0e5e81075&page=queryresults
              pitcherFirstName pitcherLastName  averagePitchSpeed
    -rowindex
    +rowindex                                                    
     1                Albertin         Chapman          96.514113
     2                 Zachary         Britton          94.591039

/tmpfs/src/github/python-bigquery-dataframes/bigframes/pandas/io/api.py:None: DocTestFailure
______________ [doctest] bigframes.session.Session.read_gbq_query ______________
[gw2] linux -- Python 3.12.12 /tmpfs/src/github/python-bigquery-dataframes/.nox/doctest/bin/python
658             ...         AS rowindex,
659             ...
660             ...       pitcherFirstName,
661             ...       pitcherLastName,
662             ...       AVG(pitchSpeed) AS averagePitchSpeed
663             ...     FROM `bigquery-public-data.baseball.games_wide`
664             ...     WHERE year = 2016
665             ...     GROUP BY pitcherFirstName, pitcherLastName
666             ... ''', index_col="rowindex")
667             >>> df.head(2)
Differences (unified diff with -expected +actual):
    @@ -1,4 +1,10 @@
    +Query started. Job bigframes-testing:US.56898eab-ecd0-48e6-86e4-9f051bae1c4c details: https://console.cloud.google.com/bigquery?project=bigframes-testing&j=bq:US:56898eab-ecd0-48e6-86e4-9f051bae1c4c&page=queryresults
    +Query finished. 0 Bytes processed. Slot time: a moment. Job bigframes-testing:US.56898eab-ecd0-48e6-86e4-9f051bae1c4c details: https://console.cloud.google.com/bigquery?project=bigframes-testing&j=bq:US:56898eab-ecd0-48e6-86e4-9f051bae1c4c&page=queryresults
    +Load job beb92731-f10c-480b-87cb-16d24ea370ad is RUNNING. 
    +https://console.cloud.google.com/bigquery?project=beb92731-f10c-480b-87cb-16d24ea370ad&j=bq:US:beb92731-f10c-480b-87cb-16d24ea370ad&page=queryresults
    +Load job beb92731-f10c-480b-87cb-16d24ea370ad is DONE. 
    +https://console.cloud.google.com/bigquery?project=beb92731-f10c-480b-87cb-16d24ea370ad&j=bq:US:beb92731-f10c-480b-87cb-16d24ea370ad&page=queryresults
              pitcherFirstName pitcherLastName  averagePitchSpeed
    -rowindex
    +rowindex                                                    
     1                Albertin         Chapman          96.514113
     2                 Zachary         Britton          94.591039

/tmpfs/src/github/python-bigquery-dataframes/bigframes/session/__init__.py:667: DocTestFailure
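
The diffs above differ from the expected output in two ways: extra progress lines ("Query started.", "Load job ... is RUNNING.") and trailing spaces after `rowindex`. As a rough sketch, and not a proposed fix for this repository, the standard-library `doctest.OutputChecker` can show how each kind of difference behaves under `NORMALIZE_WHITESPACE`. The `outputs_match` helper is hypothetical, and the sample rows are shortened from the log.

```python
import doctest

HEADER = "          pitcherFirstName pitcherLastName  averagePitchSpeed"
ROW = "1                Albertin         Chapman          96.514113"

# Expected output as written in the docstring: no trailing spaces.
WANT = "\n".join([HEADER, "rowindex", ROW]) + "\n"

# Actual output: the index-name row is padded to the header width.
GOT = "\n".join([HEADER, "rowindex".ljust(len(HEADER)), ROW]) + "\n"

# Actual output with an extra progress line in front, as in the log.
GOT_WITH_PROGRESS = "Query started.\n" + GOT


def outputs_match(want, got, optionflags=0):
    """Compare outputs the way doctest does (hypothetical helper)."""
    return doctest.OutputChecker().check_output(want, got, optionflags)


if __name__ == "__main__":
    flag = doctest.NORMALIZE_WHITESPACE
    print("strict, padded row:      ", outputs_match(WANT, GOT))
    print("normalized, padded row:  ", outputs_match(WANT, GOT, flag))
    print("normalized, progress line:", outputs_match(WANT, GOT_WITH_PROGRESS, flag))
```

Under these assumptions, whitespace normalization would tolerate the padded `rowindex` row but not the added progress lines, so the two differences likely need separate handling.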

