Skip to content

fix: OAuth token refresh, infinite recursion, and pre-signed URL auth in OneDrive client#7

Merged
soustruh merged 10 commits intomainfrom
devin/1777018079-improve-error-logging
Apr 30, 2026
Merged

fix: OAuth token refresh, infinite recursion, and pre-signed URL auth in OneDrive client#7
soustruh merged 10 commits intomainfrom
devin/1777018079-improve-error-logging

Conversation

@yustme
Copy link
Copy Markdown
Contributor

@yustme yustme commented Apr 24, 2026

Summary

Fixes three distinct OAuth/auth bugs in the OneDrive client that surfaced together on a long-running job: the access token expired mid-run, every subsequent request 401'd, and the retry logic broke in three different ways depending on the call site.

1. Tenant-specific auth URL for token refresh

Token refresh in _get_request_tokens() used the hardcoded /common endpoint, which single-tenant Enterprise Apps reject with AADSTS50194. Now uses self.auth_url, set per client type:

  • Personal OneDrive → https://login.microsoftonline.com/common
  • OneDrive for Business / SharePoint → https://login.microsoftonline.com/{tenant_id}

(self.authorityself.auth_url rename for consistency with self.base_url.)

2. Infinite recursion on HTTP 401

When the access token expired mid-run, get_request() recursed: 401 → refresh token → retry via recursion → 401 → recurse, until RecursionError. The @backoff decorator on _download_file_from_onedrive_url then retried the whole thing 5 times. Replaced with an iterative 2-attempt loop that refreshes once and surfaces the Microsoft error on the second 401. Added OneDriveTransientException so @backoff only retries genuine transient errors (429/5xx), not auth failures.

3. Bearer header on pre-signed download URLs

@microsoft.graph.downloadUrl is a pre-signed URL with its own tempauth JWT in the query string — it does not need (and on personal OneDrive's microsoftpersonalcontent.com CDN, actively rejects) the Graph API Bearer header. The old code routed downloads through self.get_request(), which always attaches the Bearer token. Now uses self.get_raw(url, is_absolute_path=True, stream=True, ignore_auth=True) — keeps the HttpClient retry session, drops the Bearer header. Business/SharePoint accounts tolerated the extra header which is why telemetry only showed personal accounts failing.

Other improvements

  • Microsoft error code/description now propagated through exception messages (previously lost; never reached Datadog).
  • Token-refresh / per-file / site-resolution logs downgraded from info to debug to cut noise.
  • Per-folder traversal log: Found N items (X files, Y subfolders[, Z unknown]) in '/path' — previously zero visibility into traversal scale.
  • URLs in error logs sanitized to path-only (urlparse(url).path) so tempauth JWTs can never leak even if a future caller routes a pre-signed URL through get_request().
  • Fixed GitHub Actions workflow triggers, branch resolution, and image-tag step.

Testing

  • Client's stack — initially with tag 1.0.9-fix-tenant-endpoint for the tenant-URL fix, re-tested successfully with 1777018079-improve-error-logging-26 covering all three fixes.
  • Our testing project (personal OneDrive config) — recursion + pre-signed URL fixes verified.

Review checklist

  • Tenant-specific token refresh works for single-tenant Enterprise Apps
  • Personal OneDrive (/common endpoint) still works
  • Long runs that cross the access-token expiry no longer recurse / no longer 401 on downloads
  • Log output at info level is readable: client init, folder scan, item counts, download mask

Link to Devin session: https://app.devin.ai/sessions/8b70f85ef2e0408c96eb91f970f5e217
Requested by: @yustme

…sages

Include HTTP status code, error code and error_description from Microsoft
token endpoint response in the OneDriveClientException message so that
error details propagate to Datadog logs via runner-sync-api.

Previously only a generic 'Authentication failed' message was visible,
making it impossible to distinguish between expired refresh tokens,
missing admin consent, conditional access policies, etc.

Also improve error messages in get_site_id_from_url() and
get_document_libraries() to include site_url and response details.

Co-Authored-By: Vojta Tuma <vojta.tuma@keboola.com>
@devin-ai-integration
Copy link
Copy Markdown
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

devin-ai-integration Bot and others added 2 commits April 24, 2026 08:26
The _get_client method was catching OneDriveClientException silently
and raising a new UserException with a hardcoded generic message,
discarding the improved error details from _get_request_tokens().

Now preserves the last OneDriveClientException message and passes it
through to the UserException so Microsoft error codes are visible.

Co-Authored-By: Vojta Tuma <vojta.tuma@keboola.com>
Single-tenant Enterprise Apps (like CSAS KeboolaOneWriter-EGPROD) reject
the /common endpoint with AADSTS50194. Use self.authority which is
already correctly set to the tenant-specific URL for SharePoint and
OneDrive for Business clients.

Co-Authored-By: Vojta Tuma <vojta.tuma@keboola.com>
@linear
Copy link
Copy Markdown

linear Bot commented Apr 24, 2026

Comment thread src/component.py Outdated
Co-Authored-By: Vojta Tuma <vojta.tuma@keboola.com>
@devin-ai-integration devin-ai-integration Bot force-pushed the devin/1777018079-improve-error-logging branch from 828d1e7 to a86713d Compare April 24, 2026 13:14
… self.base_url

Co-Authored-By: Vojta Tuma <vojta.tuma@keboola.com>
@soustruh soustruh marked this pull request as ready for review April 24, 2026 13:26
soustruh
soustruh previously approved these changes Apr 24, 2026
Copy link
Copy Markdown
Contributor

@soustruh soustruh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks Vojta! ദ്ദി(•ᴗ•)

@devin-ai-integration devin-ai-integration Bot changed the title feat: include Microsoft error details in authentication exception messages fix: use tenant-specific auth URL for token refresh instead of /common endpoint Apr 24, 2026
Co-Authored-By: Vojta Tuma <vojta.tuma@keboola.com>
soustruh
soustruh previously approved these changes Apr 28, 2026
Copy link
Copy Markdown
Contributor

@soustruh soustruh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ദ്ദി(•ᴗ•)

Co-Authored-By: Vojta Tuma <vojta.tuma@keboola.com>
@devin-ai-integration devin-ai-integration Bot force-pushed the devin/1777018079-improve-error-logging branch from a864ec4 to c6d701f Compare April 29, 2026 00:52
@soustruh soustruh force-pushed the devin/1777018079-improve-error-logging branch from d60883b to 79503f6 Compare April 29, 2026 10:58
@soustruh soustruh force-pushed the devin/1777018079-improve-error-logging branch from 79503f6 to c8fa5c0 Compare April 29, 2026 11:37
@soustruh soustruh force-pushed the devin/1777018079-improve-error-logging branch from 3036016 to efc2aaf Compare April 29, 2026 12:13
@soustruh soustruh changed the title fix: use tenant-specific auth URL for token refresh instead of /common endpoint fix: OAuth token refresh, infinite recursion, and pre-signed URL auth in OneDrive client Apr 29, 2026
Copy link
Copy Markdown

@matyas-jirat-keboola matyas-jirat-keboola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do the changes, but they are minor, so approved

tags:
- '*' # Skip the workflow on the main branch without tags
- "*" # Skip the workflow on the main branch without tags
workflow_dispatch:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why workflow_dispatch? Can't we just put there the new CI?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

workflow_dispatch was added per @soustruh's review request — he asked to fix the workflow triggers as an interim step before unifying to the new component-ci base. If you'd prefer to switch to the new CI in this PR instead, I can do that — just let me know which base workflow to use.

@soustruh soustruh merged commit c52b977 into main Apr 30, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants