
feat: Google Analytics 4 importer #509

Merged

Blaumaus merged 16 commits into main from feat/ga4-importer on Apr 13, 2026

Conversation

@Blaumaus (Member) commented Apr 8, 2026

Changes

If applicable, please describe what changes were made in this pull request.

Community Edition support

  • Your feature is implemented for the Swetrix Community Edition
  • This PR only updates the Cloud (Enterprise) Edition code (e.g. Paddle webhooks, blog, payouts, etc.)

Database migrations

  • Clickhouse / MySQL migrations added for this PR
  • No table schemas changed in this PR

Documentation

  • You have updated the documentation according to your PR
  • This PR did not change any publicly documented endpoints

Summary by CodeRabbit

  • New Features

    • Added Google Analytics 4 (GA4) import: OAuth connect, property selection, and background GA4 imports with import-history monitoring and success/error toasts; provider appears in the import UI with its icon.
  • Documentation

    • Added “Importing from Google Analytics 4” docs covering workflow, supported dimensions, exclusions, limitations, and processing details.
  • Localization

    • Added GA4 UI strings for connect flow, property selection, and status messages.

@Blaumaus Blaumaus self-assigned this Apr 8, 2026
coderabbitai bot commented Apr 8, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Adds GA4 import end-to-end: backend OAuth/token management and GA4 Data API mapper, controller endpoints and job enqueueing; updates processor, mapper interfaces, key derivation, env/package; frontend OAuth callback route, proxy, UI flows, icon, routes, and localization.

Changes

  • Configuration & Dependency (backend/.env.example, backend/package.json): Add GOOGLE_GA4_CLIENT_ID and GOOGLE_GA4_CLIENT_SECRET to the env example; add the @google-analytics/data dependency.
  • Key Derivation (backend/apps/cloud/src/common/utils.ts): Extend the deriveKey purpose union to include 'ga4-token'.
  • Mapper Interfaces & Registry (backend/apps/cloud/src/data-import/mappers/mapper.interface.ts, backend/apps/cloud/src/data-import/mappers/index.ts): Allow expectedFileExtension and filePath to be null, add an optional context param; register 'google-analytics' and Ga4Mapper.
  • Module & Processor (backend/apps/cloud/src/data-import/data-import.module.ts, backend/apps/cloud/src/data-import/data-import.processor.ts): Register Ga4ImportService; extend DataImportJobData with GA4 fields; treat google-analytics as API-based (no filePath), pass context to mappers, and avoid deleting the temp file for API imports.
  • Controller & DTOs (backend/apps/cloud/src/data-import/data-import.controller.ts, backend/apps/cloud/src/data-import/dto/*): Inject Ga4ImportService; add endpoints for configured/connect/callback/properties/start; add StartGa4ImportDto and re-export it; tighten upload validation for non-file-based providers.
  • GA4 Service (backend/apps/cloud/src/data-import/ga4-import.service.ts): New Ga4ImportService that builds the OAuth client, generates the auth URL and state, stores state in Redis, handles the callback, requires a refresh token, encrypts the token, lists properties via the GA Admin API, and exposes token consumption for imports.
  • GA4 Mapper (backend/apps/cloud/src/data-import/mappers/ga4.mapper.ts): New Ga4Mapper that runs paginated GA4 Data API queries for pageviews and events, builds a source/UTM lookup, normalizes and truncates fields, derives session IDs, and yields analytics and custom-event rows.
  • Frontend: Server & Proxy (web/app/api/api.server.ts, web/app/hooks/useAuthProxy.ts, web/app/routes/api.auth.ts): Add a processGA4ImportTokenServer helper; extend useAuthProxy with processGA4ImportToken; add proxy action handling for processGA4ImportToken.
  • Frontend: Routing & Callback (web/app/routes/ga4-import-connected.tsx, web/app/utils/routes.ts): Add a /ga4-import-connected route that handles server-side token processing with a client-side hash fallback and redirects to project settings; register the route constant.
  • Frontend: Project Settings & UI (web/app/pages/Project/Settings/components/DataImportTab.tsx, web/app/routes/projects.settings.$id.tsx, web/app/lib/models/Project.ts): Add the google-analytics provider and icon; implement connect/properties/start flows, URL param handling, and new action intents; extend the action data.
  • Frontend: Icons, Loader, Locales (web/app/ui/icons/GoogleAnalytics.tsx, web/app/ui/Loader.tsx, web/public/locales/en.json): Add the GA4 SVG icon; make Loader accept an optional className; add GA4 localization strings.
  • Docs (docs/content/docs/data-import.mdx): Document the GA4 import flow, supported dimensions, exclusions, and limitations.
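The Key Derivation change extends deriveKey with a 'ga4-token' purpose so derived keys are scoped per use case. The actual implementation in backend/apps/cloud/src/common/utils.ts is not shown in this PR summary; the HKDF-based sketch below is only an illustrative assumption of the pattern, not Swetrix's real code:

```typescript
import { hkdfSync } from 'node:crypto'

// Illustrative purpose union; the real one in utils.ts has other members.
type KeyPurpose = 'captcha' | 'ga4-token'

// Bind each derived key to a purpose string, so a key derived for
// GA4 refresh-token encryption can never decrypt other ciphertexts.
export function deriveKey(masterSecret: string, purpose: KeyPurpose): Buffer {
  // HKDF-SHA256 with the purpose as the "info" parameter; 32 bytes suits AES-256.
  return Buffer.from(hkdfSync('sha256', masterSecret, Buffer.alloc(0), purpose, 32))
}
```

Deriving with different purposes from the same master secret yields unrelated keys, which is what makes adding a dedicated 'ga4-token' member to the union meaningful.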

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Frontend
    participant Backend
    participant Ga4Service
    participant Redis
    participant GoogleOAuth as Google OAuth
    participant GAAdmin as GA4 Admin API
    participant Queue

    User->>Frontend: Click "Connect GA4"
    Frontend->>Backend: POST /data-import/:projectId/ga4/connect
    Backend->>Ga4Service: generateConnectURL(uid, pid)
    Ga4Service->>Redis: store(state -> {uid,pid})
    Ga4Service-->>Backend: return auth URL
    Backend-->>Frontend: return ga4AuthUrl
    Frontend->>GoogleOAuth: Redirect user to auth URL
    GoogleOAuth->>Frontend: Redirect callback (code, state)
    Frontend->>Backend: POST /data-import/ga4/callback (code,state)
    Backend->>Ga4Service: handleOAuthCallback(uid, code, state)
    Ga4Service->>Redis: validate and delete state
    Ga4Service->>GoogleOAuth: exchange code -> tokens
    Ga4Service->>Ga4Service: encrypt refresh_token
    Ga4Service->>Redis: store(encryptedRefreshToken keyed by uid:pid)
    Ga4Service-->>Backend: return pid
    Backend-->>Frontend: return pid
    Frontend->>Backend: GET /data-import/:projectId/ga4/properties
    Backend->>Ga4Service: listProperties(uid, pid)
    Ga4Service->>Redis: get encryptedRefreshToken
    Ga4Service->>GoogleOAuth: refresh access_token
    Ga4Service->>GAAdmin: accountSummaries
    GAAdmin-->>Ga4Service: properties list
    Ga4Service-->>Backend: return properties
    Backend-->>Frontend: return properties
    Frontend->>Backend: POST /data-import/:projectId/ga4/start (propertyId)
    Backend->>Ga4Service: consumeTokenForImport(uid, pid)
    Ga4Service->>Redis: retrieve & delete encrypted token
    Ga4Service-->>Backend: return encrypted token + client creds
    Backend->>Queue: enqueue process-import job (ga4 fields)
    Queue-->>Backend: ack/enqueued
    Backend-->>Frontend: return success

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes


Poem

🐰 I hopped to fetch a GA4 key,
I tucked a token safe beneath a tree,
Streams of pages, events unfurled,
I nibble rows and watch them swirl,
Hop—imports bloom, analytics free!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 11.76%, which is below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Title check (✅ Passed): the title 'feat: Google Analytics 4 importer' clearly and concisely describes the main change: adding a GA4 import feature across both backend and frontend.
  • Description check (✅ Passed): the PR description follows the required template with all sections present and appropriate checkboxes (Cloud-only feature, no schema changes, documentation updated), but lacks a detailed explanation of implementation specifics.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/ga4-importer

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (8)
web/app/pages/Project/Settings/components/DataImportTab.tsx (2)

358-364: Redundant conditional branches in handleProviderClick.

Both branches of the conditional perform identical operations. This can be simplified:

♻️ Simplified handler
   const handleProviderClick = (provider: Provider) => {
-    if (provider === 'google-analytics') {
-      setSelectedProvider(provider)
-    } else {
-      setSelectedProvider(provider)
-    }
+    setSelectedProvider(provider)
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In web/app/pages/Project/Settings/components/DataImportTab.tsx around lines 358
- 364, The conditional in handleProviderClick is redundant because both branches
call setSelectedProvider; simplify by removing the if/else and directly call
setSelectedProvider(provider) inside the handleProviderClick function
(reference: handleProviderClick and Provider type, and setSelectedProvider).

366-375: Consider blocking modal close during GA4 connection.

The modal close handler checks ga4StartFetcher.state but not ga4Connecting. If the user closes the modal while the OAuth URL is being fetched, the redirect effect will still fire. This is benign (redirect proceeds), but for consistency with the upload and start flows, you may want to also check ga4Connecting:

♻️ Block close during connection
   const handleCloseModal = () => {
     if (selectedProvider === 'google-analytics') {
-      if (ga4StartFetcher.state === 'idle') {
+      if (ga4StartFetcher.state === 'idle' && ga4ConnectFetcher.state === 'idle') {
         setSelectedProvider(null)
         resetGa4State()
       }
     } else {
       if (!uploading) setSelectedProvider(null)
     }
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In web/app/pages/Project/Settings/components/DataImportTab.tsx around lines 366
- 375, handleCloseModal currently allows closing when selectedProvider ===
'google-analytics' based only on ga4StartFetcher.state, but it should also block
when ga4Connecting is true; update the conditional in handleCloseModal to check
both ga4StartFetcher.state === 'idle' AND ga4Connecting === false before calling
setSelectedProvider(null) and resetGa4State(), mirroring the upload flow that
uses the uploading flag to prevent close during active operations.
backend/apps/cloud/src/data-import/ga4-import.service.ts (3)

160-167: Axios request lacks timeout.

The API call to Google Analytics Admin could hang indefinitely if Google's servers are unresponsive. Consider adding a timeout:

♻️ Add request timeout
       const { data } = await axios.get(
         'https://analyticsadmin.googleapis.com/v1beta/accountSummaries',
         {
           headers: { Authorization: `Bearer ${token}` },
           params: { pageSize: 200 },
+          timeout: 30_000, // 30 seconds
         },
       )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In backend/apps/cloud/src/data-import/ga4-import.service.ts around lines 160 -
167, The axios.get call inside the try block of ga4-import.service.ts (the call
to 'https://analyticsadmin.googleapis.com/v1beta/accountSummaries') does not set
a timeout and can hang; update the axios request config in that call (or create
an axios instance used by the method) to include a sensible timeout (e.g.,
10_000 ms) so the request fails fast and is catchable by the existing error
handling in the surrounding function.

63-71: OAuth state lacks cryptographic randomness.

The state value ${pid}:${uid}:${Date.now()} is somewhat predictable. While the Redis lookup with UID matching provides protection, adding a random component would follow OAuth best practices more closely.

♻️ Add random component to state
+import { randomBytes } from 'crypto'

-    const state = `${pid}:${uid}:${Date.now()}`
+    const nonce = randomBytes(16).toString('hex')
+    const state = `${pid}:${uid}:${Date.now()}:${nonce}`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In backend/apps/cloud/src/data-import/ga4-import.service.ts around lines 63 -
71, The OAuth state string `${pid}:${uid}:${Date.now()}` is predictable; update
the state generation in ga4-import.service.ts (near getOAuthClient(),
REDIS_STATE_PREFIX, REDIS_TOKEN_TTL, redis.set) to include a cryptographically
secure random component (e.g., append a hex/base64 value from Node's
crypto.randomBytes(16)), store the full state in Redis as you do now, and ensure
any downstream state verification logic decodes/parses and validates the same
extended state format (including the random nonce) when handling the callback.

104-112: Token exchange error handling could be more specific.

If oauth2Client.getToken(code) throws (e.g., invalid/expired code), the error propagates as-is. Consider wrapping with a user-friendly message:

♻️ Wrap token exchange errors
-    const { tokens } = await oauth2Client.getToken(code)
+    let tokens
+    try {
+      const response = await oauth2Client.getToken(code)
+      tokens = response.tokens
+    } catch (error) {
+      throw new BadRequestException(
+        'Failed to exchange authorization code. Please try connecting again.',
+      )
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In backend/apps/cloud/src/data-import/ga4-import.service.ts around lines 104 -
112, Wrap the call to oauth2Client.getToken(code) (in the GA4 import flow using
getOAuthClient()) in a try/catch so token-exchange errors (invalid/expired code)
are caught and rethrown as a BadRequestException with a clear user-facing
message; include or log the original error details for debugging while ensuring
the thrown BadRequestException message is user-friendly (e.g., "Failed to
exchange authorization code for tokens. Please try reconnecting.") and preserve
checking tokens.refresh_token afterward.
backend/apps/cloud/src/data-import/data-import.controller.ts (1)

75-86: Consider using a DTO for callback body validation.

The inline type { code: string; state: string } with manual validation could be replaced with a proper DTO using class-validator decorators for consistency with other endpoints like StartGa4ImportDto:

♻️ Create Ga4CallbackDto
// In dto/ga4-callback.dto.ts
import { IsNotEmpty, IsString } from 'class-validator'

export class Ga4CallbackDto {
  @IsString()
  @IsNotEmpty()
  code: string

  @IsString()
  @IsNotEmpty()
  state: string
}

Then update the controller:

-  @Body() body: { code: string; state: string },
+  @Body() body: Ga4CallbackDto,
 ) {
-  if (!body.code || !body.state) {
-    throw new BadRequestException('Missing code or state parameter')
-  }
   return this.ga4ImportService.handleOAuthCallback(uid, body.code, body.state)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In backend/apps/cloud/src/data-import/data-import.controller.ts around lines 75
- 86, Replace the inline body type in the ga4Callback method with a validated
DTO: create dto/Ga4CallbackDto containing code and state properties decorated
with @IsString() and @IsNotEmpty() (class-validator), import that DTO into
data-import.controller and change the method signature to use @Body() payload:
Ga4CallbackDto; remove the manual if-check for body.code/body.state so Nest's
validation handles missing/invalid fields (ensure ValidationPipe is enabled
globally or for this route).
web/app/routes/ga4-import-connected.tsx (1)

67-74: Error fallback shows generic message.

When useClientFallback is false but we reach the error StatusPage, data.error would be undefined based on the loader logic (loader returns useClientFallback: true for all error cases). This branch may be unreachable in practice.

Consider whether this branch is needed, or adjust the description:

-      description={data.error || 'Unknown error'}
+      description='An unexpected error occurred. Please try again.'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In web/app/routes/ga4-import-connected.tsx around lines 67 - 74, The StatusPage
error branch currently shows description={data.error || 'Unknown error'}, but
loader guarantees useClientFallback is true for errors so data.error will be
undefined; either remove this unreachable branch or make the description
deterministic: update the StatusPage render in ga4-import-connected.tsx to use a
clear fixed message (e.g., 'Failed to connect Google Analytics. Please try again
or contact support') instead of data.error, or adjust the loader logic that sets
useClientFallback to allow passing a real error string; reference StatusPage,
data.error and useClientFallback to locate and change the fallback behavior
accordingly.
backend/apps/cloud/src/data-import/mappers/ga4.mapper.ts (1)

340-342: Yielding same object reference multiple times.

The loop yields the same data object reference count times. This works correctly because the processor's batch insertion doesn't mutate the objects, but it's worth noting this relies on that invariant. If future changes modified rows before insertion, all duplicates would be affected.

For defensive purposes, consider shallow-copying if the pattern feels fragile:

for (let i = 0; i < count; i++) {
  yield { table: 'analytics', data: { ...data } }
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In backend/apps/cloud/src/data-import/mappers/ga4.mapper.ts around lines 340 -
342, The loop in ga4.mapper.ts yields the same `data` object reference `count`
times (the for loop that does `yield { table: 'analytics', data }`), which is
fragile if downstream code ever mutates rows; fix it by yielding a shallow copy
each iteration (e.g., spread `data` into a new object) so each yielded row is
independent while preserving current behavior; update the yield site inside that
for-loop to use a copied object.
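A minimal generator demo of the aliasing hazard described above (simplified row shapes, not the mapper's real types):

```typescript
type Row = { table: string; data: Record<string, unknown> }

// Yield `count` independent rows; spreading `data` means a later
// mutation of one yielded row cannot leak into its siblings.
function* expandRow(data: Record<string, unknown>, count: number): Generator<Row> {
  for (let i = 0; i < count; i++) {
    yield { table: 'analytics', data: { ...data } }
  }
}

const rows = [...expandRow({ pg: '/home', unique: 0 }, 2)]
rows[0].data.unique = 1 // mutate the first copy only
console.log(rows[1].data.unique) // → 0, the second row is unaffected
```

Without the spread, both entries would point at the same object, and the mutation would show up in every yielded row.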
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In backend/.env.example:
- Around line 63-65: Add a startup fail-fast check that validates both
GOOGLE_GA4_CLIENT_ID and GOOGLE_GA4_CLIENT_SECRET are set (not empty) and aborts
startup with a clear error if missing, rather than deferring to request-time;
implement this by adding a validation step in the application bootstrap/config
initialization that inspects process.env.GOOGLE_GA4_CLIENT_ID and
process.env.GOOGLE_GA4_CLIENT_SECRET and throws/logs and exits when either is
absent, and keep getCredentials() in ga4-import.service.ts as a consumer but no
longer responsible for first-party validation so missing vars are caught at
startup.

In backend/apps/cloud/src/data-import/data-import.processor.ts:
- Around line 113-120: The context may include undefined
ga4ClientId/ga4ClientSecret which the GA4 mapper (ga4.mapper.ts) passes to
OAuth2Client; add validation where context is built (the isApiBased branch that
sets ga4ClientId/ga4ClientSecret) to ensure both ga4ClientId and ga4ClientSecret
are non-empty strings before including them in context—if invalid, either set
context to undefined or throw/log and skip the GA4/api-based import; update the
validation check near the context construction in data-import.processor.ts so
the GA4 mapper never receives undefined credentials.

In backend/apps/cloud/src/data-import/dto/start-ga4-import.dto.ts:
- Around line 10-11: The propertyId field currently allows any string; enforce
digits-only validation on propertyId in the StartGa4Import DTO by replacing the
loose @IsString() with a numeric-string validator (e.g., use class-validator's
@IsNumberString() or @Matches(/^\d+$/)) and add @IsNotEmpty() if needed; ensure
you import the chosen decorator and apply it to the propertyId field in the
StartGa4Import DTO class so runtime validation matches the documented numeric
contract.

In backend/apps/cloud/src/data-import/mappers/ga4.mapper.ts:
- Around line 122-134: The code currently validates only
context.encryptedRefreshToken and context.ga4PropertyId but not
context.ga4ClientId or context.ga4ClientSecret, causing OAuth2Client to receive
"undefined"; update the guard to also check context.ga4ClientId and
context.ga4ClientSecret and throw the same ImportError when any are missing,
then safely assign refreshToken = decryptToken(context.encryptedRefreshToken as
string) and clientId/clientSecret/propertyId as strings after the validation so
OAuth2Client is always constructed with real string credentials (referencing
context, ga4ClientId, ga4ClientSecret, ga4PropertyId, encryptedRefreshToken,
decryptToken, OAuth2Client, ImportError).

In docs/content/docs/data-import.mdx:
- Around line 22-24: The two GA4 statements contradict each other: the top
callout ("GA4 imports use aggregated data... language, region, city, browser/OS
versions, and UTM campaign/term/content are not available") conflicts with the
later section that says region/city and UTM fields are imported. Reconcile by
updating the callout or the later section to a single, unambiguous contract:
either state those fields are unavailable from GA4 aggregated API or explain the
exact conditional availability (e.g., "region/city and UTM fields are only
imported when the GA4 API returns non-aggregated data or when property-level
settings permit; otherwise they are omitted"), and make the same change to the
corresponding paragraph that currently claims those fields are imported so both
spots (the initial GA4 limitations paragraph and the later import-fields
paragraph) match.
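The fail-fast check requested for backend/.env.example could be a small bootstrap guard along these lines. The function name and call site are assumptions; Swetrix's bootstrap may equally use Nest's ConfigModule schema validation:

```typescript
// Hypothetical startup guard: abort boot when required GA4 OAuth
// credentials are missing, instead of failing at request time.
export function assertGa4Env(
  env: Record<string, string | undefined> = process.env,
): void {
  const missing = ['GOOGLE_GA4_CLIENT_ID', 'GOOGLE_GA4_CLIENT_SECRET'].filter(
    (name) => !env[name]?.trim(), // empty strings count as missing
  )
  if (missing.length > 0) {
    throw new Error(`GA4 importer misconfigured, missing env vars: ${missing.join(', ')}`)
  }
}

// Called once during bootstrap, e.g. before NestFactory.create(...),
// so getCredentials() in ga4-import.service.ts can assume the vars exist.
```

Failing at startup surfaces the misconfiguration in deploy logs rather than as a confusing request-time error for the first user who tries to connect GA4.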

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3e97b72d-0f7a-4320-9c7f-d2be6d83cca9

📥 Commits

Reviewing files that changed from the base of the PR and between b826949 and da37588.

⛔ Files ignored due to path filters (1)
  • backend/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (24)
  • backend/.env.example
  • backend/apps/cloud/src/common/utils.ts
  • backend/apps/cloud/src/data-import/data-import.controller.ts
  • backend/apps/cloud/src/data-import/data-import.module.ts
  • backend/apps/cloud/src/data-import/data-import.processor.ts
  • backend/apps/cloud/src/data-import/dto/index.ts
  • backend/apps/cloud/src/data-import/dto/start-ga4-import.dto.ts
  • backend/apps/cloud/src/data-import/ga4-import.service.ts
  • backend/apps/cloud/src/data-import/mappers/ga4.mapper.ts
  • backend/apps/cloud/src/data-import/mappers/index.ts
  • backend/apps/cloud/src/data-import/mappers/mapper.interface.ts
  • backend/package.json
  • docs/content/docs/data-import.mdx
  • web/app/api/api.server.ts
  • web/app/hooks/useAuthProxy.ts
  • web/app/lib/models/Project.ts
  • web/app/pages/Project/Settings/components/DataImportTab.tsx
  • web/app/routes/api.auth.ts
  • web/app/routes/ga4-import-connected.tsx
  • web/app/routes/projects.settings.$id.tsx
  • web/app/ui/Loader.tsx
  • web/app/ui/icons/GoogleAnalytics.tsx
  • web/app/utils/routes.ts
  • web/public/locales/en.json

coderabbitai bot left a comment
Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In backend/apps/cloud/src/data-import/ga4-import.service.ts:
- Around line 26-29: The exported type Ga4Property is reported as unused; either
remove the export keyword from the Ga4Property declaration if it's only consumed
internally by ga4-import.service.ts, or update the consumer (the
controller/processor that calls listProperties) to import and use Ga4Property in
its function signatures/returns so the type is referenced (e.g., annotate the
listProperties return type and any variables or DTOs with Ga4Property). Ensure
you update the Ga4Property declaration or the caller's imports/annotations
accordingly so the export is either removed or actually used.
- Around line 113-116: The GA4 refresh_token encryption uses
CryptoJS.Rabbit.encrypt (in ga4-import.service.ts) which provides
confidentiality only; replace it with an authenticated encryption scheme or add
an HMAC like the captcha flow. Use the existing deriveKey() to derive an
AES-256-GCM key (preferred) or derive separate keys for AES-GCM and HMAC, then
encrypt the tokens with AES-256-GCM and store the ciphertext plus its auth tag
(or, if choosing HMAC, compute HMAC-SHA256 over the Rabbit ciphertext and store
the signature too, mirroring signCaptchaCiphertext in captcha.service.ts);
ensure you update the decrypt path to verify authentication (AES-GCM tag or
HMAC) before returning the plaintext.
- Around line 214-236: In getAndDeleteToken, replace the separate redis.get(key)
and redis.del(key) calls with a single atomic GETDEL call (e.g., use
redis.getdel(key)) so the refresh token cannot be double-consumed by concurrent
requests; remove the subsequent await redis.del(key) and keep returning {
encryptedRefreshToken: encrypted, clientId, clientSecret } as before using this
atomic read-and-delete behavior in the getAndDeleteToken method.
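The authenticated-encryption comment above (replacing CryptoJS.Rabbit) can be sketched with Node's built-in AES-256-GCM. The helper names are illustrative and the key stands in for whatever the service derives via deriveKey for the 'ga4-token' purpose:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto'

// Encrypt with AES-256-GCM and store iv + auth tag alongside the
// ciphertext, so tampering is detected on decrypt (unlike Rabbit,
// which provides confidentiality only).
export function encryptToken(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12) // 96-bit IV, the recommended size for GCM
  const cipher = createCipheriv('aes-256-gcm', key, iv)
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()])
  return [iv, cipher.getAuthTag(), ciphertext].map((b) => b.toString('base64')).join('.')
}

export function decryptToken(payload: string, key: Buffer): string {
  const [iv, tag, ciphertext] = payload.split('.').map((p) => Buffer.from(p, 'base64'))
  const decipher = createDecipheriv('aes-256-gcm', key, iv)
  decipher.setAuthTag(tag) // final() throws if the tag does not verify
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8')
}
```

Decrypting with the wrong key, or with a modified ciphertext, throws instead of silently returning garbage, which is the property the HMAC alternative in the comment is also after.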
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ead58633-efd0-4747-8757-1daf58665439

📥 Commits

Reviewing files that changed from the base of the PR and between 53c22e6 and f5f1aca.

📒 Files selected for processing (4)
  • backend/apps/cloud/src/data-import/ga4-import.service.ts
  • web/app/api/api.server.ts
  • web/app/lib/models/Project.ts
  • web/public/locales/en.json
✅ Files skipped from review due to trivial changes (2)
  • web/public/locales/en.json
  • web/app/lib/models/Project.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • web/app/api/api.server.ts

coderabbitai bot left a comment

♻️ Duplicate comments (1)
docs/content/docs/data-import.mdx (1)

22-24: ⚠️ Potential issue | 🟠 Major

Resolve GA4 docs contradiction in capability statement.

This says region/city and UTM campaign/term/content are unavailable, which conflicts with Line 69 and Line 70 and with the GA4 mapper behavior (backend/apps/cloud/src/data-import/mappers/ga4.mapper.ts Lines 162-210 and 240-310). Please make this contract consistent (e.g., “imported when present; may be empty/missing for some rows/properties”).

✏️ Suggested doc fix
-  GA4 imports use aggregated data from Google's API. Some dimensions like language, region, city,
-  browser/OS versions, and UTM campaign/term/content are not available due to API limitations. Data
+  GA4 imports use aggregated data from Google's API. Some dimensions (like language/locale and
+  browser/OS versions) are not available. Region/city and traffic source/UTM fields are imported
+  when present in GA4 reports, and may be empty for some rows. Data
   is imported with hourly precision.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/content/docs/data-import.mdx` around lines 22 - 24, Update the
contradictory GA4 capability sentence to match the GA4 mapper behavior: change
the statement that "region/city and UTM campaign/term/content are not available"
to a neutral, accurate contract such as "these fields are imported when present;
they may be empty or missing for some rows/properties," and keep the note about
hourly precision; reference the GA4 mapper logic (ga4.mapper) that conditionally
maps region, city, browser/OS versions, and utm_campaign/utm_term/utm_content so
the docs align with the mapper behavior.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7478862d-6d86-4535-aa11-964ac6abfa0f

📥 Commits

Reviewing files that changed from the base of the PR and between f5f1aca and b1e94c3.

📒 Files selected for processing (2)
  • docs/content/docs/data-import.mdx
  • web/app/api/api.server.ts


@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (4)
backend/apps/cloud/src/data-import/mappers/ga4.mapper.ts (3)

345-347: Same data object yielded multiple times per aggregated row.

The same data reference is yielded count times. This works correctly since the processor immediately pushes to a batch array without modification, but it's a subtle pattern that could break if consumer code changes.

Consider documenting this behavior or spreading the object for safety:

♻️ Optional: defensive copy
         for (let i = 0; i < count; i++) {
-          yield { table: 'analytics', data }
+          yield { table: 'analytics', data: { ...data } }
         }

Note: This trades off memory/allocation overhead for safety. Given the current consumer behavior, the existing code is correct.

Also applies to: 453-455

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/apps/cloud/src/data-import/mappers/ga4.mapper.ts` around lines 345 -
347, The loop yields the same data object reference multiple times (for (let i =
0; i < count; i++) yield { table: 'analytics', data }), which can lead to subtle
bugs if consumers mutate items; change the yield to produce a shallow defensive
copy (e.g., yield { table: 'analytics', data: { ...data } }) so each yielded
record is independent, and apply the same change to the analogous block
referenced at lines 453-455.
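The aliasing hazard this comment describes can be demonstrated in isolation. The generator names and row shape below are illustrative, not the PR's actual mapper code:

```typescript
// Demonstrates the aliasing hazard: a generator that yields one shared
// object N times versus one that yields defensive copies.
type Row = { table: string; data: { pg: string; cc: string } }

function* yieldShared(data: { pg: string; cc: string }, count: number): Generator<Row> {
  for (let i = 0; i < count; i++) {
    yield { table: 'analytics', data } // same reference every time
  }
}

function* yieldCopies(data: { pg: string; cc: string }, count: number): Generator<Row> {
  for (let i = 0; i < count; i++) {
    yield { table: 'analytics', data: { ...data } } // defensive shallow copy
  }
}

const shared = [...yieldShared({ pg: '/home', cc: 'GB' }, 2)]
shared[0].data.cc = 'US' // a consumer mutates one item...
console.log(shared[1].data.cc) // ...and the sibling changes too: "US"

const copied = [...yieldCopies({ pg: '/home', cc: 'GB' }, 2)]
copied[0].data.cc = 'US'
console.log(copied[1].data.cc) // copies stay independent: "GB"
```

As the review notes, the shared-reference form is fine while consumers only push rows into a batch; the copy trades a little allocation for robustness against future consumer changes.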

141-143: Type cast on authClient bypasses type safety.

The as any cast works but suppresses TypeScript's type checking. If the BetaAnalyticsDataClient constructor signature changes in a future library version, this could cause runtime errors without compile-time warnings.

Consider adding a brief comment explaining why the cast is necessary (type incompatibility between OAuth2Client and the expected auth interface).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/apps/cloud/src/data-import/mappers/ga4.mapper.ts` around lines 141 -
143, The code currently uses an unsafe cast for authClient when instantiating
BetaAnalyticsDataClient (const client = new BetaAnalyticsDataClient({
authClient: oauth2Client as any })) which bypasses TypeScript checks; update
this by keeping the cast only if necessary but add a concise explanatory comment
immediately above or inline with the constructor call referencing the type
incompatibility between OAuth2Client and the library's expected auth interface
(mention oauth2Client and BetaAnalyticsDataClient), why the cast is required,
and add a TODO to remove the cast when the upstream types or a proper
adapter/wrapper is available; alternatively, if feasible create or use a thin
adapter that implements the expected auth interface instead of using as any.

166-239: Memory usage in buildSourceLookup is bounded by unique keys, not row count.

The comment notes memory stays bounded by keeping only the highest-traffic source per key. However, for high-traffic properties with millions of unique (dateHour|host|pagePath) combinations per month, the Map could still grow significantly.

This is acceptable for typical use cases, but consider adding a comment noting the expected scale constraints or implementing a size limit with LRU eviction for extreme cases.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/apps/cloud/src/data-import/mappers/ga4.mapper.ts` around lines 166 -
239, The buildSourceLookup method currently stores one entry per unique
(dateHour|hostName|pagePath) key in the lookup Map which can still grow very
large for high-traffic properties; either add a clear comment on the expected
scale/constraints and memory assumptions near the buildSourceLookup declaration
(mentioning PAGE_SIZE and monthly chunking) or implement a bounded map (e.g.,
size cap with simple LRU or sampling eviction) to prevent O(`#uniqueKeys`)
growth—update references to lookup, buildSourceLookup, and any consumer logic to
tolerate evictions if you choose the bounded-map approach.
backend/apps/cloud/src/data-import/data-import.processor.ts (1)

113-120: Consider consolidating all GA4 validation before building context.

Currently, encryptedRefreshToken and ga4PropertyId are not validated in the processor—only the mapper validates them. While safe (the mapper throws ImportError), validating all four fields together in the processor would provide a consistent user experience with the same error handling path.

♻️ Optional consolidation
-      const context = isApiBased
-        ? {
-            ga4PropertyId,
-            encryptedRefreshToken,
-            ga4ClientId,
-            ga4ClientSecret,
-          }
-        : undefined
-
-      if (isApiBased && (!ga4ClientId || !ga4ClientSecret)) {
+      if (
+        isApiBased &&
+        (!ga4ClientId || !ga4ClientSecret || !ga4PropertyId || !encryptedRefreshToken)
+      ) {
         await this.dataImportService.markFailed(
           id,
           'GA4 import requires valid OAuth client credentials. Please reconnect your Google account.',
         )
         return
       }
+
+      const context = isApiBased
+        ? {
+            ga4PropertyId,
+            encryptedRefreshToken,
+            ga4ClientId,
+            ga4ClientSecret,
+          }
+        : undefined

Also applies to: 122-128

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/apps/cloud/src/data-import/data-import.processor.ts` around lines 113
- 120, Validate all GA4 fields in the processor before constructing the context:
check encryptedRefreshToken and ga4PropertyId together with ga4ClientId and
ga4ClientSecret when isApiBased is true, and throw the same ImportError used by
the mapper on any missing/invalid field; then build the context object using the
already-validated encryptedRefreshToken, ga4PropertyId, ga4ClientId, and
ga4ClientSecret (replace the current inline construction of context). Ensure the
validation logic is applied in both places where context is built (the block
creating context at the first occurrence and the similar block at lines
122-128).
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 97780aae-6f54-4c4f-8268-be2285f9f294

📥 Commits

Reviewing files that changed from the base of the PR and between b1e94c3 and 882e5cb.

📒 Files selected for processing (4)
  • backend/apps/cloud/src/data-import/data-import.processor.ts
  • backend/apps/cloud/src/data-import/dto/start-ga4-import.dto.ts
  • backend/apps/cloud/src/data-import/mappers/ga4.mapper.ts
  • docs/content/docs/data-import.mdx
✅ Files skipped from review due to trivial changes (1)
  • backend/apps/cloud/src/data-import/dto/start-ga4-import.dto.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/content/docs/data-import.mdx

@Blaumaus Blaumaus merged commit 252fead into main Apr 13, 2026
12 checks passed