[ci-status skill] Enumerate all test runs, not just failed-job runs #11326

Open
simonrozsival wants to merge 1 commit into main from fix/ci-status-skill-test-run-enumeration

@simonrozsival
Member

While I was checking CI status on PR #11275 with the ci-status skill, it silently missed two failing tests in Microsoft.Android.Sdk.TrimmableTypeMap.Tests and a crashed Mono.Android.NET_Tests-CoreCLRTrimmable instrumentation run, even though the Xamarin.Android-PR GitHub check was RED. The root cause is two bugs in .github/skills/ci-status/SKILL.md:

Bug 1 — az devops invoke routes to the wrong endpoint on devdiv

az devops invoke --area test --resource runs --query-parameters "buildUri=..."

resolves to GET /devdiv/_apis/test/Runs/Statistics and returns 404 (The controller for path '/devdiv/_apis/test/Runs/Statistics' was not found). The skill currently has no fallback, so test enumeration was skipped entirely.
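One way to keep a 404 like this from being skipped silently is to check the HTTP status before trusting the response. The sketch below is only an illustration: the helper name `check_enumeration_status` and the commented-out wiring are assumptions, not part of the skill.

```bash
#!/usr/bin/env bash
# check_enumeration_status HTTP_STATUS   (hypothetical helper)
# Returns 0 only for 2xx; any other status is reported on stderr
# instead of being treated as "no test runs found".
check_enumeration_status() {
  local status=$1
  case "$status" in
    2[0-9][0-9]) return 0 ;;
    *)
      printf 'test-run enumeration failed with HTTP %s; do not report green\n' "$status" >&2
      return 1
      ;;
  esac
}

# Example wiring (needs network access and a real token, so it stays commented out):
# status=$(curl -sL -o runs.json -w '%{http_code}' -u ":$TOKEN" "$URL")
# check_enumeration_status "$status" || exit 1
```

With a guard of this kind, the 404 described above would stop the investigation with an error rather than fall through to a green report.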

Bug 2 — too-narrow "failing run" filter

The filter runStatistics[?outcome=='Failed'] only catches runs that have at least one Failed result. It misses:

  • Runs where the test app crashed before publishing any results — these show up as totalTests=1, passedTests=0, unanalyzedTests=1 (NotExecuted outcome).
  • Runs whose containing timeline job is succeededWithIssues rather than failed (e.g. when the wrapper script swallows the non-zero exit).
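To make the gap concrete, here is a small sketch that runs both checks over hand-written sample data. The run names and counts are hypothetical and only shaped like the fields quoted above; the sketch assumes `jq` is installed and expresses the old check in `jq` rather than in the original JMESPath.

```bash
#!/usr/bin/env bash
# Hypothetical sample shaped like the fields mentioned above; not real API output.
runs='{"value":[
  {"id":1,"name":"unit-tests","totalTests":10,"passedTests":8,"unanalyzedTests":2,
   "runStatistics":[{"outcome":"Passed","count":8},{"outcome":"Failed","count":2}]},
  {"id":2,"name":"crashed-instrumentation","totalTests":1,"passedTests":0,"unanalyzedTests":1,
   "runStatistics":[{"outcome":"NotExecuted","count":1}]},
  {"id":3,"name":"all-green","totalTests":5,"passedTests":5,"unanalyzedTests":0,
   "runStatistics":[{"outcome":"Passed","count":5}]}
]}'

# Old behaviour: keep only runs with at least one Failed statistic.
old_filter() {
  jq -r '[.value[] | select(any(.runStatistics[]?; .outcome == "Failed")) | .name] | join(",")'
}

# Widened behaviour: keep any run where not every test passed.
new_filter() {
  jq -r '[.value[] | select((.totalTests // 0) > (.passedTests // 0)) | .name] | join(",")'
}

old_names=$(printf '%s' "$runs" | old_filter)
new_names=$(printf '%s' "$runs" | new_filter)
printf 'old filter: %s\n' "$old_names"
printf 'new filter: %s\n' "$new_names"
```

On this sample, only the widened comparison reports the crashed run, because that run has no `Failed` statistic at all.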

The skill also said "if no failures found anywhere, report CI as green and stop", which created a false-green path when GitHub showed the check as RED but the timeline had no failed records.

Fix

  • Replace az devops invoke --area test --resource runs with a direct REST call using an AAD bearer token. The same pattern works on both dev.azure.com/dnceng-public and devdiv.visualstudio.com.
  • Filter failing runs with totalTests > passedTests instead of outcome=='Failed', so Failed/NotExecuted/Inconclusive/unanalyzed all surface.
  • Add an explicit outcomes=NotExecuted follow-up query for crashed instrumentation runs.
  • Tighten the green-stop gate: it now fires only when there are no failed timeline records, every test run has passedTests == totalTests, and no required GitHub check is RED. If GitHub is RED but the timeline is clean, the skill is now required to enumerate test runs before concluding.
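The tightened gate can be sketched as a single predicate. This is a minimal illustration, not the skill's actual wording: the function name `is_ci_green` and its three inputs (failed timeline record count, test-runs JSON, count of RED required checks) are assumptions the caller would have to gather beforehand, and `jq` is assumed to be available.

```bash
#!/usr/bin/env bash
# is_ci_green FAILED_TIMELINE_COUNT RUNS_JSON RED_REQUIRED_CHECKS   (hypothetical helper)
# Returns 0 only when all three conditions of the gate hold.
is_ci_green() {
  local failed_records=$1 runs_json=$2 red_checks=$3
  [ "$failed_records" -eq 0 ] || return 1   # no failed timeline records
  [ "$red_checks" -eq 0 ] || return 1       # no required GitHub check is RED
  # Every test run must have passedTests == totalTests.
  printf '%s' "$runs_json" \
    | jq -e '(.value // []) | all(.[]; (.passedTests // 0) == (.totalTests // 0))' >/dev/null
}

# Hypothetical inputs for illustration.
clean='{"value":[{"totalTests":5,"passedTests":5}]}'
crashed='{"value":[{"totalTests":1,"passedTests":0}]}'

is_ci_green 0 "$clean" 0   && echo "clean timeline, clean runs, no RED check: green"
is_ci_green 0 "$crashed" 0 || echo "crashed run present: not green"
is_ci_green 0 "$clean" 1   || echo "GitHub check RED: not green, keep investigating"
```

Treating the GitHub check state as an input to the gate is what closes the false-green path: a clean timeline alone is no longer enough.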

Verification

Manually reproduced on devdiv build #14066636 (Xamarin.Android-PR for PR #11275). The new REST + widened-filter path correctly surfaces:

| Run | Failures |
| --- | --- |
| Microsoft.Android.Sdk.TrimmableTypeMap.Tests | 2 Failed (Generate_InheritedCtor_ReferencesGuardAndActivationCtor, Generate_InheritedJavaInteropCtor_ReferencesActivationCtor) |
| Mono.Android.NET_Tests-CoreCLRTrimmable | 1 NotExecuted (Possible Crash / Release — SIGSEGV in art::JNI::FindClass) |

The old skill path returned no test failures for the same build.

The ci-status skill was missing real test failures in two scenarios:

1. `az devops invoke --area test --resource runs` incorrectly routes
   to `/test/Runs/Statistics` on the `devdiv` org and returns 404,
   so the skill silently skipped test enumeration entirely.

2. The filter `runStatistics[?outcome=='Failed']` only matched runs
   with at least one `Failed` result. It missed runs where the test
   APK process crashed (recorded as `unanalyzedTests > 0` /
   `NotExecuted`), and the "only iterate over failed timeline jobs"
   pattern missed runs published under `succeededWithIssues` jobs.

Together these produced false-green reports: GitHub showed the check
as RED, the skill found nothing actionable in the timeline, and
reported CI as essentially green.

This change:

* Replaces the broken `az devops invoke` call with a direct REST
  call (`curl` + AAD bearer token) that works on both
  `dev.azure.com/dnceng-public` and `devdiv.visualstudio.com`.
* Widens the failing-run filter to `totalTests > passedTests` so it
  catches Failed, NotExecuted, Inconclusive, and unanalyzed buckets.
* Adds an explicit `NotExecuted` follow-up query for crashed
  instrumentation runs (e.g. CoreCLRTrimmable SIGSEGV cases).
* Tightens the "report green and stop" gate so it only fires when
  every test run has `passedTests == totalTests` AND no required
  GitHub check is RED — preventing the false-green path.
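The `NotExecuted` follow-up query is not shown on this page, so the following is only a sketch of what it might look like. It assumes the Azure DevOps test results list endpoint with its `outcomes` query parameter and reuses `api-version=7.0` from the runs query; the helper name `not_executed_results_url` is hypothetical.

```bash
#!/usr/bin/env bash
# not_executed_results_url ORG_URL PROJECT RUN_ID   (hypothetical helper)
# Builds the follow-up URL that lists NotExecuted results for one test run.
not_executed_results_url() {
  local org_url=${1%/} project=$2 run_id=$3   # strip a trailing slash from ORG_URL
  printf '%s/%s/_apis/test/Runs/%s/results?outcomes=NotExecuted&api-version=7.0' \
    "$org_url" "$project" "$run_id"
}

# Example wiring (needs network access and a real token, so it stays commented out):
# curl -sL -u ":$TOKEN" "$(not_executed_results_url "$ORG_URL" "$PROJECT" "$RUN_ID")" \
#   | jq '.value[] | {testCaseTitle, outcome, errorMessage}'
```

Running it once per run returned by the widened filter with `unanalyzedTests > 0` would surface crash entries like the CoreCLRTrimmable case mentioned above.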

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 12, 2026 05:43
Contributor

Copilot AI left a comment

Pull request overview

Updates the ci-status skill documentation to ensure CI investigations enumerate all Azure DevOps test runs (including crashed/unanalyzed runs and failures hidden under succeededWithIssues jobs), preventing false-green reports when GitHub checks are red.

Changes:

  • Replaces the az devops invoke --area test --resource runs guidance with direct Azure DevOps REST API curl examples using an AAD access token.
  • Broadens the “failing run” detection heuristic from outcome == Failed to totalTests > passedTests, and adds a follow-up query for NotExecuted results when unanalyzedTests > 0.
  • Tightens the “CI is green” stop condition to require clean timeline failures, fully passing test runs, and all required GitHub checks passing.

Comment on lines +166 to +173
> ⚠️ Do **not** use `az devops invoke --area test --resource runs` — on `devdiv` it incorrectly routes to `/test/Runs/Statistics` and returns 404. Use the REST API directly with an AAD bearer token (works on both `devdiv` and `dnceng-public`):

```diff
-az devops invoke --area test --resource runs \
-  --route-parameters project=$PROJECT \
-  --org $ORG_URL \
-  --query-parameters "buildUri=vstfs:///Build/Build/$BUILD_ID" \
-  --query "value[?runStatistics[?outcome=='Failed']] | [].{id:id, name:name, totalTests:totalTests, state:state, stats:runStatistics}" \
-  --output json 2>&1
+# 499b84ac-1321-427f-aa17-267ca6975798 is the well-known AAD resource ID for Azure DevOps.
+TOKEN=$(az account get-access-token --resource 499b84ac-1321-427f-aa17-267ca6975798 --query accessToken -o tsv)
+
+curl -sL -u ":$TOKEN" \
+  "$ORG_URL/$PROJECT/_apis/test/runs?buildUri=vstfs:///Build/Build/$BUILD_ID&api-version=7.0" \
```

Comment on lines +172 to 176

```bash
curl -sL -u ":$TOKEN" \
  "$ORG_URL/$PROJECT/_apis/test/runs?buildUri=vstfs:///Build/Build/$BUILD_ID&api-version=7.0" \
  | jq '[.value[] | {id, name, totalTests, passedTests, unanalyzedTests, state}
         | select((.totalTests // 0) > (.passedTests // 0))]'
```