[ci-status skill] Enumerate all test runs, not just failed-job runs #11326
Open
simonrozsival wants to merge 1 commit into
The ci-status skill was missing real test failures in two scenarios:

1. `az devops invoke --area test --resource runs` incorrectly routes to `/test/Runs/Statistics` on the `devdiv` org and returns 404, so the skill silently skipped test enumeration entirely.
2. The filter `runStatistics[?outcome=='Failed']` only matched runs with at least one `Failed` result. It missed runs where the test APK process crashed (recorded as `unanalyzedTests > 0` / `NotExecuted`), and the "only iterate over failed timeline jobs" pattern missed runs published under `succeededWithIssues` jobs.

Together these produced false-green reports: GitHub showed the check as RED, the skill found nothing actionable in the timeline, and reported CI as essentially green.

This change:

* Replaces the broken `az devops invoke` call with a direct REST call (`curl` + AAD bearer token) that works on both `dev.azure.com/dnceng-public` and `devdiv.visualstudio.com`.
* Widens the failing-run filter to `totalTests > passedTests` so it catches Failed, NotExecuted, Inconclusive, and unanalyzed buckets.
* Adds an explicit `NotExecuted` follow-up query for crashed instrumentation runs (e.g. CoreCLRTrimmable SIGSEGV cases).
* Tightens the "report green and stop" gate so it only fires when every test run has `passedTests == totalTests` AND no required GitHub check is RED, which prevents the false-green path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Updates the ci-status skill documentation to ensure CI investigations enumerate all Azure DevOps test runs (including crashed/unanalyzed runs and failures hidden under succeededWithIssues jobs), preventing false-green reports when GitHub checks are red.
Changes:
- Replaces the `az devops invoke --area test --resource runs` guidance with direct Azure DevOps REST API `curl` examples using an AAD access token.
- Broadens the “failing run” detection heuristic from `outcome == Failed` to `totalTests > passedTests`, and adds a follow-up query for `NotExecuted` results when `unanalyzedTests > 0`.
- Tightens the “CI is green” stop condition to require clean timeline failures, fully passing test runs, and all required GitHub checks passing.
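
The tightened stop condition in the last bullet can be sketched as a small shell predicate. This is only an illustration, not the skill's actual wording: the function name `ci_is_green` and its three count arguments are hypothetical, and it assumes earlier steps have already counted failed timeline records, test runs with `totalTests > passedTests`, and required GitHub checks that are RED.

```bash
# Decide whether CI may be reported green (hypothetical helper).
# Arguments, all assumed to be counts gathered by earlier steps:
#   $1 = failed timeline records
#   $2 = test runs where totalTests > passedTests
#   $3 = required GitHub checks that are RED
ci_is_green() {
  local failed_records=$1 failing_runs=$2 red_checks=$3
  # Green only when all three counts are zero.
  (( failed_records == 0 && failing_runs == 0 && red_checks == 0 ))
}

# The false-green case from this PR: clean timeline, but one failing
# test run and one RED check.
if ci_is_green 0 1 1; then
  echo "green"
else
  echo "not green: keep investigating"
fi
# prints "not green: keep investigating"
```

Keeping the three inputs separate makes it obvious that a clean timeline on its own is no longer enough to stop.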
Comment on lines +166 to +173

````diff
+> ⚠️ Do **not** use `az devops invoke --area test --resource runs` — on `devdiv` it incorrectly routes to `/test/Runs/Statistics` and returns 404. Use the REST API directly with an AAD bearer token (works on both `devdiv` and `dnceng-public`):
+
 ```bash
-az devops invoke --area test --resource runs \
-  --route-parameters project=$PROJECT \
-  --org $ORG_URL \
-  --query-parameters "buildUri=vstfs:///Build/Build/$BUILD_ID" \
-  --query "value[?runStatistics[?outcome=='Failed']] | [].{id:id, name:name, totalTests:totalTests, state:state, stats:runStatistics}" \
-  --output json 2>&1
+# 499b84ac-1321-427f-aa17-267ca6975798 is the well-known AAD resource ID for Azure DevOps.
+TOKEN=$(az account get-access-token --resource 499b84ac-1321-427f-aa17-267ca6975798 --query accessToken -o tsv)
+
+curl -sL -u ":$TOKEN" \
+  "$ORG_URL/$PROJECT/_apis/test/runs?buildUri=vstfs:///Build/Build/$BUILD_ID&api-version=7.0" \
````
Comment on lines +172 to 176

````diff
+curl -sL -u ":$TOKEN" \
+  "$ORG_URL/$PROJECT/_apis/test/runs?buildUri=vstfs:///Build/Build/$BUILD_ID&api-version=7.0" \
+  | jq '[.value[] | {id, name, totalTests, passedTests, unanalyzedTests, state}
+         | select((.totalTests // 0) > (.passedTests // 0))]'
 ```
````
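
The `NotExecuted` follow-up query mentioned in the PR is not shown in the hunks above. A minimal sketch of what it could look like follows. It assumes the Azure DevOps test results endpoint `_apis/test/Runs/{runId}/results` with its `outcomes` query parameter, and it reuses `ORG_URL`, `PROJECT`, and `TOKEN` from the snippet above. The function names and the selected result fields are hypothetical choices for illustration, not taken from the skill.

```bash
# Build the results URL for one test run, filtered to NotExecuted outcomes.
# ORG_URL and PROJECT are the same variables the PR's snippet uses;
# the function name is hypothetical.
not_executed_results_url() {
  local run_id=$1
  printf '%s/%s/_apis/test/Runs/%s/results?outcomes=NotExecuted&api-version=7.0' \
    "$ORG_URL" "$PROJECT" "$run_id"
}

# Fetch NotExecuted results for a run and keep a few fields.
# Needs network access, jq, and a valid TOKEN, so it is only defined here.
fetch_not_executed() {
  curl -sL -u ":$TOKEN" "$(not_executed_results_url "$1")" \
    | jq '[.value[] | {testCaseTitle, outcome, errorMessage}]'
}
```

A caller would run `fetch_not_executed "$RUN_ID"` for each run returned by the widened filter that reports `unanalyzedTests > 0`.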
While checking CI status on PR #11275 with the `ci-status` skill, two failing tests in `Microsoft.Android.Sdk.TrimmableTypeMap.Tests` and a crashed `Mono.Android.NET_Tests-CoreCLRTrimmable` instrumentation run were silently missed even though the `Xamarin.Android-PR` GitHub check was RED. The root cause is two bugs in `.github/skills/ci-status/SKILL.md`:

Bug 1: `az devops invoke` routes to the wrong endpoint on devdiv

On `devdiv`, `az devops invoke --area test --resource runs` resolves to `GET /devdiv/_apis/test/Runs/Statistics` and returns 404 (`The controller for path '/devdiv/_apis/test/Runs/Statistics' was not found`). The skill currently has no fallback, so test enumeration was skipped entirely.

Bug 2: too-narrow "failing run" filter
The filter `runStatistics[?outcome=='Failed']` only catches runs that have at least one `Failed` result. It misses:

- Runs where the test process crashed, e.g. `totalTests=1, passedTests=0, unanalyzedTests=1` (`NotExecuted` outcome).
- Runs published under jobs that finished as `succeededWithIssues` rather than `failed` (e.g. when the wrapper script swallows the non-zero exit).

The skill also said "if no failures found anywhere, report CI as green and stop", which created a false-green path when GitHub showed the check as RED but the timeline had no `failed` records.

Fix

- Replaced `az devops invoke --area test --resource runs` with a direct REST call using an AAD bearer token. The same pattern works on both `dev.azure.com/dnceng-public` and `devdiv.visualstudio.com`.
- The failing-run filter now uses `totalTests > passedTests` instead of `outcome=='Failed'`, so Failed/NotExecuted/Inconclusive/unanalyzed all surface.
- Added an explicit `outcomes=NotExecuted` follow-up query for crashed instrumentation runs.
- The "report green and stop" gate now requires a clean timeline, every test run at `passedTests == totalTests`, and no required GitHub check RED. If GitHub is RED but the timeline is clean, the skill is now required to enumerate test runs before concluding.

Verification
Manually reproduced on `devdiv` build #14066636 (Xamarin.Android-PR for PR #11275). The new REST + widened-filter path correctly surfaces:

- The two failing `Microsoft.Android.Sdk.TrimmableTypeMap.Tests` tests (`Generate_InheritedCtor_ReferencesGuardAndActivationCtor`, `Generate_InheritedJavaInteropCtor_ReferencesActivationCtor`)
- The crashed `Mono.Android.NET_Tests-CoreCLRTrimmable` instrumentation run (`Possible Crash / Release`: SIGSEGV in `art::JNI::FindClass`)

The old skill path returned no test failures for the same build.
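
The difference between the old and the widened filter can be reproduced offline. The sketch below uses an invented three-run sample rather than real data from the build: the field names follow the PR's queries, while the run names, the `count` values, and both helper function names are hypothetical. The old JMESPath filter is approximated in `jq` so that both heuristics run against the same input.

```bash
# Invented sample shaped like a test-runs response (not real build data).
SAMPLE_RUNS='{
  "value": [
    {"id": 1, "name": "all-pass", "totalTests": 10, "passedTests": 10,
     "runStatistics": [{"outcome": "Passed", "count": 10}]},
    {"id": 2, "name": "has-failures", "totalTests": 10, "passedTests": 8,
     "runStatistics": [{"outcome": "Passed", "count": 8},
                       {"outcome": "Failed", "count": 2}]},
    {"id": 3, "name": "crashed", "totalTests": 1, "passedTests": 0,
     "unanalyzedTests": 1,
     "runStatistics": [{"outcome": "NotExecuted", "count": 1}]}
  ]
}'

# Old heuristic: only runs that have a Failed bucket in runStatistics.
old_filter() {
  jq -r '[.value[] | select(any(.runStatistics[]?; .outcome == "Failed")) | .name] | join(",")'
}

# Widened heuristic from the PR: any run where not every test passed.
new_filter() {
  jq -r '[.value[] | select((.totalTests // 0) > (.passedTests // 0)) | .name] | join(",")'
}

echo "old: $(printf '%s' "$SAMPLE_RUNS" | old_filter)"   # old: has-failures
echo "new: $(printf '%s' "$SAMPLE_RUNS" | new_filter)"   # new: has-failures,crashed
```

Only the widened filter surfaces the `crashed` run, which mirrors the `totalTests=1, passedTests=0, unanalyzedTests=1` case described in the PR.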