Add dynamic PR testing with golden reference comparison #2664
Open
rjodinchr wants to merge 2 commits into KhronosGroup:main from
This commit introduces a new targeted testing capability to the GitHub Actions pipeline, allowing developers to run specific CTS commands on Pull Requests and automatically compare the results against a known baseline.

Key additions and changes:

* Dynamic Test Triggers: Added a `tests` job to `presubmit.yml` that parses PR descriptions and commit messages for the `[run-test: <command>]` syntax. The CI environment is provisioned and the tests are run only if a trigger is detected.
* pocl Test Environment: The workflow sets up an Ubuntu 24.04 runner, installs the necessary dependencies (LLVM 20, Vulkan SDK), and builds `pocl` (v7.1), OpenCL-ICD-Loader, and the OpenCL CTS with experimental features enabled.
* Automated Result Comparison: Added `ci/compare_results.py`, a Python utility that compares the JSON output of the triggered tests against a predefined golden reference. It flags missing references and categorizes differences as "FIX", "REGRESSION", or "DIFFERENCE". Any difference marks the check as failing.
* Golden Baseline: Introduced `ci/pocl/golden.json` to store the expected outcomes (pass, fail, skip) for tests on the `pocl` implementation. It is initially populated with baseline expectations for the `test_computeinfo` command. Entries for all other tests will need to be added by future PRs that use this feature.

[run-test: test_computeinfo]
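As an illustration of the trigger syntax described above, here is a minimal sketch of how `[run-test: <command>]` markers could be pulled out of a PR description or commit message. It is not the parsing logic from `presubmit.yml`, which is not shown here; the function name `extract_run_test_commands` and the regular expression are assumptions made for this example.

```python
import re

# Hypothetical pattern for the [run-test: <command>] syntax; the actual
# workflow may match it differently.
RUN_TEST_PATTERN = re.compile(r"\[run-test:\s*([^\]]+?)\s*\]")


def extract_run_test_commands(text):
    """Return every command found in [run-test: ...] markers, in order."""
    return RUN_TEST_PATTERN.findall(text)


if __name__ == "__main__":
    description = "Fix device query handling.\n\n[run-test: test_computeinfo]"
    commands = extract_run_test_commands(description)
    if commands:
        print("Triggered commands:", commands)
    else:
        # No trigger found: under the behaviour described above, the CI
        # environment would not be provisioned at all.
        print("No run-test trigger found")
```

An empty result corresponds to the case where no trigger is detected and the test job would be skipped.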
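The golden reference comparison described in the PR description could look roughly like the sketch below. This is not the code of `ci/compare_results.py`. It assumes that both the golden file and the test output reduce to a flat mapping of test name to `"pass"`, `"fail"`, or `"skip"`, that "FIX" means an expected failure now passes, that "REGRESSION" means an expected pass now fails, and that every other mismatch is a "DIFFERENCE". The `MISSING_REFERENCE` label and the function names are hypothetical.

```python
def classify(expected, actual):
    """Label a single mismatch, or return None when the outcomes agree."""
    if expected == actual:
        return None
    if expected == "fail" and actual == "pass":
        return "FIX"  # assumed meaning: a known failure now passes
    if expected == "pass" and actual == "fail":
        return "REGRESSION"  # assumed meaning: a known pass now fails
    return "DIFFERENCE"  # any other change, e.g. pass -> skip


def compare(golden, results):
    """Return {test name: label} for every result that deviates from golden."""
    report = {}
    for name, actual in results.items():
        if name not in golden:
            report[name] = "MISSING_REFERENCE"  # hypothetical label
            continue
        label = classify(golden[name], actual)
        if label is not None:
            report[name] = label
    return report


if __name__ == "__main__":
    # Illustrative data only; not taken from ci/pocl/golden.json.
    golden = {"test_a": "pass", "test_b": "fail", "test_c": "skip"}
    results = {"test_a": "fail", "test_b": "pass", "test_c": "skip", "test_d": "pass"}
    report = compare(golden, results)
    for name, label in sorted(report.items()):
        print(f"{label}: {name}")
    # Per the description, any difference marks the check as failing.
    raise SystemExit(1 if report else 0)
```

Treating a "FIX" as a failing check matches the description's statement that any difference fails the check, which forces the golden file to be updated in the same PR.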