Background / Problem
In our pipeline, the dim_classes model adds raw class schedules by attaching instructor details using a LEFT JOIN between stg_classes and stg_instructors. We need to make sure that all of the classes are being created with a valid instructor name attached to them.
Why This Test is Needed
This test acts as an early warning system, catching upstream data entry errors in the warehouse before they are sent to the final dashboard.
Models Involved
- Upstream 1: stg_classes
- Upstream 2: stg_instructors
- Downstream Impact: dim_classes and dashboards.
Business Rule Being Validated
Every non-null instructor_id assigned to a scheduled class must correspond to a valid, existing record in the instructors database.
Implementation Plan
- Navigate to the tests/ directory in the dbt project.
- Create a new singular data test file named: integration_instructor_name_valid.sql.
- Write an SQL test that ensures that each class in stg_classes has an instructor name that can be found in stg_instructors, and look for any orphaned instructor IDs.
Definition of Done
- The singular test file is committed to the tests/ directory.
- Running dbt test --select integration_instructor_name_valid.sql. executes successfully and returns 'pass' (0 records found).
-A PR is opened, peer-reviewed, and merged into the main branch.
Example Failure Scenario
If there is a gym class where no instructor has been added or the name/ID is invalid, then the test should bring back one result for each instance. Theoretically this could happen in the real world if a new instructor was hired, but their details were not added to the system prior to their classes being created.
Background / Problem
In our pipeline, the dim_classes model adds raw class schedules by attaching instructor details using a LEFT JOIN between stg_classes and stg_instructors. We need to make sure that all of the classes are being created with a valid instructor name attached to them.
Why This Test is Needed
This test acts as an early warning system, catching upstream data entry errors in the warehouse before they are sent to the final dashboard.
Models Involved
Business Rule Being Validated
Every non-null instructor_id assigned to a scheduled class must correspond to a valid, existing record in the instructors database.
Implementation Plan
Definition of Done
-A PR is opened, peer-reviewed, and merged into the main branch.
Example Failure Scenario
If there is a gym class where no instructor has been added or the name/ID is invalid, then the test should bring back one result for each instance. Theoretically this could happen in the real world if a new instructor was hired, but their details were not added to the system prior to their classes being created.