dbt Testing complete Guide
What is
dbt Testing:
Several common
generic tests include checking for duplicates, ensuring uniqueness in columns,
verifying that there are no null values, testing referential integrity with
parent tables, confirming data freshness based on a defined timestamp, and
ensuring that columns only contain accepted or default values.
The
most common generic tests include:
2. Uniqueness -
Column has no repeating values
3. Not Null –
There are no null values in a column
4. Referential Integrity -
Test that the column values having an existing relationship with a parent table
5. Freshness:
data should be freshness SLA based on a predefined timestamp
6. Accepted (Default) values –
Check that column contain only pre-defined values
Example:
An employee table having empno,
ename, sal, loc
- empno is primary key and should not contain duplicates and
null values
- sal column not contain 0
- loc column is accepted only default values like (INDIA,USA )
There are two
methods for defining data tests in dbt: singular data test and generic data
test..
2. Generic data test
A singular data test involves a
straightforward SQL query that identifies failing rows. This query can be saved
in a .sql file within the test directory or model.
. Sql file name emp_sal_zero.sal
with emp_test as(
select
totalsal from analytics.emp_fact
where
totalsal <=0
) select * from
emp_test
To run an SQL file, use the
following command:
If the
.sql file is in the model directory:
dbt run
–select emp_sal_zero
dbt test
In the model directory, create
a .yml file and specify unique, not null, and accepted values for each column.
You can define any number of columns in the file.
Example
below
version: 2
models:
- name: emp_fact
database:
mysql
schema:
analytics
description:
"Employee Table"
columns:
-
name: empno
description:
"The primary key for this table"
tests:
- unique
- not_null
-
name: sal_status
description:
"Checking Default values"
tests:
- accepted_values:
values:
['lo','me','hg']
-
name: sal_status
description:
"Checking Default values"
tests:
- dbt_utils.not_accepted_values:
values:
['lo','me','hg']
from {{
model }}
where
{{ column_name }} is null
-
name: columnname
description:
"To Test negative value in sal column"
tests:
-
assert_negative_value
The tests to run can be
selected using the --select flag discussed
# run tests for
one_specific_model
dbt test --select
"one_specific_model"
dbt test --select
"some_package.*"
dbt test --select
"test_type:singular"
dbt test --select
"test_type:generic"
dbt test --select
"one_specific_model,test_type:singular"
# run generic tests limited
to one_specific_model
dbt test --select
"one_specific_model,test_type:generic"
0 Comments
Thanks for your message.