Data Build Tool – SOURCE Tutorial
One of the key features of dbt is the ability to manage your data sources effectively.
Using sources, you can:
1. Select from source tables in your models using the {{ source() }} function
2. Test your assumptions about source data
3. Calculate the freshness of your source data
Step 2: Define Your Sources
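Create a source file: inside your dbt project, go to the models/ directory (or a subdirectory of it, such as models/sources/) and create a new file named sources.yml. dbt reads source definitions from YAML files under the project's model paths (models/ by default), so a dedicated sources/ directory needs to sit inside models/ or be listed in model-paths.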
Define the source: in sources.yml, give the source a name, specify the database and the schema where your source tables exist, and list those tables under the tables key.
Step 3: Referencing Sources in Models
You can use the source function in your dbt models to reference these source tables. Create a new model, e.g., my_model.sql, in the models directory:
with source_data as (
    select * from {{ source('my_source', 'my_table') }}
)
select *
from source_data
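This SQL model selects all records from my_table in my_source. When dbt compiles the model, it replaces the source() call with the full table name built from the database and schema declared in sources.yml (here, my_database.public.my_table).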
Step 4: Run Your dbt Models
dbt run --select my_model
Step 5: Testing Your Sources
dbt allows you to test the validity of your sources. You can add tests in your sources.yml file. For example:
version: 2

sources:
  - name: my_source
    database: my_database
    schema: public
    tables:
      - name: my_table
        description: "This is a table containing user data"
        columns:
          - name: id
            tests:
              - unique
              - not_null
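Freshness, the third capability listed at the top of this tutorial, is also configured in sources.yml. The following is a minimal sketch rather than a definitive setup: it assumes my_table has a timestamp column named updated_at (a hypothetical name) that records when each row was loaded, and the 12 and 24 hour thresholds are illustrative only. The exact placement of these keys can vary between dbt versions, so check it against the version you run.

```yaml
version: 2

sources:
  - name: my_source
    database: my_database
    schema: public
    # Assumed column holding each row's load timestamp (hypothetical name).
    loaded_at_field: updated_at
    freshness:
      # Warn when the newest row is older than 12 hours (illustrative value).
      warn_after: {count: 12, period: hour}
      # Fail when the newest row is older than 24 hours (illustrative value).
      error_after: {count: 24, period: hour}
    tables:
      - name: my_table
```

With a configuration like this in place, running dbt source freshness compares the most recent updated_at value in each table against the thresholds and reports a pass, warning, or error.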
Two example source definitions for sources.yml follow. The first is a generic template:

version: 2

sources:
  - name: my_source
    database: my_database
    schema: public
    tables:
      - name: my_table
      - name: my_other_table

The second declares the emp and dept tables in an analytics schema:

version: 2

sources:
  - name: analytics
    database: mysql
    schema: analytics
    tables:
      - name: emp
      - name: dept
Step 6: Documenting Your Sources
You can provide descriptions for your sources and models in your YAML files, which can then be used to generate documentation:
version: 2

sources:
  - name: my_source
    description: "This source contains important data for analysis"
To generate the documentation and browse it locally, run:
dbt docs generate
dbt docs serve
Questions:
1. How do you run the models downstream of one source?
Use the source: selector with a trailing +, which selects the source and everything that depends on it. To run the models downstream of every table in my_source:
dbt run --select source:my_source+
To limit the run to the models downstream of a single source table:
dbt run --select source:my_source.my_table+