Data Build Tool : dbt Packages and Macros tutorial

Data Build Tool : dbt Packages and Macros tutorial

 

DBT Packages and Macros

 


1. What are dbt packages and why are they used?
2. How to install dbt packages for data transformation?
3. What is the purpose of macros in dbt?
4. How can macros be utilized for testing in dbt?

 Introduction:

Developers often create and store code in libraries. When working on a shared code base with multiple team members, they can search for and use perfected code for specific use cases. This allows them to leverage existing code to fulfill their requirements or gain insight into developing for similar needs.

 DBT Packages Installation Guide:

 In DBT, these libraries are referred to as packages. DBT's packages are powerful because they provide easy access to solutions for common business use cases. Here are some advantages of using DBT packages:

 1. Transformation: DBT packages make it easy to transform data from consistently structured datasets.
2. Reusable code: DBT macros can perform various functions, such as
·         unioning two relations,
·         pivoting columns,
·         constructing surrogate keys,
·         writing audit keys,
·         creating custom schema tests.

 

Why Use DBT Packages?

DBT packages are standalone and their macros solve specific problems, allowing other people to easily use and benefit from them.

 When should you use project dependencies?

Project dependencies are intended for establishing connections between dbt Mesh and referencing other dbt projects within a dbt Mesh setup.

 When should you use package dependencies?

Package dependencies enable you to incorporate source code from another dbt project into your own. Utilize packages.yml to import dbt packages, such as entire dbt projects, into your main or parent dbt project.

 How do I Add Package to my Project?

Step 1: Add /Create YAML file named package.yml/ dependencies.yml to your project. This file should be at the same level as dbt_project.yml file
Step 2: go to hub.getdbt.com à dbt-labs à select required dbt packages
Here I am using dbt_utils
Select below code and add into package.yml file

packages:

  - package: dbt-labs/dbt_utils

    version: 1.1.1

go to terminal and Run dbt deps to install the package.
If packages are stored into GIT server can be installed using the git syntax like
packages:

  - git: https://github.com/dbt_labs/dbt_utils.git

 reversion: 1.1.1

refer below url to get more information
docs.getdbt.com/docs/build/packages
once packages are installed you can see 

 What is macro

Dbt macros are more powerful feature that enabled the reuse of code in sql models,similar to functions in traditional programming languages.

Macros defines in .sql files

 Below is the basic syntax for defining macros in dbt

 {% macro macro_name(arg1, arg2, ..., argN) %}
     SQL logic here, using the parameters as needed.
 {% endmacro %}

 Another example
{% macro cents_to_dollars(column_name, scale=2) %}
    ({{ column_name }} / 100)::numeric(16, {{ scale }})
{% endmacro %}

 dbt Macro Examples

Here are three examples on how to write dbt Macros -

Writing dbt Macros: Example 1 - Simple Date Macro

First, a simple macro that mimics order payment date formatting.
1.   Create a Macro File: In your dbt project, create a file called date_macro.sql.
2.   Write the Macro:

sql




Run Query in dbt Model:

sql




This macro will return the date 2023-08-02, similar to how you might handle order payment dates.

Example 2 - Conditional Logic Macro

This example illustrates how control structures and loops in SQL work using dbt macros.

1.   Create Macro File: conditional_macro.sql.

2.   Write Macro:

sql




Use Macro in dbt Model:

sql




This will return a status level. You can think of it like a list of payment methods in terms of priority.

Example 3 - Advanced Iterative Macro

This example is more complex. Here we will set a variable to iterate through a query in our dbt model using dbt jinja.

1.   Create Macro File: iterative_macro.sql.

2.   Write Macro with SQL and Jinja:

sql




Use Macro in dbt Model:

sql






This will run the query five times, showing the power of Jinja and macros with dbt.

 Dynamic SQL Generation:

{% macro dynamic_pivot(table, values, columns) %}
    SELECT
    {% for column in columns %}
        SUM(CASE WHEN pivot_column = '{{ column }}' THEN {{ values }} ELSE 0 END) AS {{ column }}_value
    {% if not loop.last %},{% endif %}
    {% endfor %}
    FROM {{ ref(table) }}
    GROUP BY pivot_column
{% endmacro %}


============= STEPS TO FOLLOW INSTALL DBT PACKAGES AND USE MACROS IN YOUR PROJECT TO TEST ============


1.Add packages.yml in your dbt_project
2. go to hub.getdbt.com ---> dbt-labs ---> select required dbt packages
3. add code in yout packages.yml file
packages:
  - package: dbt-labs/dbt_utils
    version: 1.1.1
4. After adding the code, go to terminal run "dbt deps" to install the packages. once the installation is successfully , you can view the packages in the dbt_packages\dbt_utils foder within your dbt_project.




All Macros will be added in the dbt_packages folder. To use Macros,create yml file(e.g employee.yml) and add the following code:






How to use macros 
1.create yml file(employee.yml here i am using)
add code like this
version: 2

models:
  - name: emp_fact
    database: mysql
    schema: analytics
    description: "Employee Table"
    columns:
      - name: sal_status
        description: "Checking Default values"
        tests:
          - dbt_utils.not_accepted_values:
              values: ['AVG','MORE HIGH']



Once the yml file is successfully created, run tests using the dbt test command.





Youtube video dbt packages


Post a Comment

0 Comments