Data Build Tool: Realtime Project Scenario - How to Convert Different Date format to Unique date format

Data Build Tool: Realtime Project Scenario - How to Convert Different Date format to Unique date format

In today’s blog, we’ll explore a real-time scenario involving dbt (data build tool) where we are extracting data from ERP system to CSV file that contains various date columns and processing it to DWH. during analytics found that one of date columns presents dates in multiple formats(DD/MM/YYYY  & MM/DD/YYYY), which complicates data processing.

To tackle this, we'll leverage dbt Seed to load the CSV data into a table. Initially, the data types for the date columns will be set to string in our database. However, our end-user has specified a requirement for the dates to be in a standardized format, namely YYYY-MM-DD a universally accepted date format that simplifies date comparisons and operations.

Join us as we walk through the steps to process this CSV file, standardize the date format, and transform our string representation of dates into the desired format using dbt, ensuring that our final dataset meets user requirements.


CSV File Data:



After processing data look like in Table.




Let's create dbt model to convert different date format to unique format(user expected date format)

DBT Model:

{{ config(materialized='table') }}

with date_convert as
(
    SELECT  
    a.tdate,gdate,id,name,
    right(tdate,4) year1,
    SUBSTRING_INDEX(tdate, '/', 1) S1,
    case when LENGTH(SUBSTRING_INDEX(tdate, '/', 1))=1
    then concat(0,SUBSTRING_INDEX(tdate, '/', 1)) else SUBSTRING_INDEX(tdate, '/', 1) end dtd,
    CASE WHEN  LENGTH(SUBSTRING_INDEX(SUBSTRING_INDEX(tdate, '/', 2),'/',-1))=1
    THEN CONCAT(0,SUBSTRING_INDEX(SUBSTRING_INDEX(tdate, '/', 2),'/',-1))
                 ELSE SUBSTRING_INDEX(SUBSTRING_INDEX(tdate, '/', 2),'/',-1) END DTD1
FROM  {{ ref("date_check") }}  a
)
select TDATE,
STR_TO_DATE(
CASE WHEN DTD<=12 THEN CONCAT(YEAR1,'-',DTD,'-',DTD1)
WHEN DTD>12 THEN CONCAT(YEAR1,'-',DTD1,'-',DTD) ELSE NULL END,'%Y-%m-%D' ) AS parsed_date
 from date_convert

MYSQL:

1. How to write sql to convert different date format to unique format?

SELECT tdate, 

coalesce(date_format(str_to_date(tdate,'D/M/YYYY'),'YYYY-MM-DD'),date_format(str_to_date(tdate,'MM/DD/YYYY'),'YYYY-MM-DD')) DT  ,

COALESCE(DATE_FORMAT(STR_TO_DATE(tdate, '%e/%m/%Y'), '%Y-%m-%d') ,

DATE_FORMAT(STR_TO_DATE(tdate, '%m/%e/%Y'), '%Y-%m-%d')) AS formatted_date1

FROM ANALYTICS.DATE_CHECK;


ORACLE Query:

SELECT 

  TDATE,

  CASE

        WHEN REGEXP_LIKE(TDATE, '^\d{2}/\d{2}/\d{4}$') THEN

        TO_CHAR(TO_DATE(TDATE, 'MM/DD/YYYY'), 'YYYY-MM-DD')

        WHEN REGEXP_LIKE(TDATE, '^\d{1}/\d{1}/\d{4}$') THEN

        TO_CHAR(TO_DATE(TDATE, 'DD/MM/YYYY'), 'YYYY-MM-DD')

        WHEN REGEXP_LIKE(TDATE, '^\d{2}/\d{1}/\d{4}$') THEN

        TO_CHAR(TO_DATE(TDATE, 'DD/MM/YYYY'), 'YYYY-MM-DD')

        WHEN REGEXP_LIKE(TDATE, '^\d{1}/\d{2}/\d{4}$') THEN

        TO_CHAR(TO_DATE(TDATE, 'mm/dd/YYYY'), 'YYYY-MM-DD')

    ELSE


      'Invalid Date Format'

  END AS converted_date

FROM DATE_CHECK;


Youtube Video: https://youtu.be/L27_YSk1x4Q

#dbt #mysql #dateformat #realtimescenario

Post a Comment

0 Comments