My Blog List

LOVE IS LIFE

#htmlcaption1 #htmlcaption2 Stay Connected

Pages

Tuesday, June 17, 2014

Populating Slowly Changing Dimensions with SQL Server 2008 MERGE statement

Populating Slowly Changing Dimensions with SQL Server 2008 MERGE statement

MERGE statement

MERGE is a new feature in SQL Server 2008 that provides an efficient way to perform multiple DML operations. Using MERGE statement INSERT, UPDATE or DELETE can be performed on a target table based on the result of a join with a source table.
The MERGE statement has three WHEN clause, each performs a specific DML operation on a given row in the result set
WHEN MATCHED allows you to UPDATE or DELETE the given row in the target table for every row that exists in both source and target table
WHEN NOT MATCHED [BY TARGET] allows you to INSERT a row into the target table for every row that exists in the source table but not in the target
WHEN NOT MATCHED BY SOURCE allows you to UPDATE or DELETE the given row in the target table for every row that exists in target table but not in source table
You can also specify a search condition with each of the WHEN clause to choose the DML operation to be performed on the selected row
OUTPUT clause of the MERGE statement includes a virtual column $action which is used to identify the DML action performed in each row of the result set

Slowly Changing Dimensions

The data in the Slowly Changing Dimensions (SCD) changes periodically. The SCD may be a customer dimension, whereas any new customers can be added or the contact information of the existing customers may be changed.
Earlier this type of dimensions can be handled using multiple DML statements. With MERGE statement in SQL Server 2008 populating SCD can be achieved using a single statement.
Let us take DimCustomer as a dimension table which tracks the customer contact information (phone) for each customer of a particular retail store. Every day thecustomers table in the transactional system is updated with changes to the existing customer or inserting a new customer record. At the end of each month the changes in the customer table are to be applied in the DimCustomer table. A row has to be inserted for each new customer. Existing customers whose contact information changed need to be updated with IsCurrent as 0 and a new record for the existing customer has to be inserted with current contact information.
CustomerKey
CustomerName
Phone
IsCurrent
5011John Smith5650 578 632
1
5012Patrick Brown7451 241 584
1
Table 1: DimCustomer

CustomerKey
CustomerName
Phone
5011John Smith8845 001 863
5013Jimmy7554 846 563
Table 2: Customer
 The following Transact-SQL statement updates the Phone for the customer John Smith and adds a new record for Jimmy in the DimCustomer table.
 INSERT INTO DimCustomer(CustomerKey, CustomerName, Phone, IsCurrent)
  SELECT CustomerKey, CustomerName, Phone, 1
  FROM
    (
        MERGE DimCustomer as trgCustomerDim
        USING Customer AS CustSrc
        ON (trgCustomerDim. CustomerKey = CustSrc. CustomerKey and trgCustomerDim.IsCurrent = 1)
        WHEN MATCHED THEN
            UPDATE SET trgCustomerDim.IsCurrent = 0
        WHEN NOT MATCHED THEN
            INSERT VALUES (CustSrc. CustomerKey, CustSrc. CustomerName, CustSrc. Phone, 1)
        OUTPUT $action, CustSrc. CustomerKey, CustSrc. CustomerName, CustSrc. Phone
    ) AS Changes(action, CustomerKey, CustomerName, Phone)
  WHERE action = ‘UPDATE’;

The output of MERGE statement is
$action
CustomerKey
CustomerName
Phone
UPDATE5011John Smith8845 001 863
INSERT5013Jimmy7554 846 563
The OUTPUT of the MERGE statement is filtered with the action ‘Update’ to insert the new record for the changes in existing customer details. This will make the current information as available and also maintains the old information of the customer.
 The DimCustomer table after processing the MERGE statement would be like this
CustomerKey
CustomerName
Phone
IsCurrent
5011John Smith5650 578 632
0
5012Patrick Brown7451 241 584
1
5013Jimmy7554 846 563
1
5011John Smith8845 001 863
1

No comments:

Post a Comment