- Marek Vavrovic
Using Slowly Changing Dimension for incremental load in SSIS
There are three types of SCD:
Type 1 Fixed attribute. (can return error, if is updated)
Type 2 Changing attribute. (overwrites an old record)
Type 3 Historical attribute. (previous values are saved and marked as outdated)
In this example I'll be using tblEmployees as a OLTP / source table. This table will be updated and the updates will be sent to DimEmployees. This table is a part of the Data Warehouse.
code to create the tables:
IF EXISTS ( SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'tblEmployees') AND OBJECTPROPERTY(id, N'IsUserTable') = 1 )
DROP TABLE tblEmployees
CREATE TABLE tblEmployees(EmpId int, FirstName varchar(50), LastName varchar(50), JobTitle varchar(50))
insert into tblEmployees values(100, 'Thomas', 'Allbutt', 'Software Engineer')
insert into tblEmployees values(101, 'Carl', 'Bigfoot', 'Junior DBA')
insert into tblEmployees values(102, 'Shayma', 'Cupper', 'BI Developer')
insert into tblEmployees values(103, 'Radim', 'Koller', 'Sales Representative')
insert into tblEmployees values(104, 'Adam', 'Klein', 'Marketing Manager')
IF EXISTS ( SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'DimEmployee') AND OBJECTPROPERTY(id, N'IsUserTable') = 1 )
DROP TABLE DimEmployees
CREATE TABLE DimEmployees(Id int identity, EmpId int, FirstName varchar(50), LastName varchar(50), JobTitle varchar(50), StartDate datetime,
tblEmployees (EmpId) serves as a Business key and can be link to DimEmployees (EmpId).
LastName column will be set up as Changing Attribute. (changes will be overwritten. )
JobTitle column will be a Historical Attribute. (changes will be saved and marked using EndDate column to identify outdated records)
As a source table I am using tblEmployees (OLE DB Source). Next task will be Slowly Changing Dimension from Common. Double click on SCD to configure it.
Add a destination / target table DimEmployees. Select a dimension table to load and map columns in the transformation input to columns in the dimension table. All we need to do here is mark the Business key.
Select a change type for slowly changing dimension columns
FirstName will be fixed attribute, means this record should not change at all.
LastName is a Changing attribute, when tblEmployees will be updated, DimEmpoyees record will be updated after the updated record will be loaded.
JobTitle is a Historical attribute, if any change occurs, after the Package execution there will be a record about it in the EndDate column.
Using StartDate and EndDate columns to identify current and expired records.
Variable to set date values: using system variable StartTime to set the values in the StartDate col.
Finish the configuration and execute the package. It will insert all the records for source table tblEmployees into destination DimEmployees.
Updating record with Changing Attribute. I am going to update record EmpId 101, to change the LastName to "von Banhoff" in tblEmployees table. Then I will run the package and record in DimEmployees will be automatically updated.
set LastName = 'von Banhoff'
Updating Historical Attribute JobTitle. First I run the update statement in SQL server.
set JobTitle = 'Senior DBA'
After that I will run the SSIS package. New record will be loaded and the expired record will be marked with a date time entry.