• Marek Vavrovic

SSIS Append data

  1. Union All Transformation

  2. Merge

  3. For Each Loop Container


Appending data using Union All Transformation


Sorted input not requires

Number of inputs unlimited

Sorted output does not generate


I have 2 tables I want to append. Each table is in a different database on the same server instance.



OLE DB Source

Both inputs file using ole db source connection managers.

Union All Transformation

I have 2 inputs and 1 output

Flat file destination

- create you .csv file on the file system

-connect to it.

- check or uncheck overwrite the data (depends on what you want)

-Column names in the first data row

-Code page 1252


run the package


see the result




Merge


I want the append the same tables as before, this time using the Merge transformation. The final solution is like Union All transformation. I am going to have 2 inputs and generate 1 output.


The main differences

Union All Merge

Sorted input: not requires requires

Number of inputs: unlimited only two

Sorted output: not generates generates


Right click on the OLE DB Source > Show Advanced Editor

Merge transformation requires sorted inputs.

set property IsSorted: True


Go to Output Columns > customer_id > SortKeyPosition = 1

Repeat the same process for Mexico data source.

Merge transformation

Map the columns. This will be done automatically if both sources using the same attribute names.

OLE DB Destination

You can choose destination you want. You can send data into DB or a flat file.



For Each Loop Container



I will be using For each ADO Enumerator to loop over database and table name in SSISObject table.


Create 4 variables


SSISobj holds table SSISObject

DBName holds values from DBName column in SSISObject table

TableName holds values from TblName column in SSISObject table

Increment variable will be used on the end for Execute SQL Task. Will come back to this variable later.

Using For each container we will be iterate between 2 databases.

  1. create connection manager to one of them

  2. go to expressions

  3. Use InitialCatalog property (stands for database name)

  4. use variable @DBName (DBName value will change on the run time)

Execute SQL Task

ResultSet: Full Result set

SQLStatement: Select * from SSISObject

Even though the tables come from 2 different databases to connect to a specific table I'll be using the same connection manager and its value will change during the run time process.



This will return a table. We need to go to Result Set a map that result to an object variable I create in the beginning.


Foreach loop container

choose Foreach ADO Enumerator

supply a source variable: SSISobj


Variable Mapping

We need to map the column names from SSISobj variable which holds table SSISObject

Data Flows

for Flat file destination loading


I've added three data flows inside the Foreach loop container.

Data flow 1 is just a supporting data flow and doesn't contain any transformations, other words is completely empty

Data flow 2: Destination table will be a flat file. First row as headers will be applied.

Data flow 3: Destination table will be a flat file. Do not check use first row as headers. I use a separate flat file connection manager to achieve this. DF 2 and DF3 will have its own flat file connection managers.

Data Flow: Headers

Using variable name for a source table. Table name will change on the run time.

Flat file Destination

do not overwrite this file and allow headers

The first flat file connection manager is created. This one will allow headers.



Now I am going to use Data Flow: No Headers to create a Flat file connection manager for destination that won't use first row as headers.

The second table is in DBTwo database, but this connection will work fine because I am using and expression for InitialCatalog property, so it will change on the run time. And I am using the same variable for table name.


The second flat file connection manager is pointing the same file Results.csv. I won't be loading the headers.




Add Execute SQL Task, double click on the Precedence Constraint coming out from Data Flow Headers and use Logical OR


I am going to use Increment variable inside the Execute SQL Task

Use SQLStatementSource property to create an expression:


" Select "+ (DT_WSTR, 10) @[User::Increment] +" +1 "


this expression will incrementally increase its value each time the foreach loop container iterates.

Its initial value is set to 0, after the first iteration value will increase about 1.

Map the Result set to the Increment variable


General tab

ResultSet: Single row

This is the funniest part. When we run the package the first iteration will go on the left side. Increment variable value will be equal to zero and the Flat file will be loaded from DBOne database, with Mexico table data. Then the value will change from zero to one (Select 0+1) and the data Flow on the right side will execute. Data from database DBTwo and table called Canada will be loaded into the flat file. This time with no headers.


Running the package




Conclusion: It is little bit easier with Union All transformation.

66 views0 comments