Click here to Skip to main content
15,885,873 members
Articles / Database Development / SQL Server / SQL Server 2012
Article

SSIS Import Text File Using Validation

Rate me:
Please Sign up or sign in to vote.
5.00/5 (5 votes)
6 Sep 2015CPOL8 min read 19.9K   419   3  
Import text file data to SQL database using foreign key relation and validation.

Introduction

In this tip we will see how to import a text file with foreign key relation and validation by using SSIS. All the error records will stored in a separate table.

Background

Before starting this tip, I recommend you to read the previous post: SSIS - Import Text File

Using the Tools and Code

  • Database : SQL Server 2012 R2
  • BI : SQL Server Data Tools 2010
  1. Create sample file
  2. Create database tables
  3. Create SSIS package
  4. Sample file analysis & expected output
  5. Final output

Let's Get Started

Part 1: Create Sample File

1. In this example we are going to use a student, details text file.

Student Name, Register No and Department Name are the columns.

Image 1

Part 2: Create Database Tables

1. Open SQL Server Management Studio

2. Create a database "TestDB"

3. Create a table "SSIS_Department" with the following columns

  • DepartmentId
  • DepartmentName

Set DepartmentId as a primary key and set Identity Specification is "yes"

C++
CREATE TABLE [dbo].[SSIS_Department](
	[DepartmentId] [bigint] IDENTITY(1,1) NOT NULL,
	[DepartmentName] [varchar](10) NOT NULL,
 CONSTRAINT [PK_SSIS_Department] PRIMARY KEY CLUSTERED 
(
	[DepartmentId] ASC
)
) ON [PRIMARY]

4. Create a table "SSIS_Student" with the following columns

  • StudentId
  • StudentName
  • RegisterNo
  • DepartmentId

Set StudentId as a primary key and set Identity Specification is "yes"

Set SSIS_Deprtment table DepartmentId as a foreign key of SSIS_Student table DepartmentId

C++
CREATE TABLE [dbo].[SSIS_Student](
	[StudentId] [bigint] IDENTITY(1,1) NOT NULL,
	[StudentName] [varchar](50) NOT NULL,
	[RegisterNo] [varchar](5) NOT NULL,
	[DepartmentId] [bigint] NOT NULL,
 CONSTRAINT [PK_SSIS_Student] PRIMARY KEY CLUSTERED 
(
	[StudentId] ASC
)
) ON [PRIMARY]

ALTER TABLE [dbo].[SSIS_Student]  WITH CHECK ADD  CONSTRAINT [FK_SSIS_Student_SSIS_Department] FOREIGN KEY([DepartmentId])
REFERENCES [dbo].[SSIS_Department] ([DepartmentId])

ALTER TABLE [dbo].[SSIS_Student] CHECK CONSTRAINT [FK_SSIS_Student_SSIS_Department]

Part 3: Create SSIS Package

1. Start the SQL Server Data Tools (Microsoft Visual Studio 2010 Shell)

2. Go to File -> New -> Project

3. Select  Business Intelligence, from the Installed Templates option and project as Integration Services Project and give the Project Name.

Image 2

4. Drag two data flow task from tool box to Control Flow Tab. For easy understanding change the first data flow task as "Department" and second data flow task as "Student".

Image 3

5. Create a Flat File Connection to read text file and OLE DB Connection to insert ext file data to database. Right click on Connection Manager Pane and select New Flat File Connection.

Image 4

  • Flat File Connection Manager General settings: Specify a connection manager name, specify the sample file location earlier we created and leave the remaining settings.

Image 5

  • Flat File Connection Manager Columns settings: Columns setting will display all the available columns in the sample file.

Image 6

  • Flat File Connection Manager Advanced settings: In the advanced settings, we can add validation. Here we are going to add a validation to RegisterNo column. Select RegisterNo column and set the OutputColumnWidth to 5.

Image 7

  • Preview is used to view the sample file. Then click Ok to create the connection manager.

6. Now let's create the OLE DB Connection manager. Right click on Connection Manager Pane, create a new OLE DB Connection.

  • In the connection manager window specify the Server name.
  • Since I am using my local server I have used ".", otherwise need to specify the server name.
  • Select "Use SQL Server Authentication" option and provide user name and password.
  • In "Select or enter a database name" select the "TestDB" database. Earlier we have created the tables in TestDB database.
  • Verify the connection established successfully or not by Test Connection button.

Image 8

  • Finally click Ok button to create OLE DB Connection manager.

 

7. In this example we are going to load student data from text file. But in our database design SSIS_Student table using DepartmentId. It means all the distinct DepatmentName available in the text file, insert into SSIS_Department table, then using a lookup function to find the DepartmnetId of the DepartmentName to insert the student details.

8. Earlier we have included a validation, RegisterNo restricted to five characters. If a record's RegisterNo with more than five characters, it will load into the error data table.

9. Ok, let's finish the Department part.

  • Right click Department data flow task and select Edit. It will go to Data Flow tab.
  • Now place a Flat File Source from the tool box. Right click on Flat File Source and edit. It will show the Flat File Source Editor.

Image 9

  • In the connection manager setting: select the Flat file connection manager name in the dropdown.
  • In the Columns setting: It will show all the available External Columns. Now we are going to deal with department name only. So select DepatrmentName

Image 10

  • Finally Click Ok button.

10. Now place a Sort Transformation from the SSIS tool box. Connect flat file source output to Sort transformation input.

Image 11

  • Right click on Sort transformation and edit. It will show the Sort Transformation Editor. Specify any sort type and sort order and select "Remove rows with duplicate sort values" checkbox.

Image 12

  • Finally click ok button.

11. Now place an OLE DB Destination to insert the department names.

Image 13

  • Right click on OLE DB destination and select Edit. It will show the OLE DB destination editor.
  • In the connection manager setting: Select "OLE DB connection manager" name from the dropdown list, Select "Data access mode" to "Table or view - fast load", Select "Name of the table or the view" to "SSIS_Department".

Image 14

  • In the mapping setting: map the respective input column to destination column.

Image 15

  • Finally click ok button.

12. Now we will start student details part.

  • Go to Control Flow tab, right click Student data flow task and select Edit. It will go to Student Data Flow tab.
  • Now place a Flat File Source from the tool box. Right click on Flat File Source and edit. It will show the Flat File Source Editor.
  • In the connection manager setting: select the Flat file connection manager name in the dropdown.
  • In the Columns setting: It will show all the available External Columns. Select all the available external columns.

Image 16

  • In the Error Output setting : Set all the columns Error and Truncation to Redirect row

Image 17

  • Finally Click Ok button.

13. This time Flat File Source has two outputs. The Blue color indicates the success records and the Red color indicates the error records.

  • Now place an OLE DB Destination to insert the error records. Chnage as "Error Destination". Connect the Red color output of the flat file source to input as the Error Destination.

Image 18

  • Right click the Error Destination and select edit. It will show the OLE DB destination editor.
  • In the connection manager setting: Select OLE DB connection manager name from the dropdown list, Select data access mode to "Table or view - fast load". In the "Name of the table or the view" dropdown we need to select the table to insert the error records, but still we didn't create any table to insert the error records.

Image 19

  • To insert the error records, click "New..." button next to the "Name of the table or the view" dropdown. It will show the table structure to insert the error records.

Image 20

  • Copy the code and use SQL management studio to create a table. Also include an primary key filed.
C++
CREATE TABLE [SSIS_Error] (
    [ErrorId] [bigint] IDENTITY(1,1) NOT NULL,
    [Flat File Source Error Output Column] varchar(max),
    [ErrorCode] int,
    [ErrorColumn] int,
	CONSTRAINT [PK_SSIS_Error] PRIMARY KEY CLUSTERED 
(
	[ErrorId] ASC
)
)
  • After created "SSIS_Error" table, set the "Name of the table or the view" dropdown to "SSIS_Error".

Image 21

  • In the mapping setting: map the respective available input column to available destination column.

Image 22

  • Finally Click Ok button.

14. We have finished the error records capturing part. Now we are going to process the success records.

  • To process the success records we need to do a lookup transformation to find the DepartmentId for the DepartmentName.
  • Place a lookup transformation from the tool box and connect the blue color output of the flat file source to input as the lookup transformation.

Image 23

  • Right click on lookup transformation and select edit. It will show the lookup transformation editor.
  • In general setting: select Cache mode as "Full cache" and Connection type as "OLE DB connection manager"

Image 24

  • In Connection settings: Select OLE DB connection manager name from the dropdwon and set the Use a table or view to SSIS_Department table.

Image 25

  • In the columns setting: Map the available input columns "DepartmentName" to available lookup columns "DepartmentName" and select available lookup columns "DepartmentId" checkbox.

Image 26

  • Finally Click Ok button.

15. Now place an OLE DB destination and change the name as "Student Destination". Connect the lookup match output to Student Destination input.

Image 27

  • In the connection manager setting: Select OLE DB connection manager name from the dropdown list, Select data access mode to "Table or view - fast load". In the "Name of the table or the view" dropdown select "SSIS_Student".

Image 28

  • In the mappings setting: map the respective input column to destination column.

Image 29

  • Finally Click Ok button.

Part 4: Sample File analysis & Expected Output

We have finished our SSIS package design. Before execute this package, take a look on our input sample file.

Image 30

  • In our package Department part will execute first.
  • So CSE, ECE, EEE and IT data will insert into the SSIS_Department table. (Four records)
  • Then Student part will execute.
  • While processing student records, we have placed the RegisterNo length restriction to five characters. But in our sample file last record (Watson) RegisterNo exceeds the limt. So that record should not load into SSIS_Student table. It should be in SSIS_Error (One record) table.
  • Remaining student records will insert into the SSIS_Student table with DepartmentId. (Four records)

Part 5: Final Output

Run the SSIS package. If we don't encounter any error in the project, the result will be green color. If we have any issue, the result will indicate in red color.

Error SSIS package execution.

Image 31

By viewing the progress tab, we can findout the error.

Image 32

Total SSIS package execution.

Image 33

Department part execution output.

Image 34

Output showing that, from Flat file source five departments returned, after apply sort and remove duplicate sort values, four records are passed to the destionation to insert.

Student part execution output.

Image 35

From the flat file source, out error record details passed to "Error Destination", remaining four student details are passed to lookup transformation to find the DepartmentId and then passed to the destination to insert.

Lets chek in the database.

SSIS_Department records.

Image 36

SSIS_Error records.

Image 37

SSIS_Student records.

Image 38

Image 39

History

1st Version : 2015-09-06

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



Comments and Discussions

 
-- There are no messages in this forum --