Click here to Skip to main content
15,881,139 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
So, I have been given a task at work to create a data migration tool for the new version of my company's software. The purpose of this tool is to help the clients that use the software migrate data from table(s) in the old database to the new database.

I have already worked out the mapping for the whole process but the challenge I am having is, the old and the new system have both got relationships to other tables so I have to migrate both the data and the relationships from the old system. To make matters even more annoying the actual table in the old database which this migration tool will get data from, has a couple of columns that indicate if a Record belongs to a certain category and this columns have a 1 or 0 as data to show when a record belongs to that category or not.However, the new system is more normalized and we have a Category_Table (which holds the category description and category_type ID) and Category_Types table (which hold the Record ID, and category_type ID). These sort of complex issues is getting to my head and is mentally exhaustive.

Now the exact kind of help i need is, from someone who has built a data migration tool before or something similar to give me some advise on how to start this project and maybe several tips that could be useful to me or even tutorials. I am still new to enterprise software development and still trying to get use to complex projects so any helpful tip is appreciated.

NB: The database server we use is Sybase SQL Anywhere 16 while the old system is Sybase SQL Anywhere 10
Posted
Updated 8-Oct-13 5:06am
v2
Comments
Prasad Khandekar 8-Oct-13 10:48am    
Hello,

Every applications database is different and there is no single tool to get it done without custom development. You already have the necessary know how about your old and new database and various relationships between the tables. Since your old schema is using relations in a different manner (not via foreign keys as it appears from the example you have provided) you will have to do it all manually.

To begin with you will have to isolate static data (master setup) starting with the table with least relationships and then read record from old table and insert it new table possibly with additional foreign key values. Repeat it for each table to complete the migration.

Please remember though the program you will developing will not be usable with that database pair only.

Regards,
Regards,
Uzoma Umekwe 8-Oct-13 11:01am    
Hi Prasad
Just to clear things.The old schema uses relations and it is via foreign keys. For now, the purpose of the tool is to migrate details from a Table called NAMESTable in the old system to several tables in the new system according to how the data has been mapped. The NAMESTable in the old system just contains too much information but in the new system we have normalized the database design so that all the data in NAMESTable can be split into several tables. This is the current situation
Prasad Khandekar 8-Oct-13 11:12am    
Perfect. From your explanation it means that it's very likely that a row from one table in the old schema may result in rows getting created in multiple tables of the new schema & that will be your translation/migration logic. So as I said start with a table with lowest relations and transfer data from it row by row into new schema table(s). You can very well do it using C#. You already have the necessary mapping info available with you. This will ensure that all your tables gets populated appropriately and there will be very few or none foreign key mismatches.

Don't try to write/create a generalized tool. Instead focus only on specific transformations applicable to your new & old schema only.

You can even use third party ETL tools like Clover, Talend Open Studio and of course the SSIS.

Regards,

One of the Best tools for data migration is SSIS (SQL Server Integration Service) which designed for Extract Transform Load(ETL) the data,
Check the links and start your creating your ETL package:

SSIS Overview - Part I[^]
Steps to Create your First SSIS Package[^]

http://technet.microsoft.com/en-us/library/ms169917.aspx[^]
 
Share this answer
 
Comments
Prasad Khandekar 8-Oct-13 11:13am    
My 5+
Mehdy Moini 8-Oct-13 12:59pm    
Your nice 5+ ;-)
You need to start this project with some research. And, one of the best technical means you can possibly get would be a sandbox where you can create databases from scratch and develop your technique without worries and waste of time.

It looks like your problem is called "welcome to the relational algebra word", sounds very usual. So, first of all you need to develop some mathematical model of mapping two databases with different database schemas in a semantic way. First, try to determine where the information on the schema and semantics is stored. The semantics is only stored in a head of a developer (maybe accompanied with some documentation, and somehow reflected in semantic application code), but a schema (metadata) is stored in a database itself. You can develop the code to retrieve it from the database, but it depends on the particular RDBMS.

Now, about the complexity of this problem itself. Did you ever heard of the "algorithm of the invention"? One of the approaches is: replace your problem, which seems to be too difficult, with the generalization of this problem. Naturally, if you solve a more general problem, it will automatically solve your current problem. But how it's possible that a more general problem, which is more general and by definition is at least as difficult as the particular one, can be easier for resolution? Wonderfully, there are so many cases when this approach helps. The difference is mental: looking at a particular problem clouds our vision by some particular detail which only look important but in fact are unrelated to the real problem and only distracts you from the real resolution. Your Category_Table (using the word "Table" in the name of a table is ridiculous: who invented it?! This is the same as if Microsoft gave different names to executable files: instead of "WinWord.EXE" it would be "WinWord_Application", and, instead of "IExplore.EXE" it would be "IEApplication.exe" :-); in future, never use such names) and Category_Types are just an example of cross-related tables.

It leads to another idea: try to abstract out from the semantics of the problem. If you did not know that the category tables contains "categories", what would you do? Probably, you would be able to formalize it, requiring some additional information, also abstracted from semantics: how some tables in old database is formally mapped onto some other tables in other tables of a new one?

And, finally, creation of a generalized problem, from the other hand, should lead to reduction of the size of the schema. This along can lead you to better understanding of how you can resolve the problem. Replace the model which currently involves tens (or hundreds?) of tables in each of the two databases, with more general one, which however operates with four tables in old database and five (or so) in the new one. Isolate principle complexity from complexity of big numbers, they are often unrelated.

One more "replacing" advice: mentally, replace your adjective "annoying" with "challenging". Consider every hassle as an opportunity.

—SA
 
Share this answer
 
v2
Comments
Prasad Khandekar 8-Oct-13 11:35am    
Excellent, 5+
Sergey Alexandrovich Kryukov 8-Oct-13 11:38am    
Thank you, Prasad.
—SA
Uzoma Umekwe 10-Oct-13 9:32am    
@Sergey Alexandrovich Kryukov. Thanks for this very good guide..It was definitely helpful. I have started building the tool now and gradually making progress.
Sergey Alexandrovich Kryukov 10-Oct-13 12:39pm    
That is great.
Good luck, call again.
—SA

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900