Click here to Skip to main content
15,867,330 members
Please Sign up or sign in to vote.
2.50/5 (2 votes)
See more:
I am supposed to create a C# solution to compare the data (table, column and row) from two different databases and update the 2nd one as follows:
1. with just the rows that are updated recently (incase of duplicates)
2. check for new/updated column in 1st one and update/delete in 2nd accordingly or drop the table entirely and copy it from 1st one.
3. check for new/updated table in 1st one and update/delete in 2nd accordingly or drop the table entirely and copy it from 1st one.
So basically I have to synchronize the data between the two databases.

As of now, I have just got to connecting the database part. I am not sure how to proceed (should I be dumping the data from the database and checking or is there a simpler way?).!!!

Any help would be appreciated.

Thank you!
Posted

Dumping might be the easiest way to go here, although it can get data and time consuming quite quickly. So I'd suggest to just update and insert the changes even if it's a bit more complicated - especially if you're going to work on big tables.

For a start: Are the two databases on the same server? If that's the case, you might be able to handle this with SQL using something like this:

SQL
-- Update existing rows
UPDATE table1, table2
SET table1.col1 = table2.col1,
    table1.col2 = table2.col2,
    table1.col3 = table2.col3,
    ...
    table1.col_last = table2.col_last
    -- do not use the id field here!
WHERE table1.id = table2.id AND table1.time_stamp <> table2.time_stamp

-- Insert new rows
INSERT INTO table1 (col1, col2, col3, ... col_last) -- use id field here!
SELECT col1, col2, col3, ... col_last FROM table2
WHERE NOT EXISTS (SELECT * FROM table2 WHERE table2.id = table1.id)


I'm not familiar with MySQL and I cannot test it. So please do some testing before using this!

If you need two different connections, it get's ugly. I'd try to do something similar as the SQL statement implies to do:

  • Get lists of all ids and their timestamps:

    1. List A from table1 (target)
    2. List B from table2 (source)

  • Filter the lists to find:

    1. List1: ids that exist in A and B but differ in timestamp
    2. List2: ids in B that do not exist in A

  • Load all entries from table2 where id is in List1 and update those in table1
  • Load all entries from table1 where id is in List2 and insert those in table1


The reason behind loading only id and timestamp is to safe memory and query / transfer time so the code doesn't have to handle all the data just to determine where the changes are. In C# that could look something like this:

C#
class IdTsEntry
{
    public int Id { get; set; }
    public DateTime TimeStamp { get; set; }
}

using (DbCommand cmd1 = conn1.CreateCommand())
using (DbCommand cmd2 = conn2.CreateCommand())
{
    // Load target list's ids and timestamps
    cmd1.CommandText = "SELECT id, time_stamp FROM table1";
    List<idtsentry> A = new List<idtsentry>();
    using (DbDataReader reader = cmd1.ExecuteReader())
    {
        while (reader.Read())
        { // I assume the fields are set to NOT NULL
            A.Add(new IdTsEntry() {
                Id = reader.GetInt32(0),
                TimeStamp = reader.GetTimeStamp(1)
            });
        }
    }
    // Load source list's ids and timestamps
    cmd2.CommandText = "SELECT id, time_stamp FROM table2";
    List<idtsentry> B = new List<idtsentry>();
    using (DbDataReader reader = cmd2.ExecuteReader())
    {
        while (reader.Read())
        { // I assume the fields are set to NOT NULL
            A.Add(new IdTsEntry() {
                Id = reader.GetInt32(0),
                TimeStamp = reader.GetTimeStamp(1)
            });
        }
    }

    // Filter lists
    List<int> List1 = new List<int>();
    List<int> List2 = new List<int>();
    foreach (IdTsEntry b in B)
    {
        var a = A.FirstOrDefault(e => e.Id.Equals(b.Id));
        if (a == null) 
            List2.Add(b.Id); // b.id not in A -> new row
        else if (!a.TimeStamp.Equals(b.TimeStamp))
            List1.Add(b.Id); // b.id in A but other timestamp -> altered row
    }

    // Update altered rows
    string ct1 = "SELECT col1, col2, ... col_last FROM table2 WHERE id = {0}";
    string ct2 = "UPDATE table1 SET col1 = @val1, col2 = @val2, ... " + 
        "col_last = @val_last WHERE id = {0}";
    foreach (int id in List1)
    {
        // Read all entry values into parameters
        cmd1.CommandText = String.Format(ct1, id);
        cmd2.Parameters.Clear();
        int i = 0;
        using (DbDataReader reader = cmd1.ExecuteReader())
        {
            if (!reader.Read()) continue;
            for (int n = 0; n < reader.FieldCount; n++)
            {
                cmd2.Parameters.Add(String.Format("val{0}", ++i),
                    reader.IsDBNull(n) ? DBNull.Value : reader.GetValue(n));
            }
        }
        // Update row
        cmd2.CommandText = String.Format(ct2, id);
        cmd2.ExecuteNonQuery();
    }
    // Insert new rows
    ct1 = "SELECT id, col1, col2, ... col_last FROM table2 WHERE id = {0}";
    ct2 = "INSERT INTO table1 (id, col1, col2, ... col_last) " +
        "VALUES (@val1, @val2, @val3, ... @val_last)";
    cmd2.CommandText = ct2;
    foreach (int id in List2)
    {
        // Read all values into parameters
        cmd1.CommandText = String.Format(ct1, id);
        cmd2.Parameters.Clear();
        int i = 0;
        using (DbDataReader reader = cmd1.ExecuteReader())
        {
            if (!reader.Read()) continue;
            for (int n = 0; n < reader.FieldCount; n++)
            {
                cmd2.Parameters.Add(String.Format("val{0}", ++i),
                    reader.IsDBNull(n) ? DBNull.Value : reader.GetValue(n));
            }
        }
        // Insert row
        cmd2.ExecuteNonQuery();
    }
}
</int></int></int></int></idtsentry></idtsentry></idtsentry></idtsentry>


Like the SQL statement: I haven't tested this and I wrote that quickly. Hopefully it'll help you though. You can also alter the WHERE statements to cover only the date range after the last sync - that'll speed things up.
 
Share this answer
 
Comments
RevathySanthanam 9-Apr-15 8:16am    
Wow, That's really an amazing solution. Thanks alot Bjørn.!! So far I have been working with both databases in same server only.
and I have been trying to make the code you gave work and so far so good until now.. I am unable to understand what this line will do: <pre>
"// Read all entry values into parameters
cmd1.CommandText = String.Format(ct1, id);" </pre>
because I have an error thrown when I try to execute the Command that's following that line..!
The Syntax error points to this line (at id = {0} exactly):

// Update altered rows
string ct1 = "SELECT col1, col2, ... col_last FROM table2 WHERE id = {0}";

I'm not sure what the issue is because the value for id from {0} is something that is not present in any of the tables. So I don't know how and from where that value is assigned.
I will post the code I re-framed according to my project from yours, for better understanding. Could you please help me out?
Bjørn 9-Apr-15 8:56am    
Hello, there must be something else going wrong because that line simply stores a string in a variable - there's no processing there. Posing the code might help and additionally the Exception that you get.

cmd1.CommandText = String.Format(ct1, id);

This line is meant to set the command text to a SELECT command that will give back the one data row for the current id. As I'm looping through the updated element's ids the command basically stays the same - only the WHERE clause has to be modified so I prepare the SELECT command with the placeholder {0} and the String.Format() is meant to replace it with the current id.

Hope that brings some light into that part of the code :D

Another issue that might arise is that I use DbCommand, DbDataReader and so far. I've only worked with MS SQL, SQLite and SqlCe so far and there this seems to work, but there might be some differences when working with MySQL?!

Greetings!

Bjørn
RevathySanthanam 9-Apr-15 9:10am    
Yeah, i gathered that something else is the issue over there. but i have no clue as to how the id could be something that is not present in any of the tables. So how do i check from where that value is assigned? :( been stuck with it for long time!
and yes, that was a very nice explanation for 'String.format', so much better than Google. I understood that now. Thank you.
oh and yeah, DBDataReader is simply MySqlDataReader in MySql..
So, do you have any idea about the 'Id assignment issue' from that Select Statement..!! (the code is posted below and the exception is a MySql Syntax Error mentioned above) :( please!!
Bjørn 9-Apr-15 9:19am    
Hello again, and sorry, I haven't seen your second post. The first thing that struck my eye is that there don't seem to be spaces between FROM, tbl2 and WHERE and that you are using @val7 although it must be @val3 the line below. Please take care you're creating a correct SQL command because the Exception does seem to point that way. There are a few more cases where there seem to be spaces missing when you use your variables. It seem you're getting

SELECT variante,charge,DATE_FORMAT(fdate, '%Y-%M-%D %H:%i:%s') FROMtbl2WHERE fid = {0}

where it should be

SELECT variante,charge,DATE_FORMAT(fdate, '%Y-%M-%D %H:%i:%s') FROM tbl2 WHERE fid = {0}

To avoid further errors, try a more simple approach: Instead of using this line

cmd1.CommandText = String.Format(ct1, id);

use that one:

cmd1.CommandText = "SELECT variante,charge,DATE_FORMAT(fdate, '%Y-%M-%D %H:%i:%s') FROM " + tbl2 + " WHERE fid = " + id.ToString();

and instead of using

cmd2.CommandText = String.Format(ct2, id);

use

cmd2.CommandText = "UPDATE " + tbl1 + " SET variante = @val1,charge = @val2, fdate = @val3 WHERE seriennummer = " + id.ToString();

That should do the same and you don't need ct1, ct2 and that String.Format() anymore. The same can be done in the second loop.
RevathySanthanam 9-Apr-15 10:28am    
wow, i did not notice the spacing issue :) Thank you.
So i fixed the errors i made and the ID assignment issue is gone.!! but i have a new problem now.!
it keeps throwing me Nulldata Exception in the first while( here:

// I assume the fields are set to NOT NULL
A.Add(new IdTsEntry()
...)

but there are no Null values in my db, i even set the columns to NotNull property. Not sure why.. :( any help plzz??
hello Revathy

if You have an identity column and a time-stamp(which contains time of row update) column in table then...
you can retrieve list of rows which are updated after a particular time stamp
and then you can update or add new rows in other db table [keep id of row same in second db].
you can set sql job which runs on specified time and synchronize data.

Happy Coding!
:)
 
Share this answer
 
Comments
RevathySanthanam 2-Apr-15 4:30am    
Thank you Aarti.
yes, i have ID and Timestamp columns. and yeah, that is how i planned to do that part too.
but the thing is, how do i check for new updates in Db1 and perform the respective changes to Db2 in C#?
any ideas?
Aarti Meswania 2-Apr-15 5:14am    
create store procedure for insert and update records.
and set an event scheduler which automatically run on desired time and sync. data.
please visit link..
https://dev.mysql.com/doc/refman/5.1/en/events.html
RevathySanthanam 2-Apr-15 7:22am    
but i am unable to do an insert statement with update in MySQL. how can i do that? i tried dis to get the most recently updated id,
SELECT distinct id
FROM `table1` order by `date` desc limit 1;
now how can i do an insert on table2 (has same schema as table1) with this id and the respective other columns?
table2 is the copy of unique data from table1.
any inputs?

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900