Is it possible to update ~5000 columns per second in SQL Server (either using wide tables or columns spread across multiple tables)? If yes, what approach should I follow?

Thanks in advance

It's kinda hard to answer that question - if all the columns are in one table, then it's just a single update statement and the speed is solely a factor of the hardware performance of your database server.

If the columns are spread across a number of tables, I'd expect to use a stored procedure that did the update work and received all those parameters from the application. But you can't pass 5000 parameters to a stored procedure - SQL Server caps a single call at 2,100 parameters - so you're having to break the command up into multiple (maybe even 100 or more) distinct invocations. Heavy-ish on the application and the network, but not so heavy as to push the work over 1 second. A rough sketch of one such chunked call is shown below.
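
Purely as an illustration - the table, procedure and column names here are my assumptions, not anything from the original question - each chunk could be a procedure that updates a fixed subset of the destination row's columns, with roughly 100 such procedures covering all 5000 columns:

-- Hypothetical chunk procedure: updates a fixed subset of the wide row's columns.
CREATE PROCEDURE dbo.UpdateChunk01
    @RowId   INT,
    @Col0001 INT,
    @Col0002 INT,
    @Col0050 INT    -- ... in practice, ~50 parameters per chunk
AS
BEGIN
    UPDATE dbo.WideTable
    SET Col0001 = @Col0001,
        Col0002 = @Col0002,
        Col0050 = @Col0050
    WHERE RowId = @RowId;
END;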

Now you have to look at 100+ commands, each passing related data to the database, and decide what the best architecture is. You probably need to ensure that the row(s) that will receive that data don't present a partial/incomplete picture to the app(s) that read that/those rows.

You could run the commands in a transaction, and have the application(s) that read the row use a safe isolation level.
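
As a rough sketch (all object names assumed, not from the post): the writer wraps its chunked commands in one transaction, and readers opt into SNAPSHOT isolation so they only ever see the last committed version of the row.

-- Writer: the whole batch of chunked updates commits (or rolls back) as one unit.
SET XACT_ABORT ON;
BEGIN TRANSACTION;
    UPDATE dbo.WideTable SET Col0001 = 1, Col0002 = 2 WHERE RowId = 42;
    UPDATE dbo.WideTable SET Col0003 = 3, Col0004 = 4 WHERE RowId = 42;
    -- ... remaining chunked commands ...
COMMIT TRANSACTION;

-- Reader: SNAPSHOT isolation returns the last committed row version, so an
-- in-flight writer is never visible. SNAPSHOT must first be enabled with
-- ALTER DATABASE ... SET ALLOW_SNAPSHOT_ISOLATION ON.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
SELECT Col0001, Col0002, Col0003, Col0004
FROM dbo.WideTable
WHERE RowId = 42;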

What I'd do, however, is create a working table that will temporarily hold the data from the application; 100 commands can insert into this table without affecting the state of the rest of the database. Each row inserted will have an application-generated pseudo-transaction id. When all data is transferred, a final command invokes a stored procedure, taking perhaps just one parameter (the pseudo-transaction ID), perhaps more (row identifier(s) for the destination rows). That stored procedure begins a database transaction, moves the data from the holding table into its final destination, deletes the data from the holding table and commits the transaction.
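
A minimal sketch of that holding-table pattern follows; the names (dbo.StagingValues, dbo.ApplyStagedValues) are invented for illustration, and the schema-specific "move to final destination" step is left as a comment because the real target tables aren't known.

-- Holding table: rows accumulate here without touching the live data.
CREATE TABLE dbo.StagingValues
(
    PseudoTxnId UNIQUEIDENTIFIER NOT NULL,  -- application-generated pseudo-transaction id
    TargetRowId INT              NOT NULL,  -- identifies the destination row
    ColumnName  SYSNAME          NOT NULL,  -- which destination column this value is for
    ColumnValue SQL_VARIANT      NULL
);
GO

-- Final call: move the staged batch into its destination inside one transaction.
CREATE PROCEDURE dbo.ApplyStagedValues
    @PseudoTxnId UNIQUEIDENTIFIER
AS
BEGIN
    SET XACT_ABORT ON;
    BEGIN TRANSACTION;

    -- The actual move depends entirely on the destination schema, e.g. an
    -- UPDATE ... FROM dbo.StagingValues WHERE PseudoTxnId = @PseudoTxnId;
    -- it is omitted here because the target tables aren't known.

    DELETE FROM dbo.StagingValues
    WHERE PseudoTxnId = @PseudoTxnId;

    COMMIT TRANSACTION;
END;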

This approach removes the application/network round-trip time from the elapsed time within the database transaction. You'll still need to use an appropriate isolation level for the consumers of the data, but performance will be better.

APPROACHING THE PROBLEM FROM A DIFFERENT DIRECTION...
5000 values to insert into a single row (or set of related rows) is VERY unusual. Now, perhaps your application is just that unusual and it's exactly the right thing to do. Perhaps, however, there are ways of reorganising your data flow, or data storage architecture, that simply make the problem go away.

If you're receiving a steady flow of data, then batching it into 5k-item chunks is perhaps unnecessary.

Storage-wise, it may be better to store the 5k values as 5k rows in a table with columns such as: ParentEntity (a foreign key to the owning object), Name, Value. Now you just don't need a super-wide table (rarely a good idea, anyway) and you can send the data to the database in smaller chunks; a sketch of that layout is below.
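
For illustration only - assuming a parent table dbo.ParentEntity with an Id key, and a generic sql_variant value column; none of these names come from the post and a real design would more likely pick concrete value types:

-- Narrow name/value layout: one row per value instead of one column per value.
CREATE TABLE dbo.EntityValues
(
    ParentEntityId INT           NOT NULL,  -- FK to the owning object
    Name           NVARCHAR(128) NOT NULL,  -- which of the ~5000 logical values
    Value          SQL_VARIANT   NULL,
    CONSTRAINT PK_EntityValues PRIMARY KEY (ParentEntityId, Name),
    CONSTRAINT FK_EntityValues_Parent
        FOREIGN KEY (ParentEntityId) REFERENCES dbo.ParentEntity (Id)
);

-- The data can now arrive in ordinary, modestly sized insert batches:
INSERT INTO dbo.EntityValues (ParentEntityId, Name, Value)
VALUES (42, N'Reading0001', 98.6),
       (42, N'Reading0002', 99.1);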

There may be an intermediate data architecture - maybe your 5k values can meaningfully be grouped into subsets of related data which can (and perhaps, should) be sent to the database in groups.

HTH,
Chris
 
Comments
thatraja 17-Oct-13 6:27am    
Nice, 5!
You can use SQLite to perform this operation.
 
