Hi. I have a C# WinForms app that scrapes data from a website. The extracted data is held temporarily in a DataGridView control. The app needs to run once a day, so when it starts scraping, it checks for duplicates and adds only the data that is not already in the DataGridView.

Now my question is: how can I speed up the process of comparing the current string with all the other strings?

Any ideas, hints, etc.?

Thanks, Igor
Comments
BillWoodruff 23-Aug-13 21:48pm    
The assumption I see you making here is that the format of the web page you scrape will not change: a dangerous assumption unless it's a website you control.

Different strategies might be used here, depending on how reliably you know which fields will not change and which fields you expect to change frequently.

I don't know for certain, but comparing the strings binary (ordinal) might be faster; maybe the next commenters can help you with this.
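
To illustrate the "binary comparison" idea: in .NET this corresponds to ordinal comparison, which compares character codes without applying culture rules. Note that `==` and `string.Equals(a, b)` on strings are already ordinal, so the gain mainly applies if culture-sensitive comparisons are in use. The sketch below is only an illustration; the class name `OrdinalDemo` and its method names are made up for this example.

```csharp
using System;

static class OrdinalDemo
{
    // Ordinal: compares the UTF-16 code units directly, no culture rules.
    public static bool SameOrdinal(string a, string b) =>
        string.Equals(a, b, StringComparison.Ordinal);

    // Ordinal comparison that also treats upper and lower case as equal.
    public static bool SameIgnoreCase(string a, string b) =>
        string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
}
```

Whether "BMW X6" and "bmw x6" should count as the same model is a decision for the scraper; the second method is only needed if they should.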
 
Use a HashSet<string>.

If a duplicate is found, it won't be added.

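using System;
using System.Collections.Generic;

HashSet<string> deduped = new HashSet<string>();

deduped.Add("Test");
deduped.Add("blah");
deduped.Add("Test"); // duplicate: Add returns false and the set is unchanged

Console.WriteLine(deduped.Count); // outputs 2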
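
Because `HashSet<string>.Add` returns `false` when the item is already present, the return value can decide whether a scraped item deserves a new row. A minimal sketch of that idea, assuming the scraped items arrive as plain strings; `DedupDemo` and `KeepNew` are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;

static class DedupDemo
{
    // Returns only the items not seen before, in the order they arrived.
    // 'seen' is updated as a side effect so it can be reused on the next run.
    public static List<string> KeepNew(HashSet<string> seen, IEnumerable<string> scraped)
    {
        var fresh = new List<string>();
        foreach (string item in scraped)
        {
            // Add returns false (and leaves the set unchanged) for a duplicate.
            if (seen.Add(item))
                fresh.Add(item);
        }
        return fresh;
    }
}
```

Each lookup is a hash lookup rather than a scan over every stored string, which is where the speed-up over comparing against all rows comes from.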
 
Actually, I don't keep my data in a separate data structure; it is only held temporarily in the DataGridView.
The DataGridView looks like this:

Model, Location, ...., ...., .... and other columns.

Now, if I have a current model, I'd like to compare its name with all the rows in the DataGridView that have the same name. If the model is "bmw x6", find all "bmw x6" rows in the DataGridView and compare the current one with each of them. If a row has the same name but a different location, add the unique model as a new row. That is my question about the comparing: I don't want to use a for loop that goes through all the rows. I'd like to know if there is a quick method to traverse the grid view.
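
One way to avoid scanning the grid for every "bmw x6" is to keep a small index next to it, keyed by model name, that remembers which locations are already shown. The sketch below is an assumption about how that could look, not something from the original app: `ModelIndex` and `TryAdd` are hypothetical names, and the case-insensitive comparer is a guess at the desired matching rule.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical index: model name -> set of locations already in the grid.
class ModelIndex
{
    private readonly Dictionary<string, HashSet<string>> _locationsByModel =
        new Dictionary<string, HashSet<string>>(StringComparer.OrdinalIgnoreCase);

    // Returns true when the (model, location) pair is new, meaning the
    // caller should add a row; returns false for a pair already recorded.
    public bool TryAdd(string model, string location)
    {
        if (!_locationsByModel.TryGetValue(model, out HashSet<string> locations))
        {
            // First time this model is seen: start an empty location set.
            locations = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
            _locationsByModel[model] = locations;
        }
        return locations.Add(location);
    }
}
```

With this in place, the check for a scraped item is two hash lookups (model, then location) instead of a pass over all rows. The remaining columns would need to be folded into the key if they also matter for uniqueness.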
 
Comments
Expert Coming 26-Aug-13 18:17pm    
Traversing the GridView is going to be a loop regardless. I would suggest using my solution above while adding the items, rather than adding all the items and cleaning up afterward.
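
If the grid already contains rows when the daily run starts, that one unavoidable loop can be spent seeding the set, so every later check happens while adding. A rough sketch under stated assumptions: each `string[]` stands in for one grid row with Model in column 0 and Location in column 1, and `GridSeeding`, `Key`, and `Seed` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

static class GridSeeding
{
    // Combines two cells into one key. Assumes the unit separator
    // character '\u001F' never appears in the scraped text.
    public static string Key(string model, string location) =>
        model + "\u001F" + location;

    // The single pass over the existing rows. In WinForms the cell values
    // would come from row.Cells[i].Value instead of a string array.
    public static HashSet<string> Seed(IEnumerable<string[]> existingRows)
    {
        var seen = new HashSet<string>(StringComparer.Ordinal);
        foreach (string[] cells in existingRows)
            seen.Add(Key(cells[0], cells[1]));
        return seen;
    }
}
```

After seeding, each scraped item can be tested with `seen.Add(GridSeeding.Key(model, location))`, and a row is added to the grid only when that call returns `true`.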

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


