I want to do screen scraping. Currently I am using WebClient in C# to get the page source of a web page, but the problem is that I need to press some buttons in order to get the proper data. I can't use Selenium because it drives a web browser such as Firefox, which shows a visible interface to the end user.
The major problem is that I want to hide the activity performed by Selenium, or any other third-party component, from the end user during screen scraping.
Can you suggest an approach?
Comments
Marco Bertschi 18-Feb-13 8:54am    
Why do you want to do this? Why should the user not know that his screen is being scraped?
Sandeep Mewara 18-Feb-13 9:26am    
"Why should the user not know that his screen is being scraped?" - Exactly.

Hiding things from the user puts an application in the suspect category.
Manfred Rudolf Bihy 18-Feb-13 10:22am    
I don't think there is anything suspicious going on. I think OP wants Web Page Scraping, which OP tried via Selenium, but with the drawback of having to deal with its user interface. :)
Sandeep Mewara 18-Feb-13 10:26am    
Yep, possible. It was just a comment and nothing else. :)

A second opinion at times helps. Thanks.
Sergey Alexandrovich Kryukov 18-Feb-13 11:29am    
Of course, there is nothing suspect. This would be just a crime.
—SA

1 solution

Since you mentioned Selenium and WebClient, I'll assume you mean web page scraping rather than screen scraping (note that there is only one "p" in that word). Selenium can do that, but, as you noticed, it comes with a visible user interface. Since you have not stated your ultimate goal, I can't tell whether you really need Selenium. Web page scraping can be done quite easily in code with the Html Agility Pack. It is a free and solid library that I have used myself, and quite a few of our members use it as well.
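As a rough illustration of the Html Agility Pack approach, here is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced. The markup string and the class name AgilityPackSketch are made up for the example; in practice the HTML would be the page source you already download with WebClient.

```csharp
using System;
using HtmlAgilityPack;

static class AgilityPackSketch
{
    static void Main()
    {
        // Hypothetical markup standing in for a downloaded page source.
        const string html =
            "<html><body><div id='search'>" +
            "<input type='text' name='q' value='scraping' />" +
            "<input type='submit' name='go' value='Search' />" +
            "</div></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection inputs = doc.DocumentNode.SelectNodes("//input");
        if (inputs == null)
        {
            Console.WriteLine("No input elements found.");
            return;
        }

        foreach (HtmlNode input in inputs)
        {
            // GetAttributeValue falls back to the given default if the attribute is missing.
            Console.WriteLine("{0} = {1}",
                input.GetAttributeValue("name", ""),
                input.GetAttributeValue("value", ""));
        }
    }
}
```

Note that the library only parses the markup it is given. It does not run JavaScript and cannot press buttons, so it fits pages whose data is already present in the downloaded source.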

If the pages you are scraping rely on JavaScript to produce the data, you'll probably need a hidden WebBrowser control to load the page fully in the background, and then operate on the content once it has finished loading.
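The hidden-control idea might look roughly like the sketch below. It assumes a console application that references System.Windows.Forms; the URL and the class name HiddenBrowserSketch are placeholders. The control is never added to a form, so nothing is shown to the user, but it still needs an STA thread with a message loop.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

static class HiddenBrowserSketch
{
    static void Main()
    {
        // The WebBrowser control requires a single-threaded apartment.
        var thread = new Thread(() =>
        {
            using (var browser = new WebBrowser { ScriptErrorsSuppressed = true })
            {
                browser.DocumentCompleted += (sender, e) =>
                {
                    // DocumentCompleted can fire once per frame; wait for the whole page.
                    if (browser.ReadyState != WebBrowserReadyState.Complete)
                        return;

                    // At this point the DOM can be inspected or elements clicked.
                    Console.WriteLine("Loaded {0} characters of HTML.",
                        browser.DocumentText.Length);

                    // Stop the message loop started by Application.Run below.
                    Application.ExitThread();
                };

                browser.Navigate("http://example.com/"); // placeholder URL
                Application.Run(); // pump messages so the control can load the page
            }
        });
        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
        thread.Join();
    }
}
```

Content that a script adds after the load event may not be present yet when DocumentCompleted fires, so a real scraper would probably need some extra waiting or polling at that point.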

Regards,

— Manfred
 
Comments
Sandeep Mewara 18-Feb-13 11:15am    
My 5 for the answer and probably understanding the question correctly. :)
Manfred Rudolf Bihy 18-Feb-13 11:18am    
Thank you Sandeep!
Marco Bertschi 18-Feb-13 11:59am    
My 5 for providing a useful answer and showing a workaround on the "Hide me from user"-thing.
Sm.Abdullah 19-Feb-13 13:34pm    
Manfred R. Bihy!
Thanks, Manfred, for your reply.
I used a hidden browser control too, but I found some strange behavior.
Please take a look at this rough piece of code.
// This works fine for me.
HtmlElementCollection inputs = browser.Document.GetElementsByTagName("input");
inputs[0].InvokeMember("click"); // assume index 0 is the desired input field
// The same code fails against a div.
HtmlElementCollection divs = browser.Document.GetElementsByTagName("div");
divs[0].InvokeMember("click"); // assume index 0 is the desired div element
Nothing happens.
If I click the same div with the mouse, it shows a popup lightbox.
I am sure that I am calling InvokeMember on the right div.
Can you suggest what is going wrong?
Manfred Rudolf Bihy 19-Feb-13 13:59pm    
Are you sure your div is the first one in the collection returned by GetElementsByTagName? How would you know?
Just being curious. ;)
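
One way to answer that question is to pick the div by a distinguishing attribute instead of by position. The following is only a sketch: the names DivClickSketch, FindDivByClass, and ClickDiv are hypothetical helpers, and the class name passed in would have to match the real page.

```csharp
using System;
using System.Windows.Forms;

static class DivClickSketch
{
    // Returns the first <div> whose class list contains className, or null if none does.
    public static HtmlElement FindDivByClass(HtmlDocument document, string className)
    {
        foreach (HtmlElement div in document.GetElementsByTagName("div"))
        {
            // The IE-based control exposes the class attribute under the name "className".
            string classes = div.GetAttribute("className");
            if (!string.IsNullOrEmpty(classes) &&
                Array.IndexOf(classes.Split(' '), className) >= 0)
            {
                return div;
            }
        }
        return null;
    }

    // Clicks the matching div; returns false if the page or the div is not available.
    public static bool ClickDiv(WebBrowser browser, string className)
    {
        if (browser.Document == null)
            return false;

        HtmlElement target = FindDivByClass(browser.Document, className);
        if (target == null)
            return false;

        // Log what is about to be clicked so the selection can be verified.
        Console.WriteLine("Clicking div with id '{0}'.", target.Id);
        target.InvokeMember("click");
        return true;
    }
}
```

If the div has an id, browser.Document.GetElementById is simpler still. Whether InvokeMember("click") opens the lightbox also depends on how the page attaches its handler, for example to a mouse event other than click or to a parent element, so a correct selection alone may not be enough.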

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)