Hi,

I am trying to read a web page with the code below, and it works. The problem is that the page has two sections, a main body and an answer pane. The code only returns the source of the main body, and the answer pane section contains the message "Your browser doesn't support iframes". However, if I open the page in a browser, click on the answer pane and view its source manually, the source shown is correct. In C#, how can I control the web page and focus on the answer pane before calling WebClient.DownloadString() so that I get the right source text?

Thank you for your reply and your time.

RQ

using System.Windows.Forms;
using System.Net;
...
System.Net.WebClient wc = new System.Net.WebClient();
string webData = wc.DownloadString("http://start.csail.mit.edu/answer.php?query=Who+is+the+41th+president+in+USA");

What I have tried:

I searched the internet and tried different code, but with no luck.
Posted
Updated 8-Jul-16 2:18am
Comments
Beginner Luck 7-Jul-16 2:23am    
Use HtmlAgilityPack. Just google HtmlAgilityPack.

Try this

C#
// Requires: using System.IO; using System.Net; using System.Text;
private static string ReadsourceCode(string url)
{
    string data = "";
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

    // The using blocks close the response and the reader even if an exception is thrown
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    {
        if (response.StatusCode == HttpStatusCode.OK)
        {
            // Fall back to UTF-8 when the server does not report a character set
            Encoding encoding = string.IsNullOrEmpty(response.CharacterSet)
                ? Encoding.UTF8
                : Encoding.GetEncoding(response.CharacterSet);

            using (Stream receiveStream = response.GetResponseStream())
            using (StreamReader readStream = new StreamReader(receiveStream, encoding))
            {
                data = readStream.ReadToEnd();
            }
        }
    }
    return data;
}

Call it like this:
C#
var source = ReadsourceCode("http://start.csail.mit.edu/answer.php?query=Who+is+the+41th+president+in+USA");
 
 
An iframe is displayed by your browser as a page within a page; it is a visual trick that makes two pages look like one. WebClient only downloads the outer page, so you'll have to parse the HTML you've downloaded to find the iframe and read its "src" attribute (you can do this with HtmlAgilityPack, plain text manipulation or a regex). You then need to issue a second request for the page named in src, just as you did for the initial page. If src is a relative URL you'll have to prepend the domain name to it (http://start.csail.mit.edu/justanswer.php?query=....).
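
The two-step approach described above could be sketched roughly as follows. This is only an illustration, not a tested solution for this particular site: the class and method names (IframeReader, FindIframeUrl, DownloadIframeSource) are made up for the example, and the regex assumes the iframe has a quoted src attribute, which the thread does not show. HtmlAgilityPack would cope better with messy markup.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are illustrative and not from the original thread.
static class IframeReader
{
    // Returns the absolute URL of the first <iframe> found in html, or null if there is none.
    public static string FindIframeUrl(string html, string pageUrl)
    {
        // Assumption: the src value is wrapped in single or double quotes.
        Match m = Regex.Match(
            html,
            @"<iframe\b[^>]*?\bsrc\s*=\s*[""']([^""']+)[""']",
            RegexOptions.IgnoreCase);
        if (!m.Success) return null;

        // Decode entities such as &amp; and resolve a relative src against the page URL.
        string src = WebUtility.HtmlDecode(m.Groups[1].Value);
        return new Uri(new Uri(pageUrl), src).AbsoluteUri;
    }

    // Downloads the outer page, then the page shown inside its iframe.
    // Falls back to the outer page's HTML when no iframe is found.
    public static string DownloadIframeSource(string pageUrl)
    {
        using (var wc = new WebClient())
        {
            string outerHtml = wc.DownloadString(pageUrl);
            string iframeUrl = FindIframeUrl(outerHtml, pageUrl);
            return iframeUrl == null ? outerHtml : wc.DownloadString(iframeUrl);
        }
    }
}
```

Calling IframeReader.DownloadIframeSource with the answer.php URL from the question would then return the HTML of the answer pane, provided the page really embeds it through an iframe as described. Using the Uri constructor to resolve the address avoids hard-coding the domain name.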
 
 
Comments
RQ yang 8-Jul-16 15:53pm    
Thank you, you are right!

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


