Hello!

I would like to load an HTML page and extract some information from it in a C# application.



I received the following error:


An unhandled exception of type 'System.NullReferenceException' occurred in CursorMove.exe

Additional information: Object reference not set to an instance of an object.


Could you please help?

What I have tried:

...
C#
HtmlAgilityPack.HtmlWeb web = new HtmlAgilityPack.HtmlWeb();
string url = "https://www.site.com";

// web.Load already returns a new document, so no separate instance is needed
HtmlAgilityPack.HtmlDocument doc = web.Load(url);

var nodes = doc.DocumentNode.SelectSingleNode("//span[@ class='a-size-medium']");
var inner = doc.DocumentNode.SelectNodes("//div[@ class='sc-list-item-content']");

foreach (HtmlAgilityPack.HtmlNode item44 in inner) // The error is on this line
{
  cnt_array[cnt] = item44.InnerText;
  textBox2.Text = cnt_array[5];
  cnt++;
}

...
Updated 14-Dec-19 1:03am
Comments
F-ES Sitecore 2-Jul-18 6:20am
   
There is probably an issue loading the URL you are giving it, or an issue with the HTML at that URL.
Kornfeld Eliyahu Peter 2-Jul-18 15:30pm
   
On which line does the error occur?
Member 12268183 3-Jul-18 9:17am
   
Here is my code. I have marked the line with the error. The application builds successfully, but the error occurs during execution:

var nodes = doc.DocumentNode.SelectSingleNode("//span[@ class='a-size-medium']");
var inner = doc.DocumentNode.SelectNodes("//div[@ class='sc-list-item-content']");

foreach (HtmlAgilityPack.HtmlNode item44 in inner) // The error is on this line
{
  cnt_array[cnt] = item44.InnerText;
  textBox2.Text = cnt_array[5];
  cnt++;
}
Kornfeld Eliyahu Peter 3-Jul-18 9:41am
   
Which means that 'inner' is null.
You would have seen that yourself if you had used the debugger...
There are two possible reasons:
1. You really have a space (' ') between '@' and 'class'
2. There is no div with the class you are looking for; check the content of the page
Mike V Baker 3-Jul-18 10:39am
   
From the docs:
Returns:
An HtmlAgilityPack.HtmlNodeCollection containing a collection of nodes matching the HtmlAgilityPack.HtmlNode.XPath query, ***or null if no node matched*** the XPath expression.

Have you verified that your SelectNodes call should return at least some nodes? Verified the syntax of the search? The case sensitivity of the search criteria?
Since the docs say it can return null, you should check if (inner != null) before you try to use it.
HTH, Mike
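
The null check suggested above can be sketched roughly as follows. This is only an illustration, not the asker's actual code: the helper name ExtractTexts and the sample markup are made up for the example, and the document is loaded from an inline string with LoadHtml so that no network access is involved.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

class NullCheckSketch
{
    // Hypothetical helper: returns the inner text of every node matching
    // the XPath, or an empty list when SelectNodes returns null (no match).
    static List<string> ExtractTexts(HtmlDocument doc, string xpath)
    {
        var results = new List<string>();
        HtmlNodeCollection matches = doc.DocumentNode.SelectNodes(xpath);

        if (matches == null)
        {
            // Nothing matched; return early instead of iterating over null.
            return results;
        }

        foreach (HtmlNode node in matches)
        {
            results.Add(node.InnerText);
        }
        return results;
    }

    static void Main()
    {
        var doc = new HtmlDocument();
        // Made-up sample markup, standing in for the downloaded page.
        doc.LoadHtml("<div class='sc-list-item-content'>First</div><p>Other</p>");

        // One query that matches and one that does not; neither throws.
        Console.WriteLine(ExtractTexts(doc, "//div[@class='sc-list-item-content']").Count);
        Console.WriteLine(ExtractTexts(doc, "//div[@class='missing']").Count);
    }
}
```

Returning an empty list keeps the calling loop simple, at the cost of hiding the difference between "page loaded but nothing matched" and "page was empty"; logging in the null branch is one way to keep that visible.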
Richard Deeming 3-Jul-18 11:11am
   
Good catch. That seems like a bad API design - you'd expect it to return an empty collection if there were no matches.
Member 12268183 4-Jul-18 5:08am
   
I see that inner is null. The same query works when I download the page to my computer and load it from there, but when I load the page directly from the web, inner is null. How can I resolve this problem?
Member 12268183 4-Jul-18 11:03am
   
I tried with another web page and everything is OK. Is there anything special about Amazon pages?
Richard Deeming 4-Jul-18 13:24pm
   
It sounds like the element is being created by JavaScript, and is not actually part of the source of the page sent back from the server.

You'll probably need to see whether Amazon offer an API to let you get the information without trying to "scrape" the page.
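
One way to test that theory is to run the same XPath against the raw markup the server sends, before any script has run. The sketch below assumes nothing about Amazon's real pages: the helper name ExistsInSource and both sample strings are invented for illustration.

```csharp
using System;
using HtmlAgilityPack;

class SourceCheckSketch
{
    // Hypothetical helper: true when at least one element in the given
    // markup matches the XPath. SelectNodes returns null when nothing matches.
    static bool ExistsInSource(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode.SelectNodes(xpath) != null;
    }

    static void Main()
    {
        string xpath = "//div[@class='sc-list-item-content']";

        // Made-up example of what a server might send: only an empty
        // container, which a script would fill in later inside a browser.
        string served = "<html><body><div id='root'></div></body></html>";

        // Made-up example of the same page after saving it from a browser,
        // where the script has already inserted the element.
        string saved = "<html><body><div id='root'>"
                     + "<div class='sc-list-item-content'>Item</div>"
                     + "</div></body></html>";

        Console.WriteLine(ExistsInSource(served, xpath));
        Console.WriteLine(ExistsInSource(saved, xpath));
    }
}
```

With a real page, the served markup could be taken from the document returned by HtmlWeb.Load(url), for example through doc.DocumentNode.OuterHtml. If the element shows up only in the copy saved from a browser, the content is most likely generated on the client side, which HtmlAgilityPack does not execute.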
Member 12268183 6-Jul-18 2:42am
   
Thank you!
How can I find out whether Amazon offers an API for getting this information?
Richard Deeming 6-Jul-18 6:16am
   
Ask Amazon.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



