I am trying to capture the following values from the HTML below:

--> DURACELL
--> 2345268
--> 5000394002906

HTML
<div id="productDescription">
    <ul>
        <li>
            <strong>Manufacturer:</strong>
            <a href="http://uk.Company.com/duracell">
                DURACELL
            </a>
        </li>
        <li>
            <strong>Order Code:</strong>
            2345268
        </li>
        <li>
            <strong>Manufacturer Part No</strong>
            5000394002906
        </li>
    </ul>
</div>


The code below gets the data, but all the formatting is still present (tabs, line breaks, etc.). I can see in HAPExplorer that the values can be captured on their own, so I know there must be a better solution than mine.

C#
IEnumerable<HtmlNode> liContent = document.DocumentNode.SelectNodes("//div[@id='productDescription']/ul/li");

foreach (HtmlNode l in liContent)
{
    Console.WriteLine("InnerText: " + l.InnerText);
}


Thanks.
Comments
ArunRajendra 24-Mar-15 23:43pm    
You might need to try Value or Text.

1 solution

Judging from the HTML you provided, it seems you want the last non-empty piece of inner text in each <li> element.
In that case, you can walk the child nodes of each <li> backwards and stop at the first one whose trimmed text is not empty:
C#
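foreach (HtmlNode l in liContent)
{
    // Walk the children from last to first so the value after the label is found first.
    for (int i = l.ChildNodes.Count - 1; i >= 0; i--)
    {
        // Trim() strips the tabs and line breaks that surround the value.
        string lastInnerText = l.ChildNodes[i].InnerText.Trim();
        if (!string.IsNullOrEmpty(lastInnerText))
        {
            Console.WriteLine("InnerText: " + lastInnerText);
            break; // Only the last non-empty child is wanted.
        }
    }
}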
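
The same idea can also be written with LINQ over the text nodes of each <li>. The following is only a sketch, not part of the original answer: the class name ProductDescriptionParser and the method Extract are hypothetical, and it assumes that each <li> holds a label (inside <strong>) followed by a value, as in the HTML from the question. It returns label/value pairs instead of printing them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class, introduced only for this illustration.
public static class ProductDescriptionParser
{
    // Returns one (label, value) pair per <li> under div#productDescription.
    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        var result = new List<KeyValuePair<string, string>>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection items = document.DocumentNode.SelectNodes(
            "//div[@id='productDescription']/ul/li");
        if (items == null)
        {
            return result;
        }

        foreach (HtmlNode li in items)
        {
            // Collect every text node below the <li>, trimmed, skipping
            // the whitespace-only nodes produced by the indentation.
            List<string> texts = li.Descendants()
                .Where(n => n.NodeType == HtmlNodeType.Text)
                .Select(n => HtmlEntity.DeEntitize(n.InnerText).Trim())
                .Where(t => t.Length > 0)
                .ToList();

            // Assumption: first text is the label, last text is the value.
            if (texts.Count < 2)
            {
                continue;
            }

            string label = texts[0].TrimEnd(':');
            string value = texts[texts.Count - 1];
            result.Add(new KeyValuePair<string, string>(label, value));
        }

        return result;
    }
}
```

Called with the HTML from the question, this should yield the pairs Manufacturer / DURACELL, Order Code / 2345268 and Manufacturer Part No / 5000394002906. Because it looks at all descendant text nodes, it also picks up the manufacturer name, which sits inside the <a> element.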
 
Comments
Andrw_S 25-Mar-15 8:06am    
Thank you! Works a treat.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


