
HtmlDocument Introspection in Treeview

8 Feb 2009, CPOL, 2 min read
HtmlDocument introspection in a TreeView, showing the HTML, forms, links, images, and CSS of a web page

HtmlIntrospection

Introduction

After my article XML Introspection and TreeView, I took a look at the WebBrowser component and discovered that it exposes the loaded page as an HtmlDocument (webBrowser1.Document). This is a good way to get information about a web page without parsing the HTML yourself: the WebBrowser component does it for you.

Background 

Here I want to present a small application that displays a web page (the CodeProject page in the screenshot) and extracts information from its HtmlDocument (a tree of HtmlElement objects).

The elements are shown in a TreeView, with a preview at the top right and the properties of the selected element in a PropertyGrid at the bottom right.

Using the code 

Enter a URL in the text box and press the Go button.

When the web page has loaded, the event handler webBrowser1_DocumentCompleted is called.

In this handler, we collect all the HTML tags of the body, the forms, the links, the images, and the CSS.
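
As a rough illustration of this step, here is a minimal sketch of a form that waits for DocumentCompleted and then lists forms, links, and images in a plain TreeView. It is not the code of the application: the class IntrospectionForm, the helper AddGroup, and the URL are hypothetical, and plain TreeNode objects stand in for TreeNodeHtmlElm. Only the WinForms calls (WebBrowser, HtmlDocument.Forms, Links, Images, HtmlElement.GetAttribute) are real API.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical minimal form, for illustration only.
public class IntrospectionForm : Form
{
    private readonly WebBrowser webBrowser1 = new WebBrowser { Dock = DockStyle.Fill };
    private readonly TreeView treeView1 = new TreeView { Dock = DockStyle.Left, Width = 300 };

    public IntrospectionForm(string url)
    {
        Controls.Add(webBrowser1);
        Controls.Add(treeView1);
        webBrowser1.DocumentCompleted += webBrowser1_DocumentCompleted;
        webBrowser1.Navigate(url);
    }

    private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // The event can fire once per frame; only handle the top-level document.
        if (e.Url != webBrowser1.Url) return;

        HtmlDocument doc = webBrowser1.Document;
        if (doc == null) return;

        treeView1.Nodes.Clear();
        AddGroup("Forms", doc.Forms, null);
        AddGroup("Links", doc.Links, "href");
        AddGroup("Images", doc.Images, "src");
    }

    // Adds one top-level node per category and one child node per element.
    private void AddGroup(string title, HtmlElementCollection elements, string attribute)
    {
        TreeNode group = treeView1.Nodes.Add(title);
        foreach (HtmlElement elm in elements)
        {
            group.Nodes.Add(attribute == null ? elm.TagName : elm.GetAttribute(attribute));
        }
    }

    [STAThread]
    public static void Main()
    {
        // Placeholder URL; replace it with the page to inspect.
        Application.Run(new IntrospectionForm("https://example.com/"));
    }
}
```

The frame check matters because a page with frames raises DocumentCompleted several times, and without it the tree would be rebuilt for each frame.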

There is a method for each type:
private void FillTree(HtmlElement hElmFather, TreeNodeHtmlElm t, TreeNodeHtmlElm.TypeNode type)

private void FillTreeForm(HtmlDocument doc, TreeNodeHtmlElm t)
{
    // Add every form of the document to the tree
    foreach (HtmlElement form in doc.Forms)
    {
        FillTree(form, t, TreeNodeHtmlElm.TypeNode.Form);
    }
}

private void FillTreeLink(HtmlDocument doc, TreeNodeHtmlElm t)
// To find all links: string textToAdd = e.GetAttribute("href"); where e is an HtmlElement

private void FillTreeImage(HtmlDocument doc, TreeNodeHtmlElm t)
// To find all images: string textToAdd = e.GetAttribute("src");

Each time, a temporary array is used so that the same image or link is not added twice.
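
The duplicate check can be sketched on its own, independent of WinForms. The version below is an assumption about how it could be done, not the code of the application: it takes the href or src values as plain strings and uses a HashSet instead of a temporary array. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: keeps each URL once, in first-seen order.
public static class UrlDeduplicator
{
    public static List<string> Distinct(IEnumerable<string> urls)
    {
        var seen = new HashSet<string>(StringComparer.Ordinal);
        var result = new List<string>();
        foreach (string url in urls)
        {
            // Skip elements without an href/src value.
            if (string.IsNullOrEmpty(url)) continue;
            // HashSet.Add returns false when the value is already present.
            if (seen.Add(url)) result.Add(url);
        }
        return result;
    }
}
```

A set makes each lookup constant time, which helps on pages with many links. Searching an array for every element works too, but gets slower as the page grows.
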
private void FillTreeCss(HtmlDocument doc, TreeNodeHtmlElm t)

For the CSS, the test is:

if (e.TagName.ToLower() == "link")
{
    if (e.GetAttribute("rel").ToLower() == "stylesheet")
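
The same test can be written as a small predicate on the two strings involved. This is only a sketch: the names StylesheetTest and IsStylesheetLink are hypothetical, and the caller is assumed to pass e.TagName and e.GetAttribute("rel").

```csharp
using System;

// Hypothetical helper mirroring the tag/rel test above.
public static class StylesheetTest
{
    public static bool IsStylesheetLink(string tagName, string rel)
    {
        // Case-insensitive comparison replaces the ToLower() calls
        // and also tolerates null values.
        return string.Equals(tagName, "link", StringComparison.OrdinalIgnoreCase)
            && string.Equals(rel, "stylesheet", StringComparison.OrdinalIgnoreCase);
    }
}
```

Like the original test, this only matches a rel value that is exactly "stylesheet". A value with several tokens, such as "alternate stylesheet", is not matched.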

The information is thus structured in a TreeView, where each node is an instance of the class TreeNodeHtmlElm : TreeNode.

Points of Interest 

I found it interesting to explore a web page this way; it is a different way of looking at one.

I had a problem with the TreeView: when the text of a node is too long, the application becomes really slow when tooltips appear, so I limit the text to 100 characters:

public TreeNodeHtmlElm(HtmlElement elm, TypeNode t) : base()
{
    type = t;
    mHtmlElement = elm;
    try
    {
        if (string.IsNullOrEmpty(elm.OuterText))
        {
            // No visible text: fall back to the raw HTML of the element
            Text = elm.OuterHtml;
        }
        else if (elm.OuterText.Length > 100)
        {
            // Limit the node text to 100 characters to keep tooltips fast
            Text = elm.OuterText.Substring(0, 100);
        }
        else
        {
            Text = elm.OuterText;
        }
    }
    catch (Exception)
    {
        Text = "";
    }
}
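
The label logic can also be isolated in a small function, which makes it easy to check without a browser. The sketch below is a variation, not the constructor above: the names NodeLabel and Make are hypothetical, and unlike the constructor it also truncates the OuterHtml fallback, since that string can be just as long.

```csharp
using System;

// Hypothetical helper: picks the node text and caps its length.
public static class NodeLabel
{
    // maxLength is assumed to be non-negative.
    public static string Make(string outerText, string outerHtml, int maxLength)
    {
        // Prefer the visible text, fall back to the raw HTML.
        string source = string.IsNullOrEmpty(outerText) ? outerHtml : outerText;
        if (string.IsNullOrEmpty(source)) return "";
        return source.Length > maxLength ? source.Substring(0, maxLength) : source;
    }
}
```

In the constructor, this would be called as NodeLabel.Make(elm.OuterText, elm.OuterHtml, 100).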

If you click on a tree node, the application shows a preview of that piece of HTML in the window at the top right.

You can also right-click to open a context menu and save (SaveTreeNodeHtml) the text of the subnodes.

This does not work for images: only the URL of the image is saved, not the image itself. It could be interesting in another version to download and save the image, and to do the same for the CSS.

Please take a look at my other pages:

http://www.cmb-soft.com/, a CSS editor

My homepage: http://vidalcharles.free.fr/

I'm looking for a job. If anybody has a job offer, please email me at charles.vidal(at)gmail.com. Thanks.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior) http://www.cmb-soft.com/
France

