Hello Everyone,
I have an HTML file, and I want to select only the contents of its body element (<body>#information</body>) and write them to another file, without using any third-party DLL.

Could anyone please provide a solution?

Thanks,
Samadhan
Updated 21-Nov-14 3:19am
Comments
Thanks7872 21-Nov-14 8:34am    
What have you tried so far?
samadhan_kshirsagar 21-Nov-14 8:42am    
I have tried using the HtmlAgilityPack, but I want to do it without any DLL.
Thanks7872 21-Nov-14 9:00am    
Have you tried anything without a DLL? If not, why? If yes, where is the code? What are the issues?
samadhan_kshirsagar 21-Nov-14 9:10am    
I have tried using a regex (regular expression) to replace content, but to select the <body> tag I have used the HtmlAgilityPack DLL.
Maciej Los 21-Nov-14 11:03am    
What exactly do you want to 'extract'?

Please, read my comment to the question.

Here is a very interesting class: Convert HTML to Text
Comments
BillWoodruff 21-Nov-14 12:41pm    
+5 That's a very good resource to know about.
Maciej Los 21-Nov-14 14:40pm    
Thank you, Bill ;)
Manas Bhardwaj 21-Nov-14 14:51pm    
Yes, +5!
Maciej Los 21-Nov-14 14:52pm    
Thank you, Manas ;)
Hi,

You could extract the body content like this (the method returns an empty string when no body element is found):

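C#
using System;  // for StringComparison

string ExtractBody(string html)
{
    // Find the opening tag case-insensitively; it may carry attributes, e.g. <body class="x">.
    int tagStart = html.IndexOf("<body", StringComparison.OrdinalIgnoreCase);
    int bodyBegin = tagStart < 0 ? -1 : html.IndexOf('>', tagStart) + 1;
    int bodyEnd = html.LastIndexOf("</body>", StringComparison.OrdinalIgnoreCase);

    // Either tag is missing, or the tags are out of order.
    if (bodyBegin <= 0 || bodyEnd < bodyBegin)
        return string.Empty;

    return html.Substring(bodyBegin, bodyEnd - bodyBegin);
}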
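
The question also asks for the body to be put into another file. A minimal sketch of that round trip follows, using only the .NET base class library. The class name BodyFileCopy and the helpers TryExtractBody and CopyBody are hypothetical names chosen for illustration, and the extraction assumes a single body element whose opening tag contains no '>' inside an attribute value.

```csharp
using System;
using System.IO;

static class BodyFileCopy
{
    // Hypothetical helper: finds the text between <body ...> and the last </body>.
    // Returns false (and an empty string) when either tag is missing.
    public static bool TryExtractBody(string html, out string body)
    {
        body = string.Empty;

        int tagStart = html.IndexOf("<body", StringComparison.OrdinalIgnoreCase);
        if (tagStart < 0) return false;

        // Skip to the end of the opening tag so attributes are tolerated.
        int tagEnd = html.IndexOf('>', tagStart);
        if (tagEnd < 0) return false;

        int bodyBegin = tagEnd + 1;
        int bodyEnd = html.LastIndexOf("</body>", StringComparison.OrdinalIgnoreCase);
        if (bodyEnd < bodyBegin) return false;

        body = html.Substring(bodyBegin, bodyEnd - bodyBegin);
        return true;
    }

    // Hypothetical helper: reads sourcePath, extracts the body, writes it to targetPath.
    // Nothing is written when no body element is found.
    public static bool CopyBody(string sourcePath, string targetPath)
    {
        string html = File.ReadAllText(sourcePath);
        if (!TryExtractBody(html, out string body)) return false;

        File.WriteAllText(targetPath, body);
        return true;
    }
}
```

A call such as BodyFileCopy.CopyBody("input.html", "output.html") would then write only the body content to the second file; both file names are placeholders.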


Happy coding,
Stephan
 
Share this answer
 
v2
Use a regex:
<body>.*</body>
Should do it.
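
One point worth checking when trying this pattern from C#: in .NET, '.' does not match a newline unless RegexOptions.Singleline is set, so the pattern as written only matches when the whole body sits on one line. The sketch below is one way to apply it under that assumption. The class name BodyRegex and the group name "content" are illustrative, and the pattern is loosened slightly to accept attributes on the opening tag and upper-case tag names. It assumes a single body element and is not a general HTML parser.

```csharp
using System;
using System.Text.RegularExpressions;

static class BodyRegex
{
    // Singleline lets '.' match newlines; IgnoreCase accepts <BODY>.
    // [^>]* tolerates attributes on the opening tag.
    // The greedy .* runs up to the last </body> in the input.
    private static readonly Regex BodyPattern = new Regex(
        @"<body\b[^>]*>(?<content>.*)</body\s*>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Returns the text inside the body element, or an empty string if none is found.
    public static string ExtractBody(string html)
    {
        Match match = BodyPattern.Match(html);
        return match.Success ? match.Groups["content"].Value : string.Empty;
    }
}
```

Unlike the bare pattern, this returns only the inner content rather than the surrounding <body> tags, which is usually what is wanted when copying the content into another file.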
 
Share this answer
 
Comments
TheRealSteveJudge 21-Nov-14 9:50am    
Using Regex will definitely not work.
HTML is not a regular language.

Why don't you want to use a library like HTML agility pack?
BillWoodruff 21-Nov-14 10:06am    
Have you personally tried the regex suggested by OriginalGriff on a string containing the text of a properly formatted HTML page? If not, why are you claiming it will not work?
Maciej Los 21-Nov-14 11:30am    
HTML is not a regular language. Are we talking about the same language?
TheRealSteveJudge 21-Nov-14 10:20am    
I tried it, although I knew it would fail...
OriginalGriff 21-Nov-14 10:26am    
Strangely, I tried as well before I posted this - with the source of a Codeproject page, in Expresso.

To my complete lack of surprise, it worked perfectly...

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)