I want to replace all the HTML tags in a string with a space. What regular expression should I use for that?

Input:
HTML
<td align="right"><br></td></tr><tr><td colspan="2">&nbsp;</td></tr><tr><td colspan="2">Hi [FNm] [LNm]:<br><br>Thanks for downloading.

Output:
Hi [FNm] [LNm]: Thanks for downloading.

Please help me design a regular expression for this.
C#
string input = "<td align=\"right\"><br></td></tr><tr><td colspan=\"2\">&nbsp;</td></tr><tr><td colspan=\"2\">Hi [FNm] [LNm]:<br><br>Thanks for downloading.";

string atr = Regex.Replace(input, [what regex pattern should I use], " ");


I also want to remove '\n', '\r', and all the other keywords that are used with HTML.
Updated 1-Apr-14 18:45pm
Comments
Ami_Modi 1-Apr-14 7:28am    
Thanks, Andre, for editing my question. But can you please help me with a solution?

 
Try this:
C#
string atr = Regex.Replace(input, "</?.+?>", " ");

This replaces every tag (start tags and end tags) with a space.
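As a rough sketch of how this could produce the output asked for in the question, the tag replacement can be followed by entity decoding and whitespace cleanup. The class and method names (`HtmlText`, `StripTags`) are hypothetical, and using `WebUtility.HtmlDecode` for `&nbsp;` is my assumption, not something stated in the question or the answer.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class HtmlText
{
    public static string StripTags(string html)
    {
        // 1. Replace every start or end tag with a space (pattern from the answer).
        string text = Regex.Replace(html, "</?.+?>", " ");

        // 2. Decode entities such as &nbsp; into the characters they stand for
        //    (assumption: decoding is what "remove HTML keywords" means here).
        text = WebUtility.HtmlDecode(text);

        // 3. Turn non-breaking spaces into plain spaces, then collapse every run
        //    of whitespace (including '\r' and '\n') into a single space.
        text = Regex.Replace(text.Replace('\u00A0', ' '), @"\s+", " ");

        // 4. Drop the leading and trailing space left behind by the outer tags.
        return text.Trim();
    }
}
```

Calling `HtmlText.StripTags(input)` on the string from the question should give `Hi [FNm] [LNm]: Thanks for downloading.`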
 
Comments
Ami_Modi 1-Apr-14 7:45am    
That is not what happens. In fact, it is placing a space between every character.
Thomas Daniels 1-Apr-14 7:47am    
Yes, I'm sorry. I updated my answer. The answer editor treated part of the pattern as an HTML tag, so it was not displayed. Now it is displayed. Try it again.
Ami_Modi 1-Apr-14 8:01am    
Thank you very much. But my question is: are there any exceptions to this regex?
Thomas Daniels 1-Apr-14 8:09am    
No, it removes all tags. Note that the HTML must be well-formed, otherwise it wouldn't work.
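To make the well-formedness caveat concrete, here is a small sketch, under my own assumptions, of two cases where a tag-stripping regex can surprise you: a tag that spans a line break, and a stray `<` in plain text. The class name `TagPatterns` and the negated-character-class pattern `<[^>]*>` are brought in here only for comparison.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names are illustrative only.
public static class TagPatterns
{
    // Pattern from the answer: lazy "any character" between < and >.
    public const string Lazy = "</?.+?>";

    // Alternative for comparison: anything except '>' between < and >.
    public const string Negated = "<[^>]*>";

    // Replaces every match of the given pattern with a single space.
    public static string Strip(string html, string pattern,
                               RegexOptions options = RegexOptions.None)
    {
        return Regex.Replace(html, pattern, " ", options);
    }
}
```

With `"<a\nhref=\"x\">link</a>"`, `Strip(s, TagPatterns.Lazy)` leaves the opening tag in place because `.` does not match `\n` by default; passing `RegexOptions.Singleline`, or using `TagPatterns.Negated`, removes both tags. With text that is not really markup, such as `"1 < 2 and 3 > 2"`, both patterns treat `< 2 and 3 >` as a tag and remove it, so a regex is only a reasonable fit when the input is known to be simple.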
I got one more answer. Anyone referring to this question can also try this:

"<[^>]*>"
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


