Hi,

I have a small issue with this code:

XML
<ul>
    <li><u><em><strong>Hi    </strong></em></u></li>
    <li><u><em><strong>Hello    </strong></em></u></li>
    <li><u><em><strong>How r u.      </li>
</ul>




In this code, the third list item has no closing tags (</strong></em></u>).

What I want is to remove the opening tags (<u><em><strong>) from the third list item.

Thanks in advance.
Updated 20-Jan-11 23:34pm
Comments
Sandeep Mewara 21-Jan-11 4:24am    
Ok. So what have you tried so far?

OK, this is a bit of algorithmic fun. I'll tell you how I'd go about it, but you'll have to figure the code out yourself.

You need to combine a forward-reading parser with a recursive object or function call. On every opening tag, go one level deeper into the recursion, storing the text from that point forward in your new object (or stack parameter). Terminate each level of recursion on reading any closing tag. If the closing tag matches the starting tag, include both tags in the response; otherwise return just the text between the tags, followed by the closing tag, so that the enclosing level can deal with it.

This way you get either the text with matching tags on both sides or the text on its own, which is what you asked for: the starting tag is removed wherever there is no closing tag. Hope that makes some sort of sense...
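A minimal sketch of the approach described above, in C#, using an explicit stack in place of recursion. It is not the answerer's code (none was given). The names UnclosedTagStripper, Strip, Frame, IsOpen and TagPattern are hypothetical, and the regex-based tag scanner is an assumption that ignores comments, void elements such as <br>, and attribute values containing ">".

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper: drops opening tags that never get a matching closing tag.
public static class UnclosedTagStripper
{
    // One open element: its opening tag, its name, and the content collected so far.
    private sealed class Frame
    {
        public string OpenTag = "";
        public string Name = "";
        public StringBuilder Content = new StringBuilder();
    }

    // Assumed tag shape: group 1 is "/" for a closing tag, group 2 is the tag name.
    private static readonly Regex TagPattern =
        new Regex(@"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>");

    public static string Strip(string html)
    {
        var stack = new Stack<Frame>();
        stack.Push(new Frame()); // root frame collects the final output
        int pos = 0;

        foreach (Match m in TagPattern.Matches(html))
        {
            // Text before this tag belongs to the innermost open element.
            stack.Peek().Content.Append(html, pos, m.Index - pos);
            pos = m.Index + m.Length;
            string name = m.Groups[2].Value.ToLowerInvariant();

            if (m.Groups[1].Value.Length == 0)
            {
                // Opening tag: go one level deeper.
                stack.Push(new Frame { OpenTag = m.Value, Name = name });
                continue;
            }

            if (!IsOpen(stack, name))
            {
                // Closing tag with no open partner: left in the output as-is.
                stack.Peek().Content.Append(m.Value);
                continue;
            }

            // Unwind until the matching opening tag is found.
            while (true)
            {
                Frame frame = stack.Pop();
                if (frame.Name == name)
                {
                    // Matched pair: keep both tags around the content.
                    stack.Peek().Content.Append(frame.OpenTag)
                         .Append(frame.Content.ToString()).Append(m.Value);
                    break;
                }
                // Unclosed element: keep its content, drop its opening tag.
                stack.Peek().Content.Append(frame.Content.ToString());
            }
        }

        // Trailing text, then anything still open at the end of the input.
        stack.Peek().Content.Append(html, pos, html.Length - pos);
        while (stack.Count > 1)
        {
            Frame frame = stack.Pop();
            stack.Peek().Content.Append(frame.Content.ToString());
        }
        return stack.Pop().Content.ToString();
    }

    private static bool IsOpen(Stack<Frame> stack, string name)
    {
        foreach (Frame frame in stack)
        {
            if (frame.Name == name) return true;
        }
        return false;
    }
}
```

For the markup in the question, Strip would leave the first two list items untouched and return the third as <li>How r u.      </li>. The stack holds the same state that the recursion levels would, which keeps deeply nested input from exhausting the call stack.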
 
Comments
Espen Harlinn 21-Jan-11 5:03am    
5+ sensible solution
Hemraj Gujar 21-Jan-11 5:17am    
perfect logic
Sergey Alexandrovich Kryukov 21-Jan-11 16:59pm    
Ron, I think this is a *much* better answer - my 5.
With some thought, XmlReader might fit nicely for such a forward-only parser. It will take some time to implement, but it is not really hard, and it should be much easier to maintain than a Regex.
--SA
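Here is a complete solution I put together for you in my free time.

using System.Collections.Generic;
using System.Text.RegularExpressions;

string HTMLStr = "<ul><li><u><em><strong>Hi    </strong></em></u></li><li><u><em><strong>Hello    </strong></em></u></li>    <li><u><em><strong>How r u.      </li></ul>";

// Collect every tag together with its position in the string.
Regex regex = new Regex("<[^>]*>");
MatchCollection collection = regex.Matches(HTMLStr);
List<coll> list = new List<coll>();
foreach (Match match in collection)
{
    list.Add(new coll() { POS = match.Index, TAG = match.Value });
}

// Each pass removes one opening tag and one closing tag with the same name
// from the list, so at most Count / 2 passes are needed.
// Tags are paired by name only; nesting is not checked.
for (int i = 0; i < collection.Count / 2; i++)
{
    bool temp = false;
    foreach (coll col in list)
    {
        if (!col.TAG.StartsWith("</"))
        {
            foreach (coll col1 in list)
            {
                if (col1.TAG.StartsWith("</")
                    && col.TAG.Replace(" ", "") == col1.TAG.Replace("/", "").Replace(" ", ""))
                {
                    list.Remove(col);
                    list.Remove(col1);
                    temp = true;
                    break;
                }
            }
        }
        if (temp)
            break;
    }
}

// Whatever is left in the list has no partner. Remove those tags from the
// string, last to first, so the stored positions of the earlier tags stay valid.
for (int i = list.Count - 1; i >= 0; i--)
{
    HTMLStr = HTMLStr.Remove(list[i].POS, list[i].TAG.Length);
}


The coll class:

C#
public class coll
{
    public int POS { get; set; }    // index of the tag in the original string
    public string TAG { get; set; } // full tag text, e.g. "<strong>"
}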
 
Comments
venkatrao palepu 21-Jan-11 8:57am    
Thank you so much for spending your valuable time on this for me.
Hiren solanki 21-Jan-11 9:00am    
You are welcome. I was free, so I thought I'd help.
Hiren solanki 21-Jan-11 9:54am    
I forgot to include the class too.

I've updated my answer accordingly.
venkatrao palepu 21-Jan-11 10:07am    
Thank you so much. It's very helpful to me.
Nageshwarraok 24-Jan-11 1:38am    
Hi Hiren Solanki,
Thank you for your valuable code; I appreciate your logic. One small suggestion:
Your code is fine up to the point where it collects the tags that have no closing tag. The thing to watch is the removal of those opening tags. Every removal shortens the string, so when the tags are removed front to back, each stored position has to be reduced by the total length removed so far, like this:

int length = 0;
foreach (coll col in list)
{
    HTMLStr = HTMLStr.Remove(col.POS - length, col.TAG.Length);
    length += col.TAG.Length;
}


Thank you

Regards
NageshwarRao Kolagani
You might want to consider using the Html Agility Pack:

HtmlAgilityPack at CodePlex

I don't know for sure whether that library has a method that fixes HTML with missing tags, but you might want to check it out.
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


