How to remove opening tags if the closing tags are missing?

Question

Hi,

I have a small issue with this markup:

XML

<ul>
    <li><u><em><strong>Hi    </strong></em></u></li>
    <li><u><em><strong>Hello    </strong></em></u></li>
    <li><u><em><strong>How r u.      </li>
</ul>

In this code, the third list item has no closing tags (</strong></em></u>).

So what I want is to remove the opening tags (<u><em><strong>) from the third list item.

Thanks in advance.

Posted 20-Jan-11 22:11pm

venkatrao palepu

Updated 20-Jan-11 23:34pm

v4

Comments

Sandeep Mewara 21-Jan-11 4:24am

Ok. So what have you tried so far?

3 solutions

Solution 2

You might want to consider using the HTMLAgilityPack:

HTMLAgilityPack at Codeplex

I don't know for sure whether that library has a method that fixes HTML with missing tags, but you might want to check it out.

Posted 20-Jan-11 23:30pm

#realJSOP

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Rob Philpott · Accepted Answer · 2011-01-20T23:00:00

OK - this is a bit of algorithmic fun. I'll tell you how I'd go about it, but you'll have to figure out the code yourself.

You need to combine a forward-reading parser with a recursive object or function call. On every opening tag, go one level deeper into the recursion, storing the text from that point forward in your new object (or stack parameter). Terminate each level of recursion on reading any closing tag. If the closing tag matches the opening tag, include both tags in the result; otherwise return just the text between the tags, followed by the closing tag.

This way you get either the text with matching tags on both sides or the text on its own, which is what you asked for: the opening tag is removed wherever there is no closing tag. Hope that makes some sort of sense...
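
The approach described above could be sketched roughly as follows. This is only an illustration under some assumptions, not Rob's own code: an explicit stack stands in for the recursion, tags are found with a simple regular expression (so attribute values containing ">" are not handled), a short list of void elements such as br is left alone, and the names TagCleaner, RemoveUnclosedTags and VoidTags are made up for the example. It also assumes a compiler that supports tuple syntax (C# 7 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagCleaner
{
    // Group 1 is "/" for a closing tag; group 2 is the tag name.
    private static readonly Regex TagPattern =
        new Regex(@"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>");

    // Elements that never have a closing tag (assumed, non-exhaustive list).
    private static readonly HashSet<string> VoidTags =
        new HashSet<string> { "br", "hr", "img", "input", "meta", "link" };

    public static string RemoveUnclosedTags(string html)
    {
        var pieces = new List<string>();                  // text and tags, in order
        var open = new Stack<(string Name, int Index)>(); // currently open tags
        int last = 0;

        foreach (Match m in TagPattern.Matches(html))
        {
            pieces.Add(html.Substring(last, m.Index - last)); // text before the tag
            pieces.Add(m.Value);                              // the tag itself
            last = m.Index + m.Length;

            string name = m.Groups[2].Value.ToLowerInvariant();
            bool isClosing = m.Groups[1].Value == "/";
            bool selfClosing = m.Value.EndsWith("/>", StringComparison.Ordinal);
            if (selfClosing || VoidTags.Contains(name))
                continue;

            if (!isClosing)
            {
                open.Push((name, pieces.Count - 1)); // go one level deeper
                continue;
            }

            // Closing tag: check whether any open tag has the same name.
            bool hasMatch = false;
            foreach (var entry in open)
            {
                if (entry.Name == name) { hasMatch = true; break; }
            }
            if (!hasMatch)
                continue; // stray closing tag: left untouched in this sketch

            // Every tag opened after the match was never closed: blank it out.
            while (open.Peek().Name != name)
                pieces[open.Pop().Index] = "";
            open.Pop(); // the matching opening tag stays in the output
        }

        pieces.Add(html.Substring(last));  // trailing text
        while (open.Count > 0)             // tags still open at the end
            pieces[open.Pop().Index] = "";

        return string.Concat(pieces);
    }
}
```

Calling TagCleaner.RemoveUnclosedTags on the markup from the question should leave the first two list items as they are and strip only the u, em and strong opening tags from the third one.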

Hiren solanki · Accepted Answer · 2011-01-21T01:37:00

Solution 3

Here is a complete working example that I put together in my free time.

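// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
string HTMLStr = "<ul><li><u><em><strong>Hi    </strong></em></u></li><li><u><em><strong>Hello    </strong></em></u></li>    <li><u><em><strong>How r u.      </li></ul>";

// Collect every tag together with its position in the original string.
Regex regex = new Regex("\\<[^\\>]*\\>");
MatchCollection collection = regex.Matches(HTMLStr);
List<coll> list = new List<coll>();
foreach (Match match in collection)
{
    list.Add(new coll() { POS = match.Index, TAG = match.Value });
}

// Each pass removes one opening/closing pair from the list,
// so at most Count / 2 passes are needed.
for (int i = 0; i < collection.Count / 2; i++)
{
    bool temp = false; // set to true once a pair has been removed in this pass
    foreach (coll col in list)
    {
        if (!col.TAG.Contains("/"))
        {
            foreach (coll col1 in list)
            {
                if (col1.TAG.Contains("/"))
                {
                    // Compare tag names only: ignore the slash and any spaces.
                    if (col.TAG.Replace(" ", "") == col1.TAG.Replace("/", "").Replace(" ", ""))
                    {
                        list.Remove(col);
                        list.Remove(col1);
                        temp = true;
                        break;
                    }
                }
            }
        }
        if (temp)
            break;
    }
}

// Whatever is left in the list has no partner; remove those tags from the string.
// Note: POS is an offset into the original string, so every removal shifts the
// tags that follow it (see the comments below).
foreach (coll col in list)
{
    HTMLStr = HTMLStr.Remove(col.POS, col.TAG.Length);
}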

Class coll

C#

public class coll
{
    public int POS { get; set; }     // position of the tag in the original string
    public string TAG { get; set; }  // the tag text, e.g. "<strong>"
}

Posted 21-Jan-11 1:37am

Hiren solanki

Updated 21-Jan-11 3:54am

v2

Comments

venkatrao palepu 21-Jan-11 8:57am

Thank you so much for spending your valuable time on this.

Hiren solanki 21-Jan-11 9:00am

You are welcome. I had some free time, so I thought I would help.

Hiren solanki 21-Jan-11 9:54am

I had forgotten to include the class too.

I've updated my answer accordingly.

venkatrao palepu 21-Jan-11 10:07am

Thank you so much. It's very helpful for me.

Nageshwarraok 24-Jan-11 1:38am

Hi Hiren Solanki,
Thank you for your valuable code; I appreciate your logic. One small suggestion:
Your code works correctly up to the point where it collects the tags that have no closing tag. The problem occurs when you remove those opening tags from the string.
Each removal shortens the string, so the stored position of every later tag has to be reduced by the total length removed so far, like this:

int length = 0;
foreach (coll col in list)
{
    HTMLStr = HTMLStr.Remove(col.POS - length, col.TAG.Length);
    length = length + col.TAG.Length;
}

Thank you

Regards
NageshwarRao Kolagani

Hiren solanki 24-Jan-11 1:41am

I am only removing items from the list, not from the original string, so each position stays the same as when the tag was first found. That makes it easy to remove the tag from the string, sir.

Nageshwarraok 24-Jan-11 1:49am

Hi Hiren Solanki,

But the following code does remove from the original string:

foreach (coll col in list)
{
    HTMLStr = HTMLStr.Remove(col.POS, col.TAG.Length);
}

Thank you

Regards

NageshwarRao Kolagani

Hiren solanki 24-Jan-11 1:50am

Oh, I missed that.

Thanks sir. I will update my code accordingly.

Nageshwarraok 24-Jan-11 1:54am

Hi Hiren Solanki,

Thank you so much. Your code and logic helped me a lot.

Thank You,

Regards
NageshwarRao Kolagani.

Hiren solanki 24-Jan-11 1:55am

Glad it helped you.
If you correct my code using your suggestion, please post it for me too.

Thanks.
Hiren Solanki

Nageshwarraok 24-Jan-11 2:12am

I have not changed your code much. Only this part of your code:

foreach (coll col in list)
{
    HTMLStr = HTMLStr.Remove(col.POS, col.TAG.Length);
}

changed to:

int tagLength = 0;
foreach (coll col in list)
{
    HTMLStr = HTMLStr.Remove(col.POS - tagLength, col.TAG.Length);
    tagLength = tagLength + col.TAG.Length;
}

I am updating the position on each removal of a tag from the original string.

Thank you

Regards
NageshwarRao Kolagani
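
An alternative to tracking the running offset, shown here only as a sketch: remove the tags starting from the highest position, so that each removal leaves all earlier positions untouched and no adjustment is needed. TagRemoval and RemoveSpans are hypothetical names, the positions are assumed to be non-overlapping offsets into the original string, and tuple syntax (C# 7 or later) is assumed.

```csharp
using System;
using System.Collections.Generic;

public static class TagRemoval
{
    // Removes each (position, length) span from text.
    // Positions refer to the original, unmodified string.
    public static string RemoveSpans(string text, IEnumerable<(int Pos, int Length)> spans)
    {
        var ordered = new List<(int Pos, int Length)>(spans);

        // Highest position first: removing a span never moves anything before it.
        ordered.Sort((a, b) => b.Pos.CompareTo(a.Pos));

        foreach (var span in ordered)
        {
            text = text.Remove(span.Pos, span.Length);
        }
        return text;
    }
}
```

With the coll list from Solution 3, the spans would be each remaining (POS, TAG.Length) pair; the running-offset version in the comments above should give the same result as long as the list is in ascending position order.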