Problem in consecutive node checking program with consecutive attributes?

Question

I'm trying to find runs of three or more consecutive <xref ref-type="bibr" rid="ref...">...</xref> nodes in a file, where the nodes are separated only by a comma or a space, and write them to a log file.

NOTE: The consecutive nodes I'm trying to identify should have rid values whose numeric part (the rid without the "ref" prefix) increases by 1 from one node to the next. Here is a small sample file: https://codeshare.io/5wOjlK

The desired output is:

XML

<xref ref-type="bibr" rid="ref2">[2]</xref>, <xref ref-type="bibr" rid="ref3">[3]</xref>, <xref ref-type="bibr" rid="ref4">[4]</xref>

XML

<xref ref-type="bibr" rid="ref11">[11]</xref>, <xref ref-type="bibr" rid="ref12">[12]</xref> <xref ref-type="bibr" rid="ref13">[13]</xref>

The code that I'm using is at https://codeshare.io/ar6mPA but it fails with a "DTD not found" type of error. How do I ignore that? I tried the code below.

What I have tried:

C#

FileStream xmlStream = new FileStream(@"D:\test\12345.XML", FileMode.Open, FileAccess.Read);
XmlReaderSettings settings = new XmlReaderSettings();
settings.XmlResolver = null;
settings.ProhibitDtd = false;
XmlReader reader = XmlTextReader.Create(xmlStream, settings);
XmlDocument doc = new XmlDocument();
doc.Load(reader);

instead of

C#

XmlDocument doc = new XmlDocument();
doc.PreserveWhitespace = true;
doc.Load(@"D:\test\12345.XML");

But now it shows only the first match. I'm confused. Can anyone help, please?

Posted 28-Feb-18 13:51pm

Member 12692000

Updated 28-Feb-18 21:03pm

Maciej Los

1 solution

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Maciej Los · Answer 1 · 2018-02-28T21:03:00

I prefer to use the XDocument class, which is very flexible when you need to implement a custom search method. See:

C#

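using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Allow the DOCTYPE declaration instead of rejecting it.
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Parse;

XDocument xdoc;
using (XmlReader reader = XmlReader.Create("fullfilename.xml", settings))
{
    xdoc = XDocument.Load(reader);
}

// Group every xref element by its parent and number the members of each group from 1.
// This counts all xref children of a parent; it does not check that they are adjacent
// or that their rid values are consecutive.
var cons = xdoc.Descendants("xref")
    .GroupBy(x => x.Parent)
    .Select(grp => new
        {
            Parent = grp.Key,
            ConsecutiveNodes = grp.Select((n, i) => new
                {
                    Index = i + 1,
                    Node = n
                }),
            Count = grp.Count()
        })
    .ToList();

Console.WriteLine("3 or more consecutive nodes:");
foreach (var o in cons)
{
    if (o.Count > 2)
    {
        // Show the first and last 15 characters of the parent element as context.
        string parentText = o.Parent.ToString();
        Console.WriteLine("{0}", new string('=', 30));
        Console.WriteLine("Found in: {0} ... {1}", parentText.Substring(0, 15), parentText.Substring(parentText.Length - 15, 15));
        Console.WriteLine("{0}", new string('-', 50));
        foreach (var c in o.ConsecutiveNodes)
        {
            //Console.WriteLine("{0}", c.Node);
            Console.WriteLine("Original rid value [{0}] will be replaced with [{1}]", c.Node.Attribute("rid").Value, c.Index);
            // Overwrite rid with the node's position inside its group (1, 2, 3, ...).
            c.Node.Attribute("rid").Value = c.Index.ToString();
        }
    }
}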

The above code displays:

3 or more consecutive nodes:
==============================
Found in: <p>In this stud ... 15]</xref>.</p>
--------------------------------------------------
Original rid value [ref2] will be replaced with [1]
Original rid value [ref3] will be replaced with [2]
Original rid value [ref4] will be replaced with [3]
Original rid value [ref20] will be replaced with [4]
Original rid value [ref3] will be replaced with [5]
Original rid value [ref15] will be replaced with [6]
==============================
Found in: <p>The measurin ... cattering..</p>
--------------------------------------------------
Original rid value [ref11] will be replaced with [1]
Original rid value [ref12] will be replaced with [2]
Original rid value [ref13] will be replaced with [3]
Original rid value [ref4] will be replaced with [4]
Original rid value [T2] will be replaced with [5]
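
Note that the grouping above counts every xref child of a paragraph, which is why the listing also contains ref20, ref15 and T2. A minimal sketch of detecting only the runs described in the question (three or more bibr xref siblings whose rid numbers rise by 1, with nothing but commas or whitespace between them) might look like the following. The class XrefRuns and its helpers RefNumber and OnlySeparatorBetween are hypothetical names introduced here, and the code assumes that every rid of interest has the form "ref" followed by digits.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class XrefRuns
{
    // Hypothetical helper: returns 12 for rid="ref12", or null for other rids such as "T2".
    static int? RefNumber(XElement xref)
    {
        Match m = Regex.Match((string)xref.Attribute("rid") ?? "", @"^ref(\d+)$");
        return m.Success ? int.Parse(m.Groups[1].Value) : (int?)null;
    }

    // True when everything between siblings a and b is text made only of commas and whitespace.
    static bool OnlySeparatorBetween(XElement a, XElement b)
    {
        return a.NodesAfterSelf()
                .TakeWhile(n => n != b)
                .All(n => n is XText t && t.Value.All(ch => ch == ',' || char.IsWhiteSpace(ch)));
    }

    // Returns every run of at least minLength bibr xref siblings with rid numbers n, n+1, n+2, ...
    public static List<List<XElement>> Find(XContainer root, int minLength = 3)
    {
        var runs = new List<List<XElement>>();
        foreach (XElement parent in root.Descendants("xref").Select(x => x.Parent).Distinct())
        {
            var run = new List<XElement>();
            var bibrRefs = parent.Elements("xref")
                                 .Where(x => (string)x.Attribute("ref-type") == "bibr");
            foreach (XElement x in bibrRefs)
            {
                // The run only ever holds elements with a numeric rid, so the last number is never null.
                bool continues = run.Count > 0
                    && RefNumber(x) == RefNumber(run[run.Count - 1]) + 1
                    && OnlySeparatorBetween(run[run.Count - 1], x);
                if (!continues)
                {
                    if (run.Count >= minLength) runs.Add(run);
                    run = new List<XElement>();
                }
                if (RefNumber(x) != null) run.Add(x);
            }
            if (run.Count >= minLength) runs.Add(run);
        }
        return runs;
    }
}
```

Calling XrefRuns.Find(xdoc) returns one list of elements per run, and each element's ToString() could then be written to the log file. Unlike the grouping above, this sketch leaves the rid attributes untouched.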

For further information, please see:
XDocument.Load Method (XmlReader) (System.Xml.Linq)
XmlReaderSettings.DtdProcessing Property (System.Xml)
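
As for the "DTD not found" error itself, one option is to allow the DOCTYPE declaration but never fetch the external .dtd file, which is what setting XmlResolver to null in the question does. The sketch below combines that with LoadOptions.PreserveWhitespace, on the assumption that the separators between the xref nodes matter to the search. The class XmlLoading and the method LoadWithoutExternalDtd are hypothetical names introduced here. If the document uses entities that are declared only in the external DTD, loading will probably still fail.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class XmlLoading
{
    // Hypothetical helper: loads the file without trying to locate its external DTD.
    public static XDocument LoadWithoutExternalDtd(string path)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse, // accept the DOCTYPE declaration
            XmlResolver = null                   // do not resolve the external DTD file
        };

        using (FileStream stream = File.OpenRead(path))
        using (XmlReader reader = XmlReader.Create(stream, settings))
        {
            // Keep whitespace-only text, such as a single space between two xref nodes.
            return XDocument.Load(reader, LoadOptions.PreserveWhitespace);
        }
    }
}
```

Note that the second snippet in the question also drops doc.PreserveWhitespace = true, which may be part of the reason the results changed after switching to the reader.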

Feel free to change code to your needs. Good luck!