Click here to Skip to main content
15,884,629 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
I have been researching XElement for a large data import. The data is returned from an API in record format like this:

XML
<record>
  <datestored>2020-09-29</datestored>
  <excludefromtranscripts>0</excludefromtranscripts>
  <gradescale_name>Secondary Numeric Scale</gradescale_name> 
  <gpa_addedvalue>0</gpa_addedvalue>
  <ispotentialcrhrsfromgb>0</ispotentialcrhrsfromgb> 
  <excludefromgraduation>0</excludefromgraduation>
  <transaction_date>2020-09-29 10:55:49.586</transaction_date>
  <absences>0</absences><local_id>123456</local_id>
  <course_number>XAM200</course_number>
  <course_name>ENGLISH REGENTS</course_name>
  <earnedcrhrs>0</earnedcrhrs>
  <credit_type>EXAM</credit_type>
  <gradereplacementpolicy_id>0</gradereplacementpolicy_id>
  <codeid>R8</codeid>
  <schoolname>Some School Name here</schoolname>
  <termid>3000</termid>
  <grade_level>12</grade_level>
  <grade>E</grade>
  <excludefromgradesuppression>0</excludefromgradesuppression>
  <gpa_points>0</gpa_points>
  <excludefromgpa>1</excludefromgpa>
  <excludefromclassrank>1</excludefromclassrank>
  <sectionid>0</sectionid>
  <percent>0</percent>
  <excludefromhonorroll>1</excludefromhonorroll>
  <potentialcrhrs>0</potentialcrhrs>
  <tardies>0</tardies>
  <replaced_dcid>0</replaced_dcid>
  <gpa_custom2>0</gpa_custom2>
  <isearnedcrhrsfromgb>0</isearnedcrhrsfromgb>
  <schoolid>6</schoolid>
</record>


Since these structures and record sets are so large and not well formatted, xPath, DOM and ADO go to sleep almost entirely chewing on these datasets. This RDMS data is literally vomited at me from the source system and the 7 tables are being sent back as individual documents Asynchronously, of course.

These structure vary a little??? what I am concerned with is this:
new XElement("isearnedcrhrsfromgb", iNode["isearnedcrhrsfromgb"].InnerText)
and naturally going forward I would love that I not have to touch this code to bring in a file that might differ even more.

This is all paged data, so one query might make 45 calls to the API at 2,000 records per request. I have one table that returns 420,000 records with a pagesize set at 10,000 it makes 43 trips to the data source. This code block is reconstituting the data table after a good response from the RestResponse.

I am searching for a way that I can make these columns to where I can drive the names of fields through an XML driver file that I constructed, something similar to this:

string[] mappedfields = ParserMap["ResopnseData"].InnerText.Split(',')
XElement[] MappedData = new XElement[mappedfields.Length]
int idx = -1
foreach (string MF in mappedfields)
{ XElement[++idx] = new XElement(MF, iNode[MF].InnerText) }
XElement DataResponse = new XElement(MainElementName, MappedData)

I think you get the gist of what I am looking to do?!

What I have tried:

C#
XElement DataResponse = new XElement(MainElementName, 
     new XAttribute("MPDisplayName",   iNode["storecode"].InnerText),
     new XAttribute("excludefromGPA",  iNode["excludefromgpa"] == null ? "False" : 
                         (iNode["excludefromgpa"].InnerText == "1").ToString()),
     new XAttribute("CalendarEventID", iNode["termid"].InnerText),
     new XElement("LocalId",      iNode["local_id"].InnerText),
     new XElement("SectionID",    iNode["sectionid"].InnerText),
     new XElement("CourseNumber", iNode["course_number"].InnerText),
     new XElement("CourseName",   iNode["course_name"] == null ? "" : 
                                  iNode["course_name"].InnerText),
     new XElement("CreditType",   iNode["credit_type"] == null ? "" : 
                                  iNode["credit_type"].InnerText),
     new XElement("Grade",        iNode["grade"] == null ? "" : 
                                  iNode["grade"].InnerText),
     new XElement("GradeLevel",   iNode["grade_level"] == null ? "" : 
                                  iNode["grade_level"].InnerText),
     new XElement("EarnedCrHrs",  iNode["earnedcrhrs"] == null ? "" : 
                                  iNode["earnedcrhrs"].InnerText),
     new XElement("GPA_Points",   iNode["gpa_points"] == null ? ""  : 
                                  iNode["gpa_points"].InnerText),
     new XElement("Percent",      iNode["percent"] == null ? ""     : 
                                  iNode["percent"].InnerText),
     new XElement("PeriodAtttendance", 
                   new XElement("Absent", iNode["absences"].InnerText),
                   new XElement("Tardy",  iNode["tardies"].InnerText)),
     new XElement("Properties", 
         new XElement("StudentID",      iNode["studentid"].InnerText),
         new XElement("GPA_AddedValue", 
                    iNode["gpa_addedvalue"] == null ? "False" : 
                    (iNode["gpa_addedvalue"].InnerText == "1").ToString()),
         new XElement("isPotentialcrhrsfromGB", 
         iNode["ispotentialcrhrsfromgb"] == null ? "False" :
         (iNode["ispotentialcrhrsfromgb"].InnerText == "1").ToString()),
         new XElement("excludefromGraduation", 
         iNode["excludefromgraduation"] == null ? "False" : 
         (iNode["excludefromgraduation"].InnerText == "1").ToString()),
         new XElement("excludefromClassRank", 
         iNode["excludefromclassrank"] == null ? "False" : 
         (iNode["excludefromclassrank"].InnerText == "1").ToString()),
         new XElement("excludefromHonorRoll", 
         iNode["excludefromhonorroll"] == null ? "False" :
         (iNode["excludefromhonorroll"].InnerText == "1").ToString()),
         new XElement("excludefromGradeSuppression", 
         iNode["excludefromgradesuppression"] == null ? "False" :
         (iNode["excludefromgradesuppression"].InnerText == "1").ToString()),
         new XElement("isEarnedcrhrsfromGB", 
         iNode["isearnedcrhrsfromgb"] == null ? "False" : 
         (iNode["isearnedcrhrsfromgb"].InnerText == "1").ToString()),
         new XElement("DateStored", 
            iNode["datestored"] == null ? "" :
            DateTime.Parse(iNode["datestored"].InnerText).ToShortDateString()),
         new XElement("TransactionDate", 
            iNode["transaction_date"] == null ? "" :
            DateTime.Parse(iNode["transaction_date"].InnerText).ToString())),
         new XElement("Comments", 
             new XElement("Comment", 
             iNode["comment_value"] == null ? "" :
             iNode["comment_value"].InnerText.Replace("\r\n", ". ").Replace("\n", ". "))));

XmlNode element = 
    RespDoc.OwnerDocument.ReadNode(DataResponse.CreateReader()) as XmlNode;
RespDoc.AppendChild(element);

And this rips through 2,000 records in nano-seconds!
Posted
Updated 1-Jun-21 11:45am

Create an entity that matches the structure, and then use the built-in XML deserialize functionality in .Net. Deserializing only requires a few lines of code to implement once you'd created an appropriate entity in which to consume the data.It will also handle the possibility of null/empty data if you build your entity correctly.
 
Share this answer
 
v2
Comments
Member 13735228 1-Jun-21 10:55am    
Can you expand on the Deserializer? Sounds intriguing!
#realJSOP 3-Jun-21 5:11am    
It's too big of a topic to cover in QA. Google "c# XML serialize/deserialize".
Member 13735228 1-Jun-21 11:00am    
There is so much null checking because the source data may or may not return a result for any one of the elements in question. Besides, when 5 API returns run through this, you never know what might return!?!?!?!?!
What's wrong with something like this?
C#
string[] mappedfields = ParserMap["ResopnseData"].InnerText.Split(',');
XElement DataResponse = new XElement(MainElementName);
foreach (string MF in mappedfields)
{
    DataResponse.Add(new XElement(MF, iNode[MF].InnerText));
}
 
Share this answer
 
Comments
Member 13735228 1-Jun-21 10:54am    
The problem would be to create the XAttributes and the embedded XElements.
Richard Deeming 1-Jun-21 10:56am    
Well, perhaps if you explain how you'd know whether the mapped field was an element, an attribute, or a nested element, someone might be able to help you. :)
Member 13735228 1-Jun-21 12:37pm    
I'm making this file from another file that is a mere data dump. As I bring the data in from the RestSharp Response (which is deserialized from a JSon array), I am passing that response document into an internal object that then is attempting to parse the data into a well formatted XML DOM Document that xPath can place into a flattened structure for reporting. All I'm trying to do is transform this data into something xPath can navigate. Because, if I leave this document as a huge data blob (117MB), DOM goes comatose when attempting to parse the data. I'm just looking to data drive the import so that can add Fields for collection or take away fields when performance is required.
Member 13735228 1-Jun-21 17:45pm    
I owe you an apology... Sorry for my deft response.
{ foreach (XmlNode iNode in xdoc.ChildNodes[0].ChildNodes)
              { string MainElementName = InList(getSchedule.Replace("GRADES.", ""), new string[] { "ReportCard", "ProgressReports" }) ? "MPGrades" : getSchedule.Replace("QUERY.", "");
                XElement DocDeserializer = new XElement(MainElementName, "Something");
                XmlElement Deserializer = (XmlElement)Parameters.SelectSingleNode($"PaserMap/SRC.Rules/Deserializer/{MainElementName}");
                XElement[] MainElement = new XElement[1];
                XAttribute[] XAttributes = new XAttribute[1];
                if (Deserializer != null && Deserializer["XElements"].HasChildNodes)
                { bool hasAttributes = Deserializer["XAttributes"] != null && Deserializer["XAttributes"].HasChildNodes;
                  MainElement = new XElement[Deserializer["XElements"].ChildNodes.Count];
                  int idx = -1;
                  if (hasAttributes)
                  { XAttributes = new XAttribute[Deserializer["XAttibutes"].ChildNodes.Count];
                    foreach (XmlElement XAttribute2Load in Deserializer["XAttibutes"])
                    { XAttributes[++idx] = new XAttribute(XAttribute2Load.Name, iNode[XAttribute2Load.InnerText].InnerText); } }

                  idx = -1;
                  foreach (XmlElement XElement2Load in Deserializer["XElements"])
                  { if (XElement2Load.HasChildNodes)
                    { XElement[] subXElements = new XElement[XElement2Load.ChildNodes.Count];
                      int subidx = -1;
                      foreach (XmlElement subEX in XElement2Load)
                      { if (subEX.HasAttribute("TestValue"))
                          subXElements[++subidx] = new XElement(subEX.Name, iNode[subEX.InnerText] == null ? subEX.Attributes["DefaultValue"].InnerText : 
                                                                                   (iNode[subEX.InnerText].InnerText == subEX.Attributes["TestValue"].InnerText).ToString());
                        if (subEX.HasAttribute("type"))
                          switch (subEX.Attributes["type"].InnerText)
                          { case "DateTime.ToShortDate":
                              subXElements[++subidx] = new XElement(subEX.Name, iNode[subEX.InnerText] == null ? subEX.Attributes["DefaultValue"].InnerText :
                                                                                   DateTime.Parse(iNode[subEX.InnerText].InnerText).ToShortDateString());
                              break;
                            case "DateTime.ToString":
                              
                              break;          } }
                      MainElement[++idx] = new XElement(XElement2Load.Name, subXElements); }
                    else MainElement[++idx] = new XElement(XElement2Load.Name, iNode[XElement2Load.InnerText].InnerText); }
                  DocDeserializer = new XElement(MainElementName, MainElement); }
 
Share this answer
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900