
A lightweight recursive text template data formatter

13 Mar 2013 - CPOL - 10 min read
Filling in text templates from a data source.

Introduction

This article provides code that will take a text-based template and perform substitutions of variables identified in the template with data values extracted from a data source. Currently supported data sources are .NET class object instances, and IDataRecord and IDataReader objects (e.g., SqlDataReader).

Background

A lot of the programs that I write for my company involve sending emails to various people for one reason or another. Quite often, these emails are data-based - that is, I pull some data from a database and then have to format that data into the email before I send it. In the past, I have achieved this by reading a text template from the DLL resources, then using string.Replace to perform substitutions for predetermined variable names. Here's what I used to do:

C#
string SalesOrderItems = GetSalesOrderItemsText();
EmailData info = GetData();
string template = GetTemplate();
string emailBody = template
                      .Replace("{AccountManagerName}",  info.AccountManagerName)
                      .Replace("{SalesOrderNumber}", info.SalesOrderNumber)
                      .Replace("{SalesOrderItems}",  SalesOrderItems); 

GetTemplate() might return some HTML like this:

HTML
{AccountManagerName}:<br />
Sales Order <b>#{SalesOrderNumber}</b> has been fulfilled.  It contains the following items:

<table>
    <tr>
        <th>Item #</th>
        <th>Part #</th>
        <th>Description</th>
        <th>Price</th>
    </tr>
    {SalesOrderItems}
</table>  

The SalesOrderItems text would have been built up in essentially the same way from a separate template: I would generate one block per item and then join the blocks with string.Join(string.Empty, SalesOrderLinesList). The template for the SalesOrderItems might have looked like this:

HTML
<tr>
    <td>{ItemNumber}</td>
    <td>{PartNumber}</td>
    <td>{Description}</td>
    <td>{Price}</td>
</tr> 

After building up this email from multiple different templates, I would use the .NET libraries to send the mail off to the right people. Despite how well this has served me over the years, there are several problems with this method:

  1. If I want to add new fields to the template, I have to add code to the .Replace series of method calls  
  2. I have to have separate code to build the SalesOrderItems that gets replaced in the main email
  3. The template must be split into multiple pieces (one for the main email, one for the repeated items) 

Therefore, I recently decided to re-code how I was handling this need, and the code provided below is the result of that effort. There are still several improvements that can be made to this code, but it handles the first few scenarios that I required.

Advantages of the new code

The new version of my template-filling code has the following benefits and advantages over the old method:

  1. Easier variable names - Instead of using the {Property} syntax described above, which I found hard to type and hard to read, I now use @Property syntax, similar to SQL variables. 
  2. Not data-structure specific - It is fully self-contained, and does not require any coding specific to the data structure.  That is, you can pass it an object of type A or an object of type B, where A and B have totally different fields / properties. The code automatically locates variable names in the template and makes the appropriate replacements (it uses .NET reflection to obtain the data value).  
  3. Multi-level data - The code is capable of traversing into data structures using the @Property.SubProperty syntax. Thus, if A has a property called Data that is a class of type B, and class B contains a property called Name, you can use @Data.Name.  It can go as deep as you need. 
  4. Data formatting - It supports formatting data by using @Variable{Width}{Format}. The width and format strings are the same as those used by .NET string.Format, and all of the built-in type formats are supported. For example, if object A contains a property called DateProcessed that is a DateTime, you can use @DateProcessed{12}{yyyy-MMM-dd} to format the output to your liking.  Both the width and format specifiers are optional - you can use both, either, or neither, but the width must come first.
  5. In-line list handling - It supports in-line repeated replacements to any realistic level. Therefore, there is no longer a need to have separate template files for each portion of the template that is built up from lists. The only requirement is that the referenced object property (or object, if using @$) that needs to be repeated implements IEnumerable or IDataReader.  
  6. In-line conditionals - Simple numeric and string conditionals are supported to output either "this" or "that".   
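
The @Variable{Width}{Format} form from item 4 lines up with the alignment and format components of a .NET composite format item, {index,alignment:format}. Here is a minimal sketch of that mapping. It is not the article's actual implementation: the WidthFormatSketch class and its Apply method are hypothetical names, and the invariant culture is an assumption made so the result doesn't depend on machine settings.

```csharp
using System;
using System.Globalization;

public static class WidthFormatSketch
{
    // Hypothetical helper: builds "{0,width:format}" from the optional
    // width and format parts of a template variable, then formats the value.
    // Assumes the format string itself contains no curly braces.
    public static string Apply(object value, int? width, string format)
    {
        string item = "{0";
        if (width.HasValue)
            item += "," + width.Value.ToString(CultureInfo.InvariantCulture);
        if (!string.IsNullOrEmpty(format))
            item += ":" + format;
        item += "}";
        return string.Format(CultureInfo.InvariantCulture, item, value);
    }
}
```

Under these assumptions, Apply(new DateTime(2013, 3, 13), 12, "yyyy-MMM-dd") right-aligns the date in a 12-character field, and a negative width left-aligns the value instead.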

Using the code

For each of the code samples below, I shall assume the following object structure:

C#
using System;
using System.Collections.Generic;
using System.Linq;   // needed for Items.Sum(...)

public class CustomerInfo
{ 
    public string Name;
}

public class PartInfo
{
    public string PartNumber;
    public string Description;
    public bool IsRestricted;
    public List<PartInfo> SimilarParts = null;
}

public class ItemInfo
{
    public PartInfo Part;
    public string Description;
    public decimal Quantity;
    public decimal UnitPrice;
    public decimal TotalPrice
    {
        get
        {
            return Quantity * UnitPrice;
        }
    }
}

public class OrderInfo
{
    public CustomerInfo Customer = null;
    public int OrderNumber;
    public List<ItemInfo> Items = null;
    public decimal TotalPrice 
    { 
        get 
        {
            return (Items != null) ? Items.Sum(i => i.TotalPrice) : 0M;
        } 
    }
} 

static void Main(string[] args)
{
    PartInfo widget1 = new PartInfo() 
    { 
        PartNumber = "ABC-001", 
        Description = "Widget #1",
        IsRestricted = true
    };
    PartInfo widget2 = new PartInfo() 
    { 
        PartNumber = "ABC-002", 
        Description = "Widget #2", 
        SimilarParts = new List<PartInfo>() { widget1 } 
    };
    PartInfo widget3 = new PartInfo() 
    { 
        PartNumber = "ABC-003", 
        Description = "Widget #3", 
        SimilarParts = new List<PartInfo>() { widget1, widget2 }
    };
    OrderInfo order = new OrderInfo()
    {
        Customer = new CustomerInfo() { Name = "Michael Bray" },
        OrderNumber = 173123,
        Items = new List<ItemInfo>() { 
            new ItemInfo() 
            { 
                Part = widget1, 
                Quantity = 2, 
                UnitPrice = 30 
            },
            new ItemInfo() 
            {
                Part = widget2, 
                Quantity = 4.5M, 
                UnitPrice = 10 
            },
            new ItemInfo() 
            {
                Part = widget3, 
                Quantity = 60, 
                UnitPrice = 4.25M 
            }
        }
    };

    // Run examples
}

Example 1: Simple data value substitution

This example shows how to perform basic substitution of variables. It is the "Hello World" example.

C#
string template = "Hello, your order #@OrderNumber has been fulfilled.  "
                + "The total price is @TotalPrice."; 
string filled = FillTemplate(template, order);

Output: Hello, your order #173123 has been fulfilled. The total price is 360.

Example 2: Multi-level data and data formatting

This example demonstrates the ability to dive into the object structure. Note how the code references the Customer.Name variable. It also demonstrates formatting a number - in this case the TotalPrice - with a format specifier "C" to cause the decimal property to display as a currency.

C#
string template = "Hello @Customer.Name, your order #@OrderNumber has been fulfilled.  "
                + "The total price is @TotalPrice{C}."; 
string filled = FillTemplate(template, order);

Output: Hello Michael Bray, your order #173123 has been fulfilled. The total price is $360.00.
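
To give a feel for how a dotted name such as @Customer.Name could be resolved with reflection, here is a minimal sketch. It is not the code from the download: MemberPathSketch, DemoCustomer and DemoOrder are hypothetical names, and the sketch assumes public instance members only, trying a property first and then a field.

```csharp
using System;
using System.Reflection;

public static class MemberPathSketch
{
    // Walks a dotted path such as "Customer.Name", looking for a public
    // instance property first and a public instance field second.
    public static object Resolve(object source, string path)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance;
        object current = source;
        foreach (string name in path.Split('.'))
        {
            if (current == null)
                return null; // a null link ends the walk early

            Type type = current.GetType();
            PropertyInfo property = type.GetProperty(name, flags);
            if (property != null)
            {
                current = property.GetValue(current, null);
                continue;
            }

            FieldInfo field = type.GetField(name, flags);
            if (field == null)
                throw new ArgumentException("No member '" + name + "' on " + type.Name);
            current = field.GetValue(current);
        }
        return current;
    }
}

// Hypothetical stand-ins for the article's CustomerInfo and OrderInfo.
public class DemoCustomer { public string Name; }
public class DemoOrder
{
    public DemoCustomer Customer;
    public int OrderNumber { get; set; }
}
```

With these types, Resolve(order, "Customer.Name") returns the customer's name, and a null Customer simply yields null rather than throwing.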

Example 3: List processing & repeated data

In order to process lists of data, you must use a special construct within the template. Let me first give an example, and then I'll describe the construct.

C#
string template = "Hello @Customer.Name, your order #@OrderNumber has been fulfilled.  "
                + "The items are:\r\n\r\n"
                + "Part Number  Description          Quantity  Unit Price  Total Price\r\n" 
                + "-----------  -------------------  --------  ----------  -----------"
                + "@Items[[#\r\n"
                + "@Part.PartNumber{-12} @Part.Description{-20} @Quantity{8}{F2} "
                + "@UnitPrice{11}{C} @TotalPrice{12}{C}#]]\r\n"
                + "                                                        -----------\r\n"
                + "                                          GRAND TOTAL: @TotalPrice{12}{C}\r\n";
string filled = FillTemplate(template, order); 

Output:

Hello Michael Bray, your order #173123 has been fulfilled.  The items are:

Part Number  Description          Quantity  Unit Price  Total Price
-----------  -------------------  --------  ----------  -----------
ABC-001      Widget #1                2.00      $30.00       $60.00
ABC-002      Widget #2                4.50      $10.00       $45.00
ABC-003      Widget #3               60.00       $4.25      $255.00                            
                                                        -----------
                                          GRAND TOTAL:      $360.00  

As might be obvious from the code above, in order to process lists of data with a "repeated template", you should use the syntax:

@PropertyName{Width}[[C  ....repeated template...  C]] 

where C is any character (in the example above, I use the # character). The character chosen for C serves as part of the closing tag that identifies the end of the repeated portion of the template. By choosing different characters for C, you can even have repeated templates inside other repeated templates!  For example, you might use something like:

C#
template = "Hello @Customer.Name, your order #@OrderNumber has been fulfilled.  "
           + "The items are:\r\n\r\n"
           + "Part Number  Description          Quantity  Unit Price  Total Price  Similar Parts\r\n" 
           + "-----------  -------------------  --------  ----------  -----------  -------------"
           + "@Items[[#\r\n"
           + "@Part.PartNumber{-12} @Part.Description{-20} @Quantity{8}{F2} @UnitPrice{11}{C} "
           + "@TotalPrice{12}{C}  @Part.SimilarParts{40}[[%@PartNumber,%]]#]]\r\n"
           + "                                                        -----------\r\n"
           + "                                          GRAND TOTAL: @TotalPrice{12}{C}\r\n";
filled = FillTemplate(template, order); 

Output:

Hello Michael Bray, your order #173123 has been fulfilled.  The items are:

Part Number  Description          Quantity  Unit Price  Total Price  Similar Parts 
-----------  -------------------  --------  ----------  -----------  -------------
ABC-001      Widget #1                2.00      $30.00       $60.00  
ABC-002      Widget #2                4.50      $10.00       $45.00  ABC-001,
ABC-003      Widget #3               60.00       $4.25      $255.00  ABC-001,ABC-002,
                                                        -----------
                                          GRAND TOTAL:      $360.00

Notice how the SimilarParts is generated by using a repeated template with [[% ... %]] inside the Items repeated template which itself uses [[# ... #]]. As long as you don't use a template delimiter character sequence inside another repeated template that uses the same delimiter character, you can essentially nest these repeated templates as far as you need to. (To be clear, the character itself can be used inside the template - it only has significance when adjacent to the ]] template termination characters.)

As with standard variables, the {Width} specifier is optional in this case, and if used will be used to format the string that is returned by the repeated template subexpression. It should only be used if you are formatting data on a single line - trying to use a {Width} for multi-line constructs such as the item list probably doesn't have much value.
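
The delimiter rule can be illustrated with a small regular expression that uses a backreference for the closing character. This is only a sketch of the idea, assuming a simplified pattern; DelimiterSketch and FirstBody are hypothetical names, and the real evaluator expression is the one shown under "Points of Interest".

```csharp
using System;
using System.Text.RegularExpressions;

public static class DelimiterSketch
{
    // "[[", any single delimiter character C, a lazily matched body, then the
    // same character C (via backreference) immediately followed by "]]".
    private static readonly Regex Block = new Regex(
        @"\[\[(?<C>.)(?<Body>.+?)\k<C>\]\]", RegexOptions.Singleline);

    // Returns the body of the first [[C ... C]] block, or null if none is found.
    public static string FirstBody(string template)
    {
        Match match = Block.Match(template);
        return match.Success ? match.Groups["Body"].Value : null;
    }
}
```

Because the outer block only closes on #]], an inner [[% ... %]] block survives intact inside the outer body and can be processed in a second pass. A lone # inside the body is harmless as long as it is not directly followed by ]].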

Example 4: Conditionals

Conditionals are implemented with a syntax similar to that of repeated data:

@?[[C <Conditional> C]][[C <TRUE expression> C]][[C <OPTIONAL FALSE Expression> C]] 

Note that the conditional must use @? as the prefix, and have at least two subsequent clauses, one for the actual conditional, and one for the expression to output if the conditional is true. It can also be followed by an optional clause to be output if the conditional is false.

Currently, only very simple conditionals are implemented. A conditional must be either a numeric (decimal) or a string comparison using one of the operators >, >=, <, <=, ==, or !=. The conditional is first evaluated as a numeric comparison, and if that fails (that is, if the expressions on both sides of the operator cannot be converted into decimal numbers) then the comparison continues as a case-sensitive string comparison. No compound expressions (using && (AND) or || (OR), for example) are currently allowed. White-space around the expressions in the conditional is ignored.

C#
template = "Parts List:\r\n"
    + "Part Number    Description            Is Restricted\r\n"
    + "-------------  ---------------------  -------------\r\n"
    + "@$[[% @PartNumber{-12}  @Description{-20}  "
    + "@?[[# @IsRestricted == True #]][[# YES #]][[# NO #]] \r\n%]]";
filled = FillTemplate(template, new List<PartInfo>() { widget1, widget2, widget3 });

Output: 

Parts List: 
Part Number    Description            Is Restricted
-------------  ---------------------  -------------
 ABC-001       Widget #1              YES   
 ABC-002       Widget #2              NO   
 ABC-003       Widget #3              NO   
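
The "numeric first, then string" rule can be sketched roughly as follows. This is an illustration rather than the article's code: ConditionalSketch is a hypothetical name, and the sketch assumes that variables have already been substituted (so @IsRestricted has become True or False), that numbers are parsed with the invariant culture, and that the string comparison is ordinal.

```csharp
using System;
using System.Globalization;

public static class ConditionalSketch
{
    // Two-character operators come first so ">=" is not mistaken for ">".
    private static readonly string[] Operators = { ">=", "<=", "==", "!=", ">", "<" };

    public static bool Evaluate(string condition)
    {
        foreach (string op in Operators)
        {
            int index = condition.IndexOf(op, StringComparison.Ordinal);
            if (index < 0)
                continue;

            // White-space around both operands is ignored.
            string left = condition.Substring(0, index).Trim();
            string right = condition.Substring(index + op.Length).Trim();
            return Test(Compare(left, right), op);
        }
        throw new FormatException("No comparison operator in: " + condition);
    }

    // Numeric comparison when both sides parse as decimal,
    // otherwise a case-sensitive (ordinal) string comparison.
    private static int Compare(string left, string right)
    {
        decimal l, r;
        if (decimal.TryParse(left, NumberStyles.Number, CultureInfo.InvariantCulture, out l) &&
            decimal.TryParse(right, NumberStyles.Number, CultureInfo.InvariantCulture, out r))
        {
            return l.CompareTo(r);
        }
        return string.CompareOrdinal(left, right);
    }

    private static bool Test(int comparison, string op)
    {
        switch (op)
        {
            case ">=": return comparison >= 0;
            case "<=": return comparison <= 0;
            case "==": return comparison == 0;
            case "!=": return comparison != 0;
            case ">": return comparison > 0;
            default: return comparison < 0; // "<"
        }
    }
}
```

The numeric-first order matters: "10 > 9" is true as a decimal comparison, whereas a plain string comparison would put "10" before "9".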

Minor Tidbits

There are a few other things of note:

  1. The code will attempt to locate both properties and fields. The variable name is probably case-sensitive, since .NET reflection lookups are case-sensitive unless BindingFlags.IgnoreCase is specified.
  2. There is a special variable called @$ which will return the object itself instead of trying to find a field or property. This might be handy in a few circumstances, for example if you have an override on .ToString() that you want to use as the output, or if you want to use a List as the data source (since the list itself doesn't have a property or field to access the items.)   Then, to access the list use a construct similar to:  Users: \r\n@$[[#@FirstName @LastName  --  @Email\r\n#]] 
  3. The code is designed so that it can look at both instance and static properties and fields, but I haven't tested it with static properties or fields. Similarly, it is coded so that it can access both public and non-public properties and fields, but I haven't tested non-public properties or fields. I recommend sticking to public instance fields or public instance properties.  
  4. In addition to object-based data sources (the primary focus of this article), the code can handle IDataRecord and IDataReader objects such as SqlDataReader. (Note that SqlDataReader is in fact both an IDataRecord and an IDataReader!)
    1. If the object passed is only an IDataRecord, it is treated as a single instance of a class would be. However, with an IDataRecord, multi-level data and repeated templates (single or nested) are not available, since the data structure provided by IDataRecord simply doesn't implement those concepts.
    2. If an IDataReader is passed, then it will iterate over all of the records passed, so repeated templates are available. Nested repeated templates are not available, since each iteration is treated as an IDataRecord. If you use a repeated template, you must use @$ as the main variable name, similar to the syntax described in point #2.  
  5. The implementation recursively evaluates all detected repeater elements first, then conditionals. By doing this recursively, there shouldn't be any conflicts with "conditionals inside repeaters" or "repeaters inside conditionals" as long as you honor the nesting character issue (that is, don't use [[C C]] inside another one that uses the same delimiting C character, even if one is a repeater and one is a conditional.)   The demo project has a more complicated example where several levels of nesting are used. (Specifically, it demonstrates a conditional that contains a repeater that contains a conditional.)
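
For readers who want to try the IDataReader path from point 4 without a database, here is a small sketch. It is an assumption-laden illustration, not part of the download: DataReaderSketch, FormatRows and SampleUsers are hypothetical names, the sample rows are made up, and an in-memory DataTable stands in for a SqlDataReader.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class DataReaderSketch
{
    // Reads every record from the reader and formats each one, much as a
    // repeated template on @$ would. IDataReader extends IDataRecord, so the
    // reader itself serves as the "current record" after each Read().
    public static List<string> FormatRows(IDataReader reader, Func<IDataRecord, string> formatRow)
    {
        var lines = new List<string>();
        while (reader.Read())
            lines.Add(formatRow(reader));
        return lines;
    }

    // Hypothetical sample data so the sketch runs without a database.
    public static DataTable SampleUsers()
    {
        var table = new DataTable("Users");
        table.Columns.Add("FirstName", typeof(string));
        table.Columns.Add("LastName", typeof(string));
        table.Columns.Add("Email", typeof(string));
        table.Rows.Add("Jane", "Doe", "jane@example.com");
        table.Rows.Add("John", "Roe", "john@example.com");
        return table;
    }
}
```

Calling SampleUsers().CreateDataReader() gives an IDataReader that can be passed to FormatRows, with each column read by name through the record's indexer (for example r["FirstName"]). Each record is flat, which is why nested repeated templates have nothing to iterate over.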

Points of Interest

The code uses a single .NET Regular Expression to locate variables in the template that it needs to evaluate and replace, and a second one to locate conditionals. The evaluator regular expression is a bit wild, but not too difficult to understand:

(?<VarName>@(\w+(\.\w+)*|\$))({(?<Width>-?\d+)})?({(?<Format>.+?)}|
    ((?<Open>\[{2}(?<CloseC>.))).+?(?<SubExpr-Open>\k<CloseC>\]{2}))?   

I'll break it down into individual pieces:

(?<VarName>@(\w+(\.\w+)*|\$))  

This part locates a variable name consisting of either the $ special variable, or a repeated set of Property.SubProperty.SubProperty... Note in particular that the '.' characters must be contained within (surrounded by) alphanumeric characters - a period at the end of a variable name won't be matched (and it shouldn't). This allows you to put a variable name at the end of a sentence, as in Example #1 above.
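
A quick way to see this behavior is to run just the variable-name portion against a sentence. The wrapper below is a sketch; VarNameSketch and Find are hypothetical names, while the pattern is the one shown above.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class VarNameSketch
{
    // The VarName portion of the evaluator expression, on its own.
    private static readonly Regex VarName = new Regex(@"(?<VarName>@(\w+(\.\w+)*|\$))");

    // Returns every variable reference found in the template, in order.
    public static List<string> Find(string template)
    {
        var names = new List<string>();
        foreach (Match match in VarName.Matches(template))
            names.Add(match.Groups["VarName"].Value);
        return names;
    }
}
```

For the input "The total price is @TotalPrice. Hello @Customer.Name, list: @$!" this finds @TotalPrice, @Customer.Name and @$, leaving the sentence punctuation alone. Note that the same pattern also matches the "@example.com" part of an e-mail address written directly in the template.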

({(?<Width>-?\d+)})? 

This part finds an optional .NET style width specifier. It must be an integer (a width of 0 is accepted by the .NET formatting mechanism, but it adds no padding and so has no useful meaning.)

For the last part, I will split this portion apart slightly so that it's easier to see what's happening... (In other words, ignore the line-breaks and other white-space!)

(
    {(?<Format>.+?)}
|   ((?<Open>\[{2}(?<CloseC>.))).+?(?<SubExpr-Open>\k<CloseC>\]{2})
)?  

This part finds either a format string contained inside curly braces, OR ELSE it finds a matching-delimited subexpression indicated by the [[C ... C]] syntax discussed in the "Using the Code" example #3. Note that this implies that it is invalid to have both a format specifier and a subexpression. This restriction is by design because there aren't any valid format specifiers for strings in .NET (although you can specify width).
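
To see how the width and format groups come out for a few of the variables used in the examples, the full expression can be exercised like this. It is a sketch: EvaluatorRegexSketch and Describe are hypothetical names, and the literal curly braces are escaped here (an adaptation of the expression above, which should match the same text).

```csharp
using System.Text.RegularExpressions;

public static class EvaluatorRegexSketch
{
    // The evaluator expression from above, with the literal braces escaped.
    private static readonly Regex Evaluator = new Regex(
        @"(?<VarName>@(\w+(\.\w+)*|\$))(\{(?<Width>-?\d+)\})?" +
        @"(\{(?<Format>.+?)\}|((?<Open>\[{2}(?<CloseC>.))).+?(?<SubExpr-Open>\k<CloseC>\]{2}))?",
        RegexOptions.Singleline);

    // Returns "name|width|format" for the first variable in the template;
    // parts that are absent come back as empty strings.
    public static string Describe(string template)
    {
        Match m = Evaluator.Match(template);
        if (!m.Success)
            return null;
        return m.Groups["VarName"].Value + "|"
             + m.Groups["Width"].Value + "|"
             + m.Groups["Format"].Value;
    }
}
```

Since the width group only accepts an optionally signed integer, @Quantity{F2} is read as a format with no width, while @TotalPrice{12}{C} yields both.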

The conditional regular expression is similar and I won't discuss it in detail here.

Known Limitations and Possible Future Enhancements

  1. If the member found is a Property (as opposed to a Field) and the Property is an indexed property, only the default property will be used 
  2. As demonstrated in Example #3 part 2 (nested repeated templates) there is no way currently to trim what we would desire to be strictly intervening characters. This is why the list of "similar parts" in the output not only has commas between the part numbers, but also at the end of the list.
  3. There is currently no code to account for the case where you would want an @ symbol next to characters/numbers in the template file without performing a replacement. If you need an @ symbol, just put a space after it and deal with it.
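
Until the template syntax can express separators, one possible workaround for limitation 2 is to pre-join the values in code and expose the result as an ordinary member that the template references like any other variable. The sketch below assumes this approach; JoinSketch and JoinPartNumbers are hypothetical names, not part of the download.

```csharp
using System.Collections.Generic;

public static class JoinSketch
{
    // string.Join places the separator only between elements,
    // so no trailing comma is produced. A null list yields an empty string.
    public static string JoinPartNumbers(IEnumerable<string> partNumbers)
    {
        return partNumbers == null ? string.Empty : string.Join(",", partNumbers);
    }
}
```

A class such as PartInfo could then expose, say, a hypothetical SimilarPartNumbers property built with this helper, at the cost of moving a little formatting back into code.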

Code Notes

The attached sample project contains the source code for the FillTemplate method and demonstrates most of the features described above. The only thing that isn't fully covered is passing a SqlDataReader, although I do provide a template for use. (This is because I don't know what SQL Data sources you might have access to, and you'll need to fill in a bit of SQL code to see this feature in action. See the code for details.)

History

  • 2012 Dec 07 - Initial publication.
  • 2012 Dec 18 - Added Conditionals capability.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior) Presidio Network Solutions
United States

Comments and Discussions

Question: DotLiquid (Stephen Brannan, 13-Mar-13 10:51)
Answer: Re: DotLiquid (Michael D Bray, 13-Mar-13 10:57)
Because I could. Because it was fun. Because I like to build things for myself. Because I like to figure things out. Because I'm picky about having intimate understanding of the code I use. Because I'd never heard of DotLiquid or Razor. Because I wanted something I could write a CodeProject article about.

Take your pick. :)