Enhanced Configuration File Handling

27 Mar 2009 · CPOL · 14 min read
Configuration files can be enhanced and extended using config-variables.

Succeeding Article

Enhanced String Handling

Abstract

When designing a config file, it is at times more convenient (and less hassle) to use a variable specification than a literal value. For example:

<add key="TestFile" value="{key::BaseDir}\FileName"/>

as opposed to

<add key="TestFile" value="c:\somedirectory\FileName"/>

I will refer to the {key::BaseDir} as a config-variable. This is possible with a bit of programmatic intervention to handle config-variables. The article discusses the effort entailed in enhancing the default handling of a config file to allow config-variable specifications.

Assumption about You, the Reader

I assume that you have some rudimentary understanding of how .NET handles a config file. The code was built and tested using VS2008 / .NET 3.5, though the majority of the information holds true for previous versions of .NET and VS.

Introduction

This is the first of two installments of the article. The second installment builds on the knowledge developed here and adds logic to the configuration file. The second installment can be found here.

Consider the following sample config file, employing an example of config-variable handling key: BaseDir, within key: TestFile:

XML
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
         <appSettings>
                 <add key="TestFile" value="{key::BaseDir}\FileName"/>
                 <add key="BaseDir" value="c:\somedirectory"/>
                 <add key="another-key" value="c:\some-other-directory"/>
         </appSettings>
</configuration>

and the expected evaluation of TestFile is: c:\somedirectory\FileName. The rest of this article discusses how to handle, programmatically, such a config-variable evaluation.

There are two kinds of config-variable evaluation:

  • The first is a “one time” evaluation; once the config-variable is evaluated it needs no further attention. For example, once {Date::yyyy.mm.dd} is evaluated to 2009.02.31 (just joking), it is done.
  • The second is a recursive, iterative or multi-pass evaluation; it potentially needs multiple passes of evaluation. For example, in the configuration depicted above, key: TestFile uses a value obtained from another key: BaseDir. Now, key: BaseDir may itself use {key::another-key}; therefore, the evaluation may need multiple passes.

We need to be vigilant over this multi-pass business not to fall into the trap of having an infinite loop or infinite recursion. Consider the scenario where key1 expands key2 for its evaluation and key2 expands key1 for its evaluation.

Key Idea for Config-Variable Evaluation

Programmatic Key Idea

We need to have the configuration entry values stored in C# variables and a mechanism whereby we can cycle through all the C# variables and evaluate the config-variables. So, for example, the program needs to store the value of the config entry <add key="TestFile" value="{key::BaseDir}\FileName"/> (“{key::BaseDir}\FileName”) in a C# variable. Then, the system needs to cycle through all these C# variables to evaluate the config-variables.

I used a data dictionary as the data type to house all the C# variables corresponding to the config-entries.

Specification Key Idea

The key idea behind the specification of a config-variable is to use a notation that is not likely to be used for anything else. Let’s look at the two examples that this article will follow:

  • {Date::yyyy.mm.dd}
  • {key::BaseDir}

The notational pattern (that I chose) is {EvaluationKey::EvaluationValue}. In the examples above, EvaluationKey is Date for the former and key for the latter config-variable, while the EvaluationValue part is yyyy.mm.dd for the former and BaseDir (a configuration key-name) for the latter. The assumption here is that neither {Date::yyyy.mm.dd} nor {key::BaseDir} will be needed in a config file for anything other than a config-variable specification. This seems, to me, like a reasonable assumption, right now.

The use of a double colon (“::”) as a separator set of characters between EvaluationKey and EvaluationValue, needs another word. The choice of a single colon “feels” right, but the colon character is also a path character; as in: c:\Temp\abc.tmp. This makes the colon a less than ideal character as a separator character. Other characters I considered are the semicolon (“;”) and the vertical bar (“|”), also called vbar, pipe and more names. I decided to go with the double colon because I expect it to be a good option also when facing human errors.

If you feel that you need to use a different separator then you may easily change the notation to suit your needs. In the accompanying example the classes ProcessDate and ProcessKey are the only place where the pattern for the {EvaluationKey::EvaluationValue} is used. You will need to change the variable pattern in the first statement in the static constructor of both.

Evaluation Key Idea

The key idea behind the evaluation is a regular expression match. The lookup is captured in two helper classes: ProcessDate and ProcessKey. ProcessDate defines a regular expression match pattern as such: @"{\s*Date\s*::(?<EvalVal>.*?)}". ProcessKey defines a similar expression: @"{\s*Key\s*::(?<EvalVal>.*?)}".

I will review ProcessDate’s regular expression string pattern. If you are familiar with regular expressions and the string pattern @"{\s*Date\s*::(?<EvalVal>.*?)}" is clear to you, then skip over to the next section. If you are unfamiliar with the regular expression concept, then the following explanation will do you no good and you will be better off trusting that it works and moving on. (For the interested reader: “Regular Expressions” on MSDN, or a good article about regular expressions, is a good start.) If you are somewhat familiar with the regular expression concept and just need a little bit of hand holding, then the following paragraphs are for you.

The pattern starts matching on an open curly brace (“{”), followed by optional white space (“\s*”, 0 or more white space characters), the literal string “Date” (the EvaluationKey), more optional white space, a double colon (“::”), and then the EvaluationValue, which runs up to but does not include the first closing curly brace. The closing brace itself is part of the overall match. So everything we are interested in, the EvaluationValue, is captured by (?<EvalVal>.*?).

Every regular expression is divided into a set of groups. The entire expression is group 0 then every open parenthesis is a start of a new group which is numbered sequentially starting with 1. Instead of counting parentheses, you may name a group. The “?<EvalVal>” as the first construct after the open parenthesis character means that the group will be named “EvalVal.”

One last, crucial point concerns the trailing question mark: the group named “EvalVal” in the pattern @"{\s*Date\s*::(?<EvalVal>.*?)}" ends with a question mark after the dot asterisk notation (“.*?”). Without the question mark (“.*”), the pattern will match just about everything under the sun[1]. So, for an expression like

<add key="TestFile" value="{Date::yyyy.mm.dd}-{key::anotherKey}"/>

the pattern that does not have the trailing question mark, will match a date having its EvaluationValue: yyyy.mm.dd}-{key::anotherKey. The match ends on the last closing brace. Obviously this is not the intended result. Having the trailing question mark in the pattern, the expression will end matching on the first closing curly brace—as intended.
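The difference between the two quantifiers can be checked with a few lines of code. The following is a minimal sketch, not part of the accompanying download; the class name LazyVsGreedyDemo and its members are made up for illustration, while the two patterns are the article's pattern with and without the trailing question mark.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only
public static class LazyVsGreedyDemo
{
    // The article's pattern (lazy ".*?") and the same pattern with a greedy ".*"
    public const string LazyPattern = @"{\s*Date\s*::(?<EvalVal>.*?)}";
    public const string GreedyPattern = @"{\s*Date\s*::(?<EvalVal>.*)}";

    // Returns the text captured by the EvalVal group of the first match,
    // or null when the pattern does not match at all.
    public static string Capture(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern,
            RegexOptions.Singleline | RegexOptions.IgnoreCase);
        return m.Success ? m.Groups["EvalVal"].Value : null;
    }
}
```

Called with the value "{Date::yyyy.mm.dd}-{key::anotherKey}", Capture returns yyyy.mm.dd for LazyPattern and yyyy.mm.dd}-{key::anotherKey for GreedyPattern.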

Limitation of the “.*?” notation

This notation, @"{\s*Date\s*::(?<EvalVal>.*?)}", does not allow for nested config-variables. It will not handle the following config entry correctly:

<add key="TestFile" value="c:\{Date::{key::DateFormat}}\FileName"/>

The “.*?” starts right after the “::” and ends before the first “}”. As such, the whole pattern matches “{Date::{key::DateFormat}”, the “.*?” captures only “{key::DateFormat”, and the second closing brace is left behind.

We will keep this limitation in mind and plan our config file accordingly, at least for this installment of the article. This is not a major drawback and simplicity is a virtue.
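To see the limitation concretely, here is a small sketch that runs the Date pattern against the nested entry. The class NestedLimitDemo and its method are hypothetical names used only for this illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only
public static class NestedLimitDemo
{
    private static readonly Regex DatePattern = new Regex(
        @"{\s*Date\s*::(?<EvalVal>.*?)}",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Returns { whole match, EvalVal capture } for the first match in input.
    // Both elements are empty strings when there is no match.
    public static string[] FirstMatch(string input)
    {
        Match m = DatePattern.Match(input);
        return new string[] { m.Value, m.Groups["EvalVal"].Value };
    }
}
```

For the value c:\{Date::{key::DateFormat}}\FileName the first element is {Date::{key::DateFormat} and the second is {key::DateFormat, so the date helper would receive a truncated key specification rather than a date format.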

Code Review

Class Overview

The Config class inherits from ConfigEval which does the heavy lifting of the evaluation. ConfigEval, in turn, uses the ProcessKey and ProcessDate helper classes to accomplish its evaluation. This basically is the entire program. Short and I hope easy to understand.

Configuration File

The Config class is responsible for retrieving the config-entries values, in our case the config file is captured in app.config. The app.config file is copied to <module-name>.exe.config at the end of each successful compilation. The app.config is depicted below for your convenience:

XML
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
         <appSettings>
                 <add key="TestFile" value="{key::TestDir}FileName"/>
                 <add key="Root" value="c:\"/>
                 <add key="HomeDir" value="{key::Root}HomeDirectory\"/>
                 <add key="TestDir" value="{key::HomeDir}{Date::yyyy.mm.dd}\Test\"/>
         </appSettings>
</configuration>

Do note that we have forward referencing config-variables as well as backward referencing config-variables. The key TestFile is a forward referencing config-variable as its EvaluationValue is defined further ahead. We also have HomeDir, a backward referencing config-variable as its EvaluationValue has already been defined. They are both handled correctly by our code.

C# Variables Access

Each app.config entry has a corresponding C# property: Config.Inst.TestFile, Config.Inst.Root, Config.Inst.HomeDir and Config.Inst.TestDir. They are defined in the Config class and exercised in the Program class (see accompanying code).

Config Class

C#
public static readonly Config Inst = new Config();
private Config()
{
        ...
}

The Config class starts by making itself a singleton. .NET gives us a very convenient way to make a class a singleton: a private constructor plus a public static readonly instance.

The constructor of the Config class extracts all the entries in the configuration file into a data dictionary named: _configEntry (defined in the ConfigEval base class). From there the constructor moves on to do the magic of evaluation.

One nasty error that may crop up through no fault of the developer is a human error that leaves the config file as malformed XML. This can be nasty, as the system at times does not throw an exception but keeps on complaining that the values extracted from the config file are null. Or, worse yet, it reports a bogus error. I have deliberately left the check out of the explanation here, as it is not part of the config-variable discussion, and I left it as part of the accompanying code.

The constructor of the Config class, in part, is depicted below:

C#
try
{
        // Extract from config file
        string value = ConfigurationManager.AppSettings[_keyTestFile];

        // Add extracted value to _configEntry dictionary
        _configEntry.Add(_keyTestFile, value);
}
catch (ConfigurationException ex)
{
        ReportError(ex);
}
...
//
// This is where the magic of evaluation happens
//
EvaluateConfigNonRepeating();
EvaluateConfigRepeating();
...
private const string _keyTestFile = "TestFile";

The const string _keyTestFile at the bottom of the above code snippet is the only line outside the constructor definition.

The Config class defines a getter, a get property, for each one of the app.config entries; see the code snippet below:

C#
private const string _keyTestFile = "TestFile";
private string _internalTestFile = null;
private static readonly string _defaultTestFile = string.Empty;
public string TestFile
{
        get
        {
                 if (_internalTestFile != null)
                          return _internalTestFile;

                 string testFile = _configEntry[_keyTestFile];
                 if (testFile == null)
                 {
                     ReportError(string.Format(
                          "Missing TestFile in config file," +
                         " default value of \"{0}\" will be used",
                          _defaultTestFile));
                     _internalTestFile = _defaultTestFile;
                     return _internalTestFile;
                 }

                 _internalTestFile = testFile;
                 return _internalTestFile;
        }
}

Reading the configuration file and extracting C# variables was done in the constructor where no data-error checking is done. The data-error checking is done during the first “pass through” of the getter. So if a configuration entry is not used, then no data-error checking will be performed and the user is spared an irrelevant error message. More importantly, if the getter is used multiple times and a warning is issued, then the warning appears only once. To ensure that data-error checking is done only once, I created an _internalTestFile variable and initialized it to null (for emphasis). The property checks to see if _internalTestFile is null. If it is not null then the process ran previously and it need not be repeated.

If translation from one data type to another is needed, this is the place to do it. For example, the config file always returns a string value; if that string represents a Boolean value, either “TRUE” or “FALSE” (after data-error checking), then we can translate it to a Boolean variable that will make more sense to the calling routine. Therefore, the type of the return value from the key getter need not be a string. (If you have a Boolean return value, you may care to use bool? (Nullable<bool>) as the data type for your _internalVariable. Otherwise the check for null may prove difficult.)
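A Boolean getter along these lines might look like the sketch below. It follows the same cache-on-first-use pattern as the TestFile getter, but the class BoolSettingSketch, the key name "Verbose" and its default value are all assumptions made for this example; they do not appear in the accompanying code.

```csharp
using System.Collections.Generic;

// Hypothetical class standing in for the relevant part of Config
public class BoolSettingSketch
{
    private const string _keyVerbose = "Verbose";   // hypothetical config key
    private const bool _defaultVerbose = false;     // hypothetical default value
    private bool? _internalVerbose = null;          // null means "not evaluated yet"
    private readonly Dictionary<string, string> _configEntry;

    public BoolSettingSketch(Dictionary<string, string> configEntry)
    {
        _configEntry = configEntry;
    }

    public bool Verbose
    {
        get
        {
            // Data-error checking runs only on the first pass through the getter
            if (_internalVerbose.HasValue)
                return _internalVerbose.Value;

            string raw;
            bool parsed;
            if (_configEntry.TryGetValue(_keyVerbose, out raw)
                && bool.TryParse(raw, out parsed))
            {
                _internalVerbose = parsed;
            }
            else
            {
                // The article's Config class would call ReportError(..) here
                _internalVerbose = _defaultVerbose;
            }
            return _internalVerbose.Value;
        }
    }
}
```

bool.TryParse ignores case, so “TRUE”, “True” and “true” all translate to true; a missing, null or unparsable entry falls back to the default.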

The pattern above is repeated for every entry of the config file.

ConfigEval—the Base Class

Config is derived from ConfigEval which in turn has the following benefits:

  • A general evaluation that can be used in a different project.
  • ConfigEval provides a place holder for all the C# variables pertaining to the config file keys in a single construct, a dictionary: _configEntry.
  • ConfigEval provides two functions, EvaluateConfigNonRepeating() and EvaluateConfigRepeating(), that rely on helper classes to perform the config-variable evaluation. The former is for “one time” evaluation and the latter is for “multi pass” evaluation.

The first function, EvaluateConfigNonRepeating(), is depicted below:

C#
protected virtual void EvaluateConfigNonRepeating()
{
        // Take a snapshot of the keys so the dictionary can be modified in the loop
        List<string> keys = (from k in _configEntry select k.Key).ToList();
        foreach (string k in keys)
        {
                _configEntry[k] = ProcessDate.Evaluate(_configEntry[k]);
        }
}

I would like you to notice that I have created a keys local variable of type List<string> and looped through the keys like so:

C#
List<string> keys = (from k in _configEntry select k.Key).ToList();
foreach (string k in keys)
    _configEntry[k] = ProcessDate.Evaluate(_configEntry[k]);

as opposed to:

C#
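foreach (string k in _configEntry.Keys)
    _configEntry[k] = ProcessDate.Evaluate(_configEntry[k]);

The reason is that one may not loop over the dictionary keys while modifying any part of the dictionary within the loop; on the .NET Framework version used here, the enumerator throws an InvalidOperationException as soon as an entry is assigned.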

The second function EvaluateConfigRepeating() adds the complexity of iterating through the _configEntry until the last pass through, where nothing was changed.

Notice one would expect this function to be recursive; however, it is a tail recursion so iteration works more efficiently. Let me explain more closely:

The algorithm for EvaluateMultiPass(..) is captured in the following pseudo code:

Algorithm for EvaluateMultiPass(configKey):
IF no match on {key:: ..} pattern is found then exit function
// The EvaluateGroup function will evaluate one key pattern only
configKey = EvaluateGroup(configKey) // new configKey may contain new key pattern
EvaluateMultiPass(configKey) // Recursive call

This algorithm is clearly a tail recursion. The following code snippet depicts an iterative solution to EvaluateConfigRepeating():

C#
protected virtual void EvaluateConfigRepeating()
     {
             // Use PassThroughUpperLimit to avoid the infinite loop problem
             const int PassThroughUpperLimit = 500;
             bool evaluated = false;            // Use for termination condition
             string evalTxt;           // output of evaluation
             List<string> keys = (from k in _configEntry select k.Key).ToList();
             for (int i = 0; i < PassThroughUpperLimit; ++i)
             {
                      evaluated = false;
                      foreach (string k in keys)
                      {
                               bool rc = _processKey.Evaluate(_configEntry[k],
                                     out evalTxt);
                               if (rc)
                               {
                                        _configEntry[k] = evalTxt;
                                        evaluated = true;
                               }
                      }

                      // Did we go through one complete pass with no evaluations?
                      if (!evaluated) break;
             }

             // Did we exceed PassThroughUpperLimit and still evaluating?
             if (evaluated)
                      ReportError(string.Format(
                                 "Potential circular translation as " +
                               "we exceeded {0} iterations",
                                 PassThroughUpperLimit));
     }
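To experiment with the termination behavior outside the Config class, the loop can be condensed into a standalone sketch. The class MultiPassSketch and its methods are hypothetical; the sketch folds the ProcessKey lookup into the loop and takes the pass limit as a parameter, so it is a simplified stand-in rather than the accompanying code.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical standalone condensation of the multi-pass loop
public static class MultiPassSketch
{
    private static readonly Regex KeyPattern = new Regex(
        @"{\s*Key\s*::(?<EvalVal>.*?)}",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Expands {key::Name} references in place. Returns true when a complete
    // pass made no changes before maxPasses ran out, false otherwise.
    public static bool Resolve(Dictionary<string, string> entries, int maxPasses)
    {
        // Snapshot of the keys, so entries can be assigned inside the loop
        List<string> keys = new List<string>(entries.Keys);
        for (int pass = 0; pass < maxPasses; ++pass)
        {
            bool changed = false;
            foreach (string k in keys)
            {
                string before = entries[k];
                if (string.IsNullOrEmpty(before)) continue;

                string after = KeyPattern.Replace(before, delegate(Match m)
                {
                    string name = m.Groups["EvalVal"].Value;
                    string found;
                    if (entries.TryGetValue(name, out found))
                        return found ?? string.Empty;
                    return name;    // unknown key: keep the name, as KeyReplace does
                });

                if (after != before)
                {
                    entries[k] = after;
                    changed = true;
                }
            }
            if (!changed) return true;
        }
        return false;
    }

    // True when value still contains a {key::..} specification
    public static bool HasUnresolved(string value)
    {
        return value != null && KeyPattern.IsMatch(value);
    }
}
```

With key1 = "a{key::key2}" and key2 = "b{key::key1}", every pass changes both values, so Resolve returns false once the limit is reached. Note that in this sketch a pure cycle with no surrounding text (key1 = "{key::key2}", key2 = "{key::key1}") settles after one pass into a self-reference and Resolve returns true; a check such as HasUnresolved on the final values catches that case. The values of a growing cycle also get longer on every pass, so a modest limit is advisable when experimenting.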

ProcessDate Helper Class

The evaluation relies on helper classes, each of which is responsible for one specific kind of evaluation. The first is ProcessDate, a relatively simple class depicted below in its entirety.

C#
public static class ProcessDate
{
        static ProcessDate()
        {
                 string pattern = @"{\s*Date\s*::(?<EvalVal>.*?)}";
                 RegexOptions reo = RegexOptions.Singleline |
                     RegexOptions.IgnoreCase;
                 _reDate = new Regex(pattern, reo);
        }

        public static string Evaluate(string text)
        {
                 if (text == null) return null;
                 if (text == string.Empty) return string.Empty;
                 string retText = _reDate.Replace(text, DateReplace);
                 return retText;
        }

        private static string DateReplace(Match m)
        {
                 DateTime today = DateTime.Today;
                 string txt = m.Groups["EvalVal"].Value;
                 txt = txt.Replace("yyyy", today.ToString("yyyy"));
                 txt = txt.Replace("yy", today.ToString("yy"));
                 txt = txt.Replace("mm", today.ToString("MM"));
                 txt = txt.Replace("m", today.ToString("%M"));
                 txt = txt.Replace("dd", today.ToString("dd"));
                 txt = txt.Replace("d", today.ToString("%d"));
                 return txt;
        }

        private static Regex _reDate;
}

The Evaluate() function is the main attraction; it is the entry point to this class (a surface point). It employs the regular expression Replace() function to do the heavy lifting. Replace(..), in turn, uses a callback function, DateReplace(..), to evaluate the specifics of each match of @"{\s*Date\s*::(?<EvalVal>.*?)}".

Notice that in the DateReplace(..) function, when it comes to a single character specification of ToString() like ToString("M") and ToString("d") the code uses ToString("%M") and ToString("%d"). This is not an oversight, it is correct behavior. ToString("M") will produce “February 21” and ToString("d") will produce “2/21/2009” both of which are not the desired behavior.
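The difference between the single-letter standard formats and the “%”-prefixed custom formats can be verified with a short sketch. The class SingleLetterFormatDemo is a made-up name; the sketch passes CultureInfo.InvariantCulture, which the article's code does not, so that the output does not depend on the machine's regional settings.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, for illustration only
public static class SingleLetterFormatDemo
{
    // Formats the date with the invariant culture so results are reproducible
    public static string Format(DateTime date, string format)
    {
        return date.ToString(format, CultureInfo.InvariantCulture);
    }
}
```

For new DateTime(2009, 2, 21), the formats "M" and "d" are read as standard format strings and yield “February 21” and “02/21/2009” under the invariant culture, while "%M" and "%d" are read as custom format strings and yield “2” and “21”.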

The DateReplace(..) function is called by the regular expression evaluator, once per match, as it processes the line _reDate.Replace(text, DateReplace); in the Evaluate(..) function. Its input, Match m, carries the “EvalVal” group, the part of the string that needs evaluating. The entire matched text, braces included, is available from m.ToString().

ProcessKey Helper Class

C#
public class ProcessKey
{
        static ProcessKey()
        {
                 string pattern = @"{\s*Key\s*::(?<EvalVal>.*?)}";
                 RegexOptions reo = RegexOptions.Singleline |
                     RegexOptions.IgnoreCase;
                 _reKey = new Regex(pattern, reo);
        }

        public ProcessKey(Dictionary<string, string> evalDic)
        {
                 _evaluationDictionary = evalDic;
        }

        public bool Evaluate(string text, out string evaluatedValue)
        {
                 evaluatedValue = text;
                 if (text == null) return false;
                 if (text == string.Empty) return false;
                 evaluatedValue = _reKey.Replace(text, KeyReplace);
                 bool valueChanged = (text != evaluatedValue);
                 return valueChanged;
        }

        private string KeyReplace(Match m)
        {
                 string txt = m.Groups["EvalVal"].Value;
                 // Substitute a different key if one exists
                 if (_evaluationDictionary.ContainsKey(txt))
                          return _evaluationDictionary[txt];
                 return txt;
        }

        private Dictionary<string, string> _evaluationDictionary;
        private static Regex _reKey;
}

This class is just as simple as ProcessDate. Unlike ProcessDate it is not a static class as it needs access to the _configEntry dictionary entries. It could have been done using a static class but I felt better leaving it as a non-static class.

KeyReplace(..) needs to return the value that a {key::value} specification refers to, obtained from a key lookup in the same config file. This is where the _configEntry dictionary comes into play. In this class the variable _evaluationDictionary points to the _configEntry of the ConfigEval class.

Related Resources

At times you will need to access a section of the config file other than appSettings. If you are working with VS2008, then there is a great designer published on Codeplex: Configuration Section Designer (CSD). I have employed it successfully and I can attest to its benefit and enormous productivity gain. If, however, you are working with a version earlier than VS2008, then you may care to read Jon Rista’s superb articles on Code Project.

When accessing sections other than the appSettings section, you will need to change the access method to the information. So far the Config class was tailor made for the appSettings section, so it uses:

  • ConfigurationManager.AppSettings[..];

to access the information (in the constructor). This will now have to change based on the section needed and the access provided by either CSD from Codeplex or by your own code, whether written from scratch or following Jon Rista’s teaching.

Other Scenarios

At times you will need to access a configuration file that is not the program’s default configuration file. Such a config file would be handled using a configuration file map like so:

C#
ExeConfigurationFileMap myMap = new ExeConfigurationFileMap();
myMap.ExeConfigFilename = @"C:\path\bin\Application.exe.config";
Configuration cs = ConfigurationManager.OpenMappedExeConfiguration(
                 myMap, ConfigurationUserLevel.None);
string testFile = cs.AppSettings.Settings["TestFile"].Value;

The Config class will need to change the way it handles the situation. Instead of using ConfigurationManager.AppSettings[..] you will be using cs.AppSettings.Settings[..].Value.

The Big-Win

Lastly, I would like to touch upon another scenario, one that I believe is a big win for the config-variable capabilities. Consider the following: you may have to extract an IP address from a flat file dictionary having a list of (server-name, IP-address) pairs. The config-variable that you may have is:

{ServerIP::full-path or unc-path of flat file::server-name}

You will still need to write a class to do the work of extracting the IP-address given a server name, but now your config file is more expressive. This is a big win for the configuration-variable as now you have the capability to enter into the config file entries that you obtain from a source that may have proprietary format. Now, we have expanded the reach of the configuration file.
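Such a helper class could follow the same shape as ProcessDate and ProcessKey. The sketch below is one possible outline under several assumptions: the class name ProcessServerIP, the group names File and Server, and the lookup delegate are all invented for this example, and the actual reading of the flat file is left to the caller-supplied delegate because its format is not specified here.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, modeled on ProcessKey
public class ProcessServerIP
{
    // {ServerIP::<file path>::<server name>}; a single colon inside the path
    // (as in c:\...) does not end the File group, only a double colon does
    private static readonly Regex _reServerIP = new Regex(
        @"{\s*ServerIP\s*::(?<File>.*?)::(?<Server>.*?)}",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Assumed contract: (filePath, serverName) -> IP address, or null if unknown
    private readonly Func<string, string, string> _lookup;

    public ProcessServerIP(Func<string, string, string> lookup)
    {
        _lookup = lookup;
    }

    public string Evaluate(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;
        return _reServerIP.Replace(text, delegate(Match m)
        {
            string file = m.Groups["File"].Value.Trim();
            string server = m.Groups["Server"].Value.Trim();
            string ip = _lookup(file, server);
            return ip ?? m.Value;   // unknown server: leave the specification as-is
        });
    }
}
```

Passing the lookup in as a delegate keeps the pattern handling separate from whatever proprietary file format the (server-name, IP-address) pairs are stored in.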

Epilog

I have tried to capture the most pertinent parts of configuration-variable handling. I hand waved around some other aspects of configuration-variables and I ended with the “Big Win” the new reach capability of a configuration file.

Enjoy.

--Avi Farah

[1] The dot notation “.” in a regular expression matches any character except the new line (“\n”) character, or any character at all if the Singleline option is used. We are using the Singleline option!

The asterisk “*” notation matches the preceding element zero or more times. The asterisk “*” is a greedy quantifier that matches as much as it can. Its non-greedy equivalent, “*?”, also called a lazy quantifier, matches as little as it can.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
United States
avifarah@gmail.com

Comments and Discussions

Re: Good idea (Avi Farah, 28-Feb-09 15:11)
Graham,
In a day or two a new face to this article will be posted and you will not have to read through the .docx file to understand the code.

It seems to me that you may get away with the contents of the article for your multi machine/config problem by keeping a (key,value) pair somewhere in a different location and read the location from within the config file. You will need to write the class that reads the different location and I believe the explanation for writing the code is clear.

There is another way. It is to add logic to the config file. This, btw, will be the subject of my next article that I am working on. You need not wait until I am done with my article but rather do it yourself, you will need a class that reads the logic and interprets it so for example if your config entry reads:
<key value="{if({MachineID::FullPathOfFile::ID}={Machine1}),{connectionString1},{if({MachineID::FullPathOfFile::ID}={Machine2}),{connectionString2},{if({MachineID::FullPathOfFile::ID}={Machine3}),{connectionString3},{Error}}}}">

Where the basic if statement looks like:
{if(condition),trueCondition,falseCondition}
Any one of these triads may contain a {key::value} config-variable.

So, to explain the compound if statement:
The if condition is: ({MachineID::FullPathOfFile::ID}={Machine1}), where you will need to write a class to identify the config-var having the key "MachineID". The config-var also carries a file name, for which you supply a full path, and an ID that identifies your machine. So {MachineID::FullPathOfFile::ID} will yield an ID for the machine, a string. If this string equals the literal "Machine1" (given within curly braces since we cannot use the double quote notation), then condition 1 is met, and so on.

Now your job is to write a class handling the condition.

Graham, if I was not clear please let me know.

Cheers,
Avi