C#-like Expression Evaluator and Type Converter

22 Apr 2010, CPOL, 10 min read
Convert types, parse and evaluate expressions in runtime, in .NET 2.0

Introduction

Converting user input to strongly typed values is part of almost any application. Even servers, which do not have any user interface, often have to parse text/XML configuration files and interpret text values written by humans into something more usable.

For example, imagine a simplistic application that downloads files from the Internet and saves them to a local disk. The file attributes of the saved files are read from configuration files or the command line, and then converted to the FileAttributes enumeration. For "read only + normal" attributes, an administrator may have typed "0x081", or "129", or "ReadOnly|Normal", or "normal Readonly". All of these are unambiguous and make sense, but parsing them is surprisingly difficult and verbose. Something simpler is needed, ideally as simple as:

C#
FileInfo f=new FileInfo("x.txt");
f.Attributes=Utils.To<FileAttributes>("ReadOnly | 0x80"); 
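
For comparison, handling those spellings with nothing but the base class library might look like the sketch below. ParseAttributes is a hypothetical helper written for this illustration (it is not part of the Eval library), and it assumes that tokens are separated by |, +, comma or spaces.

```csharp
using System;
using System.Globalization;
using System.IO;

static class AttributeParsing
{
    // Hypothetical helper: accepts hex ("0x081"), decimal ("129") and
    // case-insensitive names ("normal Readonly"), combined with | , + or spaces.
    public static FileAttributes ParseAttributes(string text)
    {
        FileAttributes result = 0;
        char[] separators = { '|', ',', '+', ' ' };
        foreach (string token in text.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            int number;
            if (token.StartsWith("0x", StringComparison.OrdinalIgnoreCase) &&
                int.TryParse(token.Substring(2), NumberStyles.HexNumber,
                             CultureInfo.InvariantCulture, out number))
            {
                result |= (FileAttributes)number;        // hexadecimal literal
            }
            else if (int.TryParse(token, NumberStyles.Integer,
                                  CultureInfo.InvariantCulture, out number))
            {
                result |= (FileAttributes)number;        // decimal literal
            }
            else
            {
                // Enumeration member name; throws ArgumentException if unknown
                result |= (FileAttributes)Enum.Parse(typeof(FileAttributes), token, true);
            }
        }
        return result;
    }
}
```

Even this small version has to deal with three input formats by hand, which is the verbosity a single conversion call is meant to hide.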

Another related and common task is parsing and evaluating expressions. These come in handy in many scenarios, from COM interop to passing a sorting expression from client to server, from interpreting complex configuration files to hiding System.Reflection verbosity.

There are many partial solutions to expression evaluation built into the .NET Framework, from CSharpCodeProvider to LINQ, from calculated columns in the ADO.NET DataTable to reflection and Reflection.Emit. Of course, it's also possible to add IronPython, etc. While they solve the problem, these solutions add hundreds of milliseconds to execution time (like calling the C# compiler), or are not easy to extend or modify, or are complex, or are not open source, or eat lots of memory, or use syntax not intuitive to C# programmers, or add unwanted dependencies. Things get a bit easier with .NET 4.0 and dynamics, but the XSharper framework/scripting language needed a solution that works with only .NET 2.0.

Writing code for evaluating expressions and even programming languages is certainly a lot of fun, thousands of books and examples exist, and there are many good CodeProject articles devoted to the topic. Better yet, there are mature projects like ANTLR or Coco/R that can be used to generate lexers and parsers for any language.

Yet I wanted something simpler, smaller and focused more on the evaluation part: a complete expression evaluation engine capable of parsing and evaluating "normal-looking" C# expressions (with a more relaxed syntax) that works with normal .NET libraries. Something much simpler than a full C# interpreter, and definitely not a complete programming language. Something compact and easy to embed into any project, as an assembly or directly as a bunch of source code files, with no dependencies at all except the C# compiler.

So the idea of doing the parser "properly", by compiling a grammar into thousands of lines of generated C# code, was abandoned. And it's more fun to write from scratch anyway :). The core is based on the shunting-yard algorithm, with some C#-specific additions such as typecasts, array initializers and short-circuiting. A rather messy part was evaluation, which was particularly hard because I wanted to separate parsing and type binding completely. This adds flexibility, but the parser cannot determine whether x.y.z is a static property z of type x.y, or a property z of field y of object x, or some other combination of the above, so this decision is postponed until runtime.
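
For readers unfamiliar with the shunting-yard algorithm, here is a minimal sketch of the idea. It is not the library's parser: it assumes only non-negative numbers, the four binary operators + - * / and parentheses, it does not handle unary minus or validate malformed input, and it evaluates values directly instead of building an expression tree. The class name ShuntingYardSketch is made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class ShuntingYardSketch
{
    static int Precedence(char op) { return (op == '*' || op == '/') ? 2 : 1; }

    // Pops two operands, applies the operator and pushes the result back
    static void Apply(Stack<double> values, char op)
    {
        double right = values.Pop();
        double left = values.Pop();
        switch (op)
        {
            case '+': values.Push(left + right); break;
            case '-': values.Push(left - right); break;
            case '*': values.Push(left * right); break;
            case '/': values.Push(left / right); break;
        }
    }

    public static double Evaluate(string text)
    {
        Stack<double> values = new Stack<double>();
        Stack<char> operators = new Stack<char>();
        int i = 0;
        while (i < text.Length)
        {
            char c = text[i];
            if (char.IsWhiteSpace(c)) { i++; }
            else if (char.IsDigit(c) || c == '.')
            {
                // Read a whole number and push it on the value stack
                int start = i;
                while (i < text.Length && (char.IsDigit(text[i]) || text[i] == '.')) i++;
                values.Push(double.Parse(text.Substring(start, i - start), CultureInfo.InvariantCulture));
            }
            else if (c == '(') { operators.Push(c); i++; }
            else if (c == ')')
            {
                // Unwind operators back to the matching '('
                while (operators.Peek() != '(') Apply(values, operators.Pop());
                operators.Pop();
                i++;
            }
            else if ("+-*/".IndexOf(c) >= 0)
            {
                // Operators of equal or higher precedence are applied first (left-associative)
                while (operators.Count > 0 && operators.Peek() != '(' &&
                       Precedence(operators.Peek()) >= Precedence(c))
                    Apply(values, operators.Pop());
                operators.Push(c);
                i++;
            }
            else throw new FormatException("Unexpected character '" + c + "' at position " + i);
        }
        while (operators.Count > 0) Apply(values, operators.Pop());
        return values.Pop();
    }
}
```

For instance, ShuntingYardSketch.Evaluate("1 + 2 * (3 - 1)") returns 5. A parser that builds a tree would push nodes onto the value stack instead of numbers, which is what makes it possible to postpone binding decisions until evaluation.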

In the end, there is an expression evaluation engine that gets relaxed type conversion and Eval done, and is simple to use:

C#
var context=new BasicEvaluationContext();
context.Objects["a"]=20;
context.Objects["b"]="Hello";
Console.WriteLine(context.Eval("a+b.Length"));

It can be used for COM interop too, hiding reflection verbosity and without requiring .NET 4.0 or typelib import. For example, the following prints information about the current SSL certificate of the local Internet Information Server:

C#
var context=new BasicEvaluationContext();
context.SetVariable("iis", Utils.CreateComObject("IIS.CertObj"));
var cert=context.Eval<string>(@"
    $iis.set_ServerName('localhost'); // set ServerName property to local IIS server
    $iis.set_InstanceName('w3svc/1'); // set InstanceName property to the first website 
    Encoding.Unicode.GetString($iis.GetCertInfo());
");
Console.WriteLine("Current certificate is:\n{0}",cert);

Expression Syntax

Writing a complete standard-compliant C# interpreter or compiler was never the plan (why bother if a real compiler is just a csc.exe call away?), so some features were cut and simplifications were made to keep complexity under control.

Unsupported Features

  • Assignment, lvalues, postfix increment, etc. Writing a=b+c is not possible. Neither is b++ nor a[n=b++]=-8. Properties can still be set using x.set_PropertyName(value) syntax.
  • Overloaded operators. Sorry, DateTime and TimeSpan cannot be added together as date+span. date.Add(span) syntax has to be used instead.
  • C# 3 initializers, like new X{ a=3 }
  • Anonymous classes, LINQ, delegates, events, etc.
  • Escaped characters. "Hello\n\r" is a string consisting of 9 characters.
  • Generics (it is possible to call methods of generic objects, but not to create them).
  • Multidimensional arrays, like new int[4,2]. Jagged arrays, like a[b][c], are fine.

Supported Features

  • All the usual +,-,*,/, ||, &&, as, is, typecasts, etc.
  • Conditional operator, X?Y():Z() which properly short-circuits.
  • Null-coalescing operator a??b.
  • Logical operators || and &&, which short-circuit as well.
  • Comments (// and /* */)
  • Arrays, with initializers. For example int[] { 1,2,3}
  • new and throw operators, like new FileInfo("C:\File.exe")
  • Namespaces

There are also additional features that will be explained in more detail below.

Using the Code

The most important classes in the Eval library are:

  • ParsingReader, derived from TextReader, with some useful methods for parsing text input: reading quoted string and numbers from the stream (in all variety of 0x3233l, 0.21e31 and 211.2m syntaxes), skipping white space, etc. It also maintains history, so if ParsingException is thrown, it can show where exactly the problem occurred.
  • Utils class contains a bunch of static utility methods, with To<T>(object) being particularly useful for easy conversions between types.
  • Parser class converts input text to an expression tree.
  • IOperation interface represents a single node in the expression tree.
  • IEvaluationContext interface is a context used for expression tree evaluation.
  • BasicEvaluationContext is a simple implementation of the IEvaluationContext interface.

Generally, the code is supposed to be used as below: a parser is created, the expression is parsed, and the resulting tree is evaluated on a stack:

C#
TextReader input = ...; // This is the source of the expression

using (ParsingReader reader=new ParsingReader(input))
{
    // Parse text to expressionTree
    Parser parser=new Parser();
    IOperation expressionTree=parser.Parse(reader);

    // Evaluate the expression tree in certain context on stack
    IEvaluationContext context=new SomeClassImplementingIEvaluationContext();
    Stack<object> stack=new Stack<object>();
    expressionTree.Eval(context,stack);

    // On top of the stack there is a return value
    object resultValue=stack.Pop();
    Console.WriteLine("Result: "+resultValue);
}

If just something simple is needed, BasicEvaluationContext can make it as brief as:

C#
new BasicEvaluationContext().Eval("Console.WriteLine('Hello, world')");

IEvaluationContext

IEvaluationContext provides values for the objects referenced in expressions and gives access to the type system and external methods. This interface has six members (five methods and one property):

C#
/// Evaluation context, providing interface to the external resources
public interface IEvaluationContext
{
    /// Get external variable (variable specified as $xxx. For example, $x+$y )
    bool    TryGetValue(string name, out object value);

    /// Find external type. Returns null if type not found
    Type    FindType(string name);

    /// Call external method
    object  CallExternal(string name, object[] parameters);

    /// Try to get external object. This is different from variable 
    /// and not prepended by $. For example, c.WriteLine('hello')
    bool    TryGetExternal(string name, out object value);

    /// Get the list of no-name objects or types whose methods are tried for calls that start with .
    IEnumerable<TypeObjectPair> GetNonameObjects();

    /// Returns true if private members may be accessed
    bool    AccessPrivate { get; }
}  

A simple, ready-to-use implementation of IEvaluationContext is included as BasicEvaluationContext; see its source code for more details.

First of all, the interpretation engine has two different concepts, external objects and variables:

  • External objects are accessed through normal C#-like identifiers that cannot contain spaces and special characters. For example, evaluation of a+b will call TryGetExternal twice with "a" and "b". BasicEvaluationContext adds null, true and false to its list of objects by default.
  • Variables are similar to objects but referenced in expressions using Perl/PHP-like syntax, with $ prefix and optional {} ( useful for spaces and non-ASCII characters in the name ). In case of $price * ${number of items} the engine will call TryGetValue with "price" and "number of items". Variable name can also be an empty string, for example $.ToString().Length+${}.ToString().Length evaluation would call TryGetValue("") twice.

Resolution of type names to .NET types is up to the implementation of the FindType method. This can be used as a basic security feature: the list of types available to the expression may be restricted without using Code Access Security.
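
As an illustration of that restriction, the lookup behind FindType could be backed by an explicit whitelist. The sketch below is hypothetical and deliberately stand-alone: WhitelistTypeResolver and its Allow method are names invented here, the class does not derive from any library type, and only the shape of FindType (return null when the type is not found) is taken from the interface above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical resolver, not part of the Eval library
class WhitelistTypeResolver
{
    private readonly Dictionary<string, Type> _allowed =
        new Dictionary<string, Type>(StringComparer.OrdinalIgnoreCase);

    // Registers a type under both its short and its full name
    public void Allow(Type type)
    {
        _allowed[type.Name] = type;       // e.g. "Math"
        _allowed[type.FullName] = type;   // e.g. "System.Math"
    }

    // Same shape as IEvaluationContext.FindType: null means "type not found"
    public Type FindType(string name)
    {
        Type type;
        return _allowed.TryGetValue(name, out type) ? type : null;
    }
}
```

A context implementation could delegate its FindType to such a resolver, so that an expression mentioning, say, System.IO.File simply fails to resolve unless that type was explicitly allowed.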

Obviously, an expression can call methods and access properties of any object using the usual obj.method() syntax. When the object is not specified, as in cos(x)/sin(x), the CallExternal interface method is responsible for finding and calling an appropriate method, or throwing an exception if that is not possible:

C#
class EvalContext : BasicEvaluationContext
{
    public override object CallExternal(string name, object[] parameters)
    {
        switch (name.ToUpperInvariant())
        {
            case "SIN":
                if (parameters.Length==1)
                    return Math.Sin(Utils.To<double>(parameters[0]));
                break;
            case "COS":
                if (parameters.Length==1)
                    return Math.Cos(Utils.To<double>(parameters[0]));
                break;                  
        }
        return base.CallExternal(name,parameters);
    }
}
void Main()
{
    var context=new EvalContext();
    context.Objects["pi"]=Math.PI;
    Console.WriteLine(context.Eval("cos(pi)/sin(pi/4)"));   
}
// -1.4142135623731    

An expression may also automatically call methods of certain objects or classes via a special dot-syntax. For example, suppose trigonometry functions are defined in a class MyTrig. If the implementation of GetNonameObjects returns an instance of MyTrig, the expression may be written as ".Cos(x)/.Sin(x)", and it will call the appropriate methods of that MyTrig object. This syntax difference may be used to avoid name collisions between built-in and user-defined functions.

C#
class MyTrig 
{
    private bool _useDegrees=false;
    public MyTrig(bool useDegrees)  { _useDegrees=useDegrees; }
    public double Sin(double x)     { return Math.Sin(_useDegrees?(x/180)*Pi:x); }
    public double Cos(double x)     { return Math.Cos(_useDegrees?(x/180)*Pi:x); }
    public double Pi                { get { return Math.PI; }}
}
void Main()
{
    var context=new BasicEvaluationContext();
    context.AddNonameObject(new MyTrig(true));
    Console.WriteLine("Cos of 45 degrees is "+context.Eval(".cos(45)"));   
}
// Cos of 45 degrees is 0.707106781186548   

Finally, if AccessPrivate returns true, expressions will be allowed to access non-public methods and properties of the objects.

A Few Additional Notes

Case Sensitivity

Variable and object names may be case sensitive or not, depending on the IEvaluationContext implementation. Method and property names, unlike C#, are case insensitive.
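
The library's own member lookup is not shown in this article, but case-insensitive access of this kind maps naturally onto reflection binding flags. The sketch below is an assumption about how such a lookup could be done, not the actual implementation; MemberLookupSketch and GetPropertyValue are hypothetical names, and the accessPrivate parameter mirrors the AccessPrivate property of the context.

```csharp
using System;
using System.Reflection;

static class MemberLookupSketch
{
    // Reads an instance property by name, ignoring case.
    // Non-public properties are considered only when accessPrivate is true.
    public static object GetPropertyValue(object target, string name, bool accessPrivate)
    {
        BindingFlags flags = BindingFlags.Instance | BindingFlags.Public | BindingFlags.IgnoreCase;
        if (accessPrivate)
            flags |= BindingFlags.NonPublic;

        PropertyInfo property = target.GetType().GetProperty(name, flags);
        if (property == null)
            throw new MissingMemberException(target.GetType().FullName, name);
        return property.GetValue(target, null);
    }
}
```

With this helper, GetPropertyValue("Hello", "LENGTH", false) returns 5, just as "Hello".Length would.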

Type Conversions

Type casts are much more relaxed than in C#. For example, (string)(bool)0x21 is valid (evaluates to string true), and so is (FileAttributes)'0x32|normal'. Even (char)"a" works, and returns the first character of the string.

Comma and Semicolon Operators

Comma and semicolon have the same meaning as the comma operator in C, returning the last value in the list. So a();b();c() calls 3 functions and returns the result of c().

Multi Expressions

In addition to single expressions, there is also the concept of a multi-expression, with syntax like ${a|b|c|=5}. It returns the value of variable a if it is defined. If it is not, it tries b, then c, and finally falls back to the integer 5. This is a convenient way of providing default values for variables.
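
The fallback order can be pictured with plain dictionary lookups. The sketch below only models the semantics described above and is not how the library implements them; FirstDefined is a hypothetical helper, and treating a variable as "defined" whenever the key exists (even with a null value) is an assumption.

```csharp
using System;
using System.Collections.Generic;

static class MultiExpressionSketch
{
    // Returns the value of the first variable that is defined,
    // or defaultValue if none of the names is defined.
    public static object FirstDefined(IDictionary<string, object> variables,
                                      object defaultValue, params string[] names)
    {
        foreach (string name in names)
        {
            object value;
            if (variables.TryGetValue(name, out value))
                return value;
        }
        return defaultValue;
    }
}
```

With only b defined, FirstDefined(variables, 5, "a", "b", "c") returns the value of b; with none of them defined it returns 5, matching the behaviour of ${a|b|c|=5}.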

Character Types and Strings

Strings may be quoted using " (double quote), ' (single quote) and ` (backquote) interchangeably, and the resulting type is always string, never char. If a char is needed, an explicit conversion must be made, as in (char)`a` or new string((char)8,21).

Also, there are no escape sequences in strings. To create a string with the \x08 character, for example, concatenation should be used instead: "AAA"+(char)0x8+(char)0x8+(char)0x8+'BBB'.

Arrays

An array can be created without new, just by using a {} block. For example, {1,2,3} evaluates to an array of 3 integers, {1,2.4,3} to an array of 3 doubles, and {1,'2.4',3} to an array of 3 objects.

Alternative Syntaxes for Operators

<, > , & characters are inconvenient and difficult to read when expression is embedded into XML. The following may be used instead:

Alternative   Operator
#OR#          ||
#AND#         &&
#BOR#         |
#BAND#        &
#BXOR#        ^
#EQ#          ==
#NEQ#         !=
#LT#          <
#GT#          >
#LE#          <=
#GE#          >=
#NOT#         !
#NEG#         ~

For example, a && b and a #AND# b are equivalent.

Dates

Dates can be created by wrapping them into #. For example, #2009-1-2 12:08GMT#.

Assignment

While assignment using '=' to properties and variables is not implemented, it still can be done indirectly. For example:

C#
var context=new BasicEvaluationContext();
context.SetObject("this",context);
context.Eval(@"
    this.SetVariable('x',20);
    this.SetVariable('y',150.2m);
    Console.WriteLine('x+y={0}, and type of result is {1}',$x+$y,($x+$y).GetType());
");
// x+y=170.2, and type of result is System.Decimal

Debug Dumps

There is a unary operator ##, which converts an object to human-readable output (a shortcut to XSharper.Core.Dump.ToDump(value)). For example:

C#
Console.WriteLine(new BasicEvaluationContext().Eval<string>
	("##Environment.GetLogicalDrives()"));
// (string[])  array[4] { /* #1, 01e6fa8e */
//  [0] = (string) "A:\"
//  [1] = (string) "C:\"
//  [2] = (string) "D:\"
//  [3] = (string) "E:\"
// }

Please see a dedicated article for more details.

Performance

Performance was not a priority so far. However, it was still interesting to get at least a ballpark figure of how slow it is compared to advanced interpreters, like modern Internet browsers and PHP. For testing purposes, a rather complex expression is parsed and evaluated 200,000 times (the corresponding test scripts are in the EvalBenchmarks directory in the sample file):

C#
var context=new BasicEvaluationContext();
context.Variables["v_t"]="T";
IOperation tree=null;
var timer=System.Diagnostics.Stopwatch.StartNew();
int loops=200000;
for (int i=0;i<loops;++i)
{
    tree=new Parser().Parse(new ParsingReader(@"
                    ( ( $v_t == 'B' ) ? 'bus'.Length : 
                      ( $v_t == 'A' ) ? 'airplane'.Length+10 : 
                      ( $v_t == 'T' ) ? 'train'.Length+100 : 
                      ( $v_t == 'C' ) ? 'car'.Length +1000: 
                      ( $v_t == 'H' ) ? 'horse'.Length +10000: 
                      'feet'.Length+100000 );"));   
}
Console.WriteLine("Parsing took {0}, or {1} parsings per second",
	timer.Elapsed, (long)(loops/timer.Elapsed.TotalSeconds) );

timer=System.Diagnostics.Stopwatch.StartNew();
string res=null;
var stack=new Stack<object>(); 
for (int i=0;i<loops;++i) 
{ 
    tree.Eval(context, stack); 
    res=Utils.To<string>(stack.Pop());
}   
Console.WriteLine("Evaluation took {0}, or {1} evaluations per second",
	timer.Elapsed, (long)(loops/timer.Elapsed.TotalSeconds) );
Console.WriteLine("Result="+res);   

// Results on a single core of Intel Q6600 in 64bit mode:
// Parsing took 00:00:18.0900243, or 11055 parsings per second
// Evaluation took 00:00:01.6310857, or 122617 evaluations per second
// Result=105     

Testing showed that the parser, when compiled to x64, is 2-5 times slower than Chrome/Firefox/IE8 (IE8 is the fastest, 2500ms vs 10000+ms in Chrome4), and execution is 5-10 times slower (Chrome is the fastest, 170ms). Compared to PHP 5.2.3, the engine is about 3 times slower parsing, and execution about 6 times slower.

Interestingly, compiling to x86 instead of AnyCPU/x64 speeds up parsing by a whopping 250%, and parsing becomes even faster than in Chrome4 and pretty much on par with PHP (with the test expression above). Execution is also about 20% faster. Apparently, optimizations in the x64 version of the .NET 3.5 JIT are not as good as in the x86 version.

In conclusion, considering that highly optimized engines written in native code were compared to a non-optimized library written in .NET + reflection, the library seems to be doing reasonably well. Its performance (in x86 mode particularly), with tens/hundreds of thousands of expressions parsed/evaluated on a single core of Intel Core CPU per second, should be adequate for many purposes.

To put things into perspective, strongly typed compiled .NET code evaluating the same conditional expression is 300-500 times faster. So when execution speed is critical and expressions are evaluated millions of times, it would be much better to avoid interpretation altogether: compile frequently used expressions into a .NET assembly using the CSharpCodeProvider class (even if it launches the C# compiler behind the scenes for half a second), or convert the generated expression tree into a .NET 3.5 expression tree and compile it, and so on.
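
To give an idea of the expression tree route, the sketch below hand-builds the benchmark's conditional chain with System.Linq.Expressions (which requires .NET 3.5 or later) and compiles it into a delegate. It does not convert an IOperation tree, which would need a visitor over the library's node types; the class name CompiledConditionalSketch is made up, and the string lengths are computed once while the tree is built rather than on every call.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

static class CompiledConditionalSketch
{
    public static Func<string, int> Build()
    {
        ParameterExpression v = Expression.Parameter(typeof(string), "v_t");
        // static bool string.Equals(string, string)
        MethodInfo stringEquals = typeof(string).GetMethod(
            "Equals", new Type[] { typeof(string), typeof(string) });

        string[] keys = { "B", "A", "T", "C", "H" };
        int[] values =
        {
            "bus".Length, "airplane".Length + 10, "train".Length + 100,
            "car".Length + 1000, "horse".Length + 10000
        };

        // Start from the final "else" branch and wrap it in conditions, last case first
        Expression body = Expression.Constant("feet".Length + 100000);
        for (int i = keys.Length - 1; i >= 0; i--)
        {
            body = Expression.Condition(
                Expression.Call(stringEquals, v, Expression.Constant(keys[i])),
                Expression.Constant(values[i]),
                body);
        }
        return Expression.Lambda<Func<string, int>>(body, v).Compile();
    }
}
```

The delegate is built once and then reused: Build()("T") returns 105, the same result as the interpreted benchmark expression. Compilation has a one-time cost, so this only pays off for expressions that are evaluated many times.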

History

The latest version of the library (either as Eval library in samples, or complete XSharper.Core library) can be downloaded from XSharper.com.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer http://xsharper.com
Canada
Putting code between curly braces for centuries. Lately, between curly and angle braces too, on http://xsharper.com .
