NumberParser

User 6679439


2 Dec 2017, CPOL, 10 min read


Library extending the .NET numeric support

Download FlexibleParser-NumberParser_1.0.8.5.zip - 310.6 KB

Introduction

NumberParser simplifies the use of the .NET numeric types, takes fuller advantage of the high precision of decimal and extends the default mathematical support.

Additionally, this library can deal with values beyond the double range and manages all the errors internally, without throwing exceptions.

NumberParser is the second part of FlexibleParser, a multi-purpose group of independent .NET parsing libraries (the first part in codeproject.com: UnitParser).

This article refers to NumberParser v. 1.0.8.5 (stable).

Note that I have also developed a Java version of this library. To learn more about it, visit the corresponding page on my main site.

Background

There are three main aspects of the default .NET management of numeric types which are relevant for NumberParser:

It supports various numeric types under different constraints, which aren’t immediately and fully compatible with each other.
NumberParser removes all the boundaries among numeric types via either a dynamic variable or a class simultaneously supporting various native types. Additionally, its defining structure Value (any type) * 10^BaseTenExponent (int) allows dealing with numbers as big as required.
The System.Math methods are reasonably comprehensive, but certainly improvable.
The Math2 class of NumberParser includes versions of all the System.Math methods adapted to the NumberParser classes, plus custom approaches extending the default .NET functionalities.
The high precision of the .NET decimal type isn’t always fully exploited.
The methods Math2.PowDecimal and Math2.SqrtDecimal rely on a custom exponentiation approach precisely meant to make the most of the decimal precision.
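
The Value * 10^BaseTenExponent idea can be illustrated with a minimal sketch. ScaledValue and its Multiply method are hypothetical names made up for this example and are not part of NumberParser; the only assumption taken from the library is that whatever doesn't fit in the native type is moved to the base-ten exponent.

```csharp
using System;

// Hypothetical illustration type; it is not part of NumberParser.
public readonly struct ScaledValue
{
    public decimal Value { get; }
    public int BaseTenExponent { get; }

    public ScaledValue(decimal value, int baseTenExponent)
    {
        Value = value;
        // A zero value always gets a zero exponent.
        BaseTenExponent = (value == 0m ? 0 : baseTenExponent);
    }

    // Multiplies two scaled values. When the decimal product would overflow,
    // the bigger operand is divided by 10 and the exponent is increased by 1,
    // as many times as required.
    public static ScaledValue Multiply(ScaledValue a, ScaledValue b)
    {
        decimal v1 = a.Value, v2 = b.Value;
        int exponent = a.BaseTenExponent + b.BaseTenExponent;

        while (true)
        {
            try
            {
                return new ScaledValue(v1 * v2, exponent);
            }
            catch (OverflowException)
            {
                if (Math.Abs(v1) >= Math.Abs(v2)) v1 /= 10m;
                else v2 /= 10m;
                exponent++;
            }
        }
    }

    public override string ToString() => $"{Value}*10^{BaseTenExponent}";
}
```

In this sketch, multiplying 10^20 by 10^20 (a result beyond the decimal range) ends up stored as a Value of 10^28 and a BaseTenExponent of 12. The overflow is caught inside the method, so the caller never sees an exception.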

Code analysis

Beyond individual numeric types: NumberX

NumberX is a generic designation for the classes Number, NumberD, NumberO and NumberP, which provide the basic conditions allowing NumberParser to accomplish the intended homogenisation of numeric types. All these classes share the following features:

Defined by Value*10^BaseTenExponent and, consequently, supporting more than large enough ranges.
All the errors are managed internally and no exceptions are thrown.
Intuitive basic arithmetic and comparison operations support (operator overloading).
Implicitly convertible between each other and to their defining native types.

All these classes have their own specific characteristics, namely:

Number. It is the lightest one and its Value is decimal.
NumberD. Its Value is dynamic and, consequently, it can deal with any native numeric type.
NumberO. Its defining characteristic is the Others public property, a collection of NumberD variables containing the native numeric types instructed by the user.
NumberP. It can extract numeric information from strings and its Value is dynamic.

Each of these classes has its own code, stored in clearly-named files and folders like Constructors_Number.cs (inside the Constructors folder) or Operations_Public_NumberD.cs (inside the Operations/Public folder). All the code dealing with the common parts relies on the lightest possible version: Number for decimal-based calculations and NumberD for any other scenario.

Below these lines, I am including an excerpt of the NumberD constructor code. It gives quite a good idea of what most of this part of the code looks like: some public properties automatically synchronised through the getters/setters, and a considerable number of constructors (where the aforementioned properties are populated in an order compatible with that synchronisation). These constructors allow instantiating the classes in many different ways and, in the case of the 1-argument constructors, also back the implicit conversions to other NumberX classes/native types.

///<summary>
///<para>NumberD extends the limited decimal-only range of Number by supporting all the numeric 
///types.</para>
///<para>It is implicitly convertible to Number, NumberO, NumberP and all the numeric types.</para>
///</summary>
public partial class NumberD
{
    private dynamic _Value;
    private Type _Type;
    ///<summary><para>Numeric variable storing the primary value.</para></summary>
    public dynamic Value
    {
        get { return _Value; }
        set
        {
            Type type = ErrorInfoNumber.InputTypeIsValidNumeric(value);
           
            if (type == null) _Value = null;
            else
            {
                _Value = value;
                if (_Value == 0) BaseTenExponent = 0;
            }
           
            if (Type != type) Type = type;
        }
    } 
    ///<summary><para>Base-ten exponent complementing the primary value.</para></summary>
    public int BaseTenExponent { get; set; }
    ///<summary><para>Numeric type of the Value property.</para></summary>
    public Type Type
    {
        get { return _Type; }
        set
        {
            if (Value != null && value != null)
            {
                if (Value.GetType() == value) _Type = value;
                else
                {
                    NumberD tempVar = new NumberD(Value, BaseTenExponent, value, false);
                    
                    if (tempVar.Error == ErrorTypesNumber.None)
                    {
                        _Type = value;
                        BaseTenExponent = tempVar.BaseTenExponent;
                        Value = tempVar.Value;
                    }
                    //else -> The new type is wrong and can be safely ignored.
                }
            }
        }
    }
    ///<summary><para>Readonly member of the ErrorTypesNumber enum which best suits the current 
    ///conditions.</para></summary>
    public readonly ErrorTypesNumber Error;
    
    ///<summary><para>Initialises a new NumberD instance.</para></summary>
    ///<param name="type">Type to be assigned to the dynamic Value property. Only numeric types 
    ///are valid.</param>
    public NumberD(Type type)
    {
        Value = Basic.GetNumberSpecificType(0, type);
        Type = type;
    }
     
    ///<summary><para>Initialises a new NumberD instance.</para></summary>
    ///<param name="value">Main value to be used. Only numeric variables are valid.</param>
    ///<param name="baseTenExponent">Base-ten exponent to be used.</param>
    public NumberD(dynamic value, int baseTenExponent)
    {
        Type type = ErrorInfoNumber.InputTypeIsValidNumeric(value);
        
        if (type == null)
        {
            Error = ErrorTypesNumber.InvalidInput;
        }
        else
        {
            //To avoid problems with the automatic actions triggered by some setters, it is 
            //better to always assign values in this order (i.e., first BaseTenExponent, then 
            //Value and finally Type).
            BaseTenExponent = baseTenExponent;
            Value = value;
            Type = type;
        }
    }
    
    ///<summary><para>Initialises a new NumberD instance.</para></summary>
    ///<param name="value">Main value to be used. Only numeric variables are valid.</param>
    ///<param name="type">Type to be assigned to the dynamic Value property. Only numeric types 
    ///are valid.</param>
    public NumberD(dynamic value, Type type)
    {
        NumberD numberD = ExtractValueAndTypeInfo(value, 0, type);
        
        if (numberD.Error != ErrorTypesNumber.None)
        {
            Error = numberD.Error;
        }
        else
        {
            BaseTenExponent = numberD.BaseTenExponent;
            Value = numberD.Value;
            Type = type;
        }
    }

    //etc.
}

To learn more about the NumberX implementations, you can visit the corresponding pages in varocarbas.com: https://varocarbas.com/flexible_parser/number_numberx/, https://varocarbas.com/flexible_parser/number_numbero/ and https://varocarbas.com/flexible_parser/number_numberp/.

Basic operations and comparisons between NumberX instances

The next logical step after creating the NumberX classes is to ease their usage under the most common scenarios which, for numeric types, are basic arithmetic and comparison operations.

My approach has been to perform all these actions via operator overloading for every class, which allows, for example, something like NumberD result = new NumberD(1.2345) + new NumberD(5555);. In all the NumberX classes, the basic arithmetic (+, -, * and /) and comparison (==, !=, >, >=, <, <=) operators are overloaded.

The aforementioned operations are expected to be performed between instances of the same NumberX class. In other words, the implicit NumberX conversions aren’t applicable when dealing with overloaded operations. For example, new NumberD(567) * new NumberP("12.3") is wrong, but new NumberD(567) * (NumberD)new NumberP("12.3") is not. The same rules aren’t applicable to implicit conversions of native types and that's why new Number(987.6m) + 777m is fine.

The aforementioned limitation is caused by the peculiarities of the dynamic type. The only way to avoid the current errors (i.e., ambiguous determination of the NumberX class to be used) would have been to expressly overload all the possible combinations between NumberX classes. Implementing that was never an option because it would have caused an unreasonable increase in the code size and resources associated with the NumberX classes, which would have had a significant negative impact on their performance. Doing all that just to accomplish the irrelevant goal of avoiding a cast under very specific conditions wouldn't have made much sense.
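
The combination of same-class operator overloads and implicit conversions from native types can be sketched as follows. SimpleNumber is a hypothetical, heavily simplified stand-in for a NumberX class (decimal Value only, no BaseTenExponent, no error handling); it only shows why an expression mixing an instance and a native decimal compiles.

```csharp
using System;

// Hypothetical stand-in for a NumberX class; it is not part of NumberParser.
public sealed class SimpleNumber
{
    public decimal Value { get; }

    public SimpleNumber(decimal value) { Value = value; }

    // The 1-argument constructor exposed as an implicit conversion from the native type.
    public static implicit operator SimpleNumber(decimal value) => new SimpleNumber(value);

    // Operators are only defined between instances of the same class.
    public static SimpleNumber operator +(SimpleNumber a, SimpleNumber b) =>
        new SimpleNumber(a.Value + b.Value);

    public static bool operator <(SimpleNumber a, SimpleNumber b) => a.Value < b.Value;
    public static bool operator >(SimpleNumber a, SimpleNumber b) => a.Value > b.Value;
}
```

With this class, new SimpleNumber(987.6m) + 777m compiles because the compiler converts 777m to SimpleNumber before applying the only available + overload.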

As part of FlexibleParser, NumberParser relies on the same default assumptions as all the other parts and, in case of incompatibility (e.g., different NumberX class or different Value types), the first element starting from the top left will always be preferred.

The most important parts of the code dealing with all this are the following:

Operations/Public folder. All the method/operator overloads and implicit conversions (i.e., calling the corresponding 1-argument constructor) for all the NumberX classes are included here.
Operations/Private folder. It contains most of the internal resources used by the aforementioned public resources. Note that one of these files (Operations_Private_Managed.cs) contains an adapted version of the managed operations discussed in the article about UnitParser.
Conversions folder. Since all the NumberX classes have to be able to deal interchangeably with different numeric types, conversions are also closely related to basic operations. In any case, bear in mind that these are custom conversions adapting native types to the NumberX format, which only trigger errors when strictly required, rather than standard conversions between native types. For example, no information is lost when converting an integer like 100000000 to byte because all the excess beyond the maximum byte range is stored in the associated BaseTenExponent.

A descriptive sample of the conversion code might be the following one:

private static Number ModifyValueToFitType(Number number, Type target, decimal targetValue)
{
    decimal sign = 1m;
    if (number.Value < 0)
    {
        sign = -1m;
        number.Value *= sign;
    }
    
    if (!Basic.AllDecimalTypes.Contains(target))
    {
        number.Value = Math.Round(number.Value, MidpointRounding.AwayFromZero);
    }
     
    targetValue = Math.Abs(targetValue);
    bool increase = (number.Value < targetValue);

    while (true)
    {
        if (number.Value == targetValue) break;
        else
        {
            if (increase)
            {
                if 
                (
                    number.Value > Basic.AllNumberMinMaxPositives
                    [
                        typeof(decimal)
                    ]
                    [1] / 10m
                )
                { break; }
            
                number.Value *= 10;
                number.BaseTenExponent--;
                if (number.Value > targetValue) break;
            }
            else
            {
                if
                (
                    number.Value < Basic.AllNumberMinMaxPositives
                    [
                        typeof(decimal)
                    ]
                    [0] * 10m
                )
                { break; }
            
                number.Value /= 10;
                number.BaseTenExponent++;
                if (number.Value < targetValue) break;
            }
        }
    }

    number.Value *= sign;

    return number;
}
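
The byte example mentioned above (100000000 stored as a byte plus an exponent) can be reproduced with a much simpler sketch. FitSketch and ToByteWithExponent are hypothetical names; unlike ModifyValueToFitType, this version ignores signs and decimal types, and it simply discards any non-zero digit removed by the integer division.

```csharp
using System;

// Hypothetical helper written for this example; it is not part of NumberParser.
public static class FitSketch
{
    // Reduces the input until it fits into the byte range (0-255) and
    // moves every removed factor of 10 to the returned base-ten exponent.
    public static (byte Value, int BaseTenExponent) ToByteWithExponent(ulong value)
    {
        int exponent = 0;

        while (value > byte.MaxValue)
        {
            value /= 10; // Integer division: trailing zeroes are removed without any loss.
            exponent++;
        }

        return ((byte)value, exponent);
    }
}
```

Calling FitSketch.ToByteWithExponent(100000000) returns a Value of 100 and a BaseTenExponent of 6, i.e., 100*10^6.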

Overview of Math2 methods

After setting up a group of classes homogenising the management of numeric types and all the basic comparisons/operations among them, extending their mathematical support seems the next logical step. In the .NET Framework, the main built-in mathematical methods are stored under System.Math and its NumberParser equivalent is Math2.

There is a first group of Math2 methods which are just NumberX-adapted versions of all the System.Math ones. Each of them delivers exactly the same result as the corresponding original version. Their whole point is to facilitate the usage of NumberX instances with the most common mathematical functionalities. Even the default support (e.g., double range in most cases) is respected and (internally-managed) errors are triggered regardless of whether the corresponding NumberX class can deal with these conditions.

Below these lines, you can find a descriptive excerpt of this part of the code included in Math2_Private_Existing.cs:

private delegate double Method1Arg(double value);
private delegate double Method2Arg(double value1, double value2);

private static Dictionary<ExistingOperations, Method1Arg> AllMathDouble1 = 
new Dictionary<ExistingOperations, Method1Arg>()
{
    { ExistingOperations.Acos, Math.Acos }, { ExistingOperations.Asin, Math.Asin}, 
    { ExistingOperations.Atan, Math.Atan }, { ExistingOperations.Cos, Math.Cos }, 
    { ExistingOperations.Cosh, Math.Cosh }, { ExistingOperations.Exp, Math.Exp }, 
    { ExistingOperations.Log, Math.Log }, { ExistingOperations.Log10, Math.Log10 }, 
    { ExistingOperations.Sin, Math.Sin }, { ExistingOperations.Sinh, Math.Sinh }, 
    { ExistingOperations.Sqrt, Math.Sqrt }, { ExistingOperations.Tan, Math.Tan }, 
    { ExistingOperations.Tanh, Math.Tanh }
};

private static Dictionary<ExistingOperations, Method2Arg> AllMathDouble2 = 
new Dictionary<ExistingOperations, Method2Arg>()
{
    { ExistingOperations.Atan2, Math.Atan2 }, 
    { ExistingOperations.IEEERemainder, Math.IEEERemainder }, 
    { ExistingOperations.Log, Math.Log }, { ExistingOperations.Pow, Math.Pow }
};

private static NumberD PerformOperationOneOperand(NumberD n, ExistingOperations operation)
{
    NumberD n2 = AdaptInputsToMathMethod(n, GetTypesOperation(operation), operation);
    if (n2.Error != ErrorTypesNumber.None) return new NumberD(n2.Error);

    try
    {
        return ApplyMethod1(n2, operation);
    }
    catch
    {
        return new NumberD(ErrorTypesNumber.NativeMethodError);
    }
}

private static NumberD PerformOperationTwoOperands(NumberD n1, NumberD n2, ExistingOperations operation)
{
    NumberD[] ns = CheckTwoOperands
    (
        new NumberD[] { n1, n2 }, operation
    );
    if (ns[0].Error != ErrorTypesNumber.None) return ns[0];

    try
    {
        return ApplyMethod2(ns[0], ns[1], operation);
    }
    catch
    {
        return new NumberD(ErrorTypesNumber.NativeMethodError);
    }
}

private static NumberD[] CheckTwoOperands(NumberD[] ns, ExistingOperations operation)
{
    ns = OrderTwoOperands(ns);
    
    for (int i = 0; i < ns.Length; i++)
    {
        ns[i] = AdaptInputsToMathMethod
        (
            ns[i], (i == 0 ? GetTypesOperation(operation) : new Type[] { ns[0].Type }), 
            operation
        );
        if (ns[i].Error != ErrorTypesNumber.None)
        {
            return new NumberD[] { new NumberD(ns[i].Error) };
        }
    }
    
    return ns;
}

The Math2 class also includes the following group of custom mathematical methods which I have developed from scratch:

GetPolynomialFit/ApplyPolynomialFit. They calculate the 2nd-degree polynomial fit from a set of X/Y values and apply it to estimate which Y2 is associated with the X2 input.
Factorial. It calculates the factorial of positive integers smaller than 100000.
RoundExact/TruncateExact. These methods appreciably extend the built-in .NET rounding/truncating functionalities. They allow focusing the rounding/truncating actions on the integer/decimal parts and, for example, returning 123, 124 or 123.6 from the input 123.567.
PowDecimal/SqrtDecimal. They are discussed in the next subsection.
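
As an illustration of what a 2nd-degree polynomial fit involves, here is a minimal sketch based on least squares (normal equations solved with Cramer's rule). This is an assumption about one possible approach, not the actual GetPolynomialFit/ApplyPolynomialFit implementation; PolynomialFitSketch, GetFit and Apply are hypothetical names.

```csharp
using System;

// Hypothetical helper written for this example; it is not part of NumberParser.
public static class PolynomialFitSketch
{
    // Returns { a, b, c } for y = a + b*x + c*x^2, or null when no fit is possible.
    public static decimal[] GetFit(decimal[] xs, decimal[] ys)
    {
        if (xs == null || ys == null || xs.Length != ys.Length || xs.Length < 3) return null;

        decimal n = xs.Length, sx = 0m, sx2 = 0m, sx3 = 0m, sx4 = 0m;
        decimal sy = 0m, sxy = 0m, sx2y = 0m;
        for (int i = 0; i < xs.Length; i++)
        {
            decimal x = xs[i], x2 = x * x, y = ys[i];
            sx += x; sx2 += x2; sx3 += x2 * x; sx4 += x2 * x2;
            sy += y; sxy += x * y; sx2y += x2 * y;
        }

        decimal det = Det3(n, sx, sx2, sx, sx2, sx3, sx2, sx3, sx4);
        if (det == 0m) return null; // E.g., fewer than three different X values.

        // Cramer's rule: each coefficient replaces one column with the right-hand side.
        return new decimal[]
        {
            Det3(sy, sx, sx2, sxy, sx2, sx3, sx2y, sx3, sx4) / det,
            Det3(n, sy, sx2, sx, sxy, sx3, sx2, sx2y, sx4) / det,
            Det3(n, sx, sy, sx, sx2, sxy, sx2, sx3, sx2y) / det
        };
    }

    public static decimal Apply(decimal[] fit, decimal x)
    {
        return fit[0] + fit[1] * x + fit[2] * x * x;
    }

    // Determinant of the 3x3 matrix given row by row.
    private static decimal Det3
    (
        decimal a, decimal b, decimal c, decimal d, decimal e,
        decimal f, decimal g, decimal h, decimal i
    )
    {
        return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g);
    }
}
```

For the X values 1, 2, 4 and the Y values 10, 20, 40, this sketch finds y = 10*x and estimates 30 for x = 3. Note that, unlike NumberParser, it doesn't protect against decimal overflows with big inputs.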

The most interesting code is the one dealing with RoundExact/TruncateExact and this is a descriptive sample of it:

private static decimal RoundInternalAfterZeroes(decimal d, int digits, RoundType type, decimal d2, int zeroCount)
{
    if (digits < zeroCount)
    {
        //Cases like 0.001 with 1 digit or 0.0001 with 2 digits can reach this point.
        //On the other hand, something like 0.001 with 2 digits requires further analysis.
        return Math.Floor(d) +
        (
            type != RoundType.AlwaysAwayFromZero ? 0m :
            1m / Power10Decimal[digits]
        );
    }
    
    //d3 represents the decimal part after all the leading zeroes.
    decimal d3 = d2 * Power10Decimal[zeroCount];
    d3 = DecimalPartToInteger(d3 - Math.Floor(d3), 0, true);
    int length3 = GetIntegerLength(d3);
    
    decimal headingBit = 0;
    digits -= zeroCount;
    if (digits == 0)
    {
        //In a situation like 0.005 with 2 digits, the number to be analysed would be
        //05, which isn't possible (i.e., it would be treated as 5, something different).
        //That's why, in these cases, adding a heading number is required.
        headingBit = 2; //2 avoids the ...ToEven types to be misinterpreted.
        d3 = headingBit * Power10Decimal[length3] + d3;
        digits = 0;
    }
    
    decimal output =
    (
        RoundExactInternal(d3, length3 - digits, type)
        / Power10Decimal[length3]
    )
    - headingBit;
    
    return Math.Floor(d) +
    (
        output == 0m ? 0m :
        output /= Power10Decimal[zeroCount]
    );
}

To learn more about the Math2 methods, you can visit the corresponding pages in varocarbas.com: https://varocarbas.com/flexible_parser/number_native/ and https://varocarbas.com/flexible_parser/number_custom/.

Math2.PowDecimal and Math2.SqrtDecimal

The built-in .NET exponentiation methods are meant to make the most of the floating-point peculiarities (double type), which implies that most of the efforts are focused on delivering reasonably accurate results as quickly as possible. Another relevant issue is that the specific implementations are private and, in any case, very unlikely to be easily adapted to non-floating-point scenarios.

Almost all programming languages rely on floating-point approaches to deal with decimal numeric types. The decimal type in .NET is one of the few exceptions and this is precisely why I had to develop a custom approach to take full advantage of its defining high precision. I will only be referring to the implementation dealing with fractional exponents, because accounting for all the other scenarios (e.g., integer or negative exponents) is quite trivial.

I relied on the very fast, reliable and brand-new (LOL) Newton-Raphson method. The main limitation of this approach is that its convergence speed is highly conditioned by the quality of the first guess. A bad initial guess isn’t particularly influential under not too demanding conditions, but it is extremely relevant when trying to accomplish the intended maximisation of decimal precision. Note that this type can accurately deal with up to 28 decimal positions, which implies performing operations and comparing values within a precision of up to 10^-28. In other words, getting stuck in a real or practically-speaking (i.e., taking unacceptably long) infinite loop is relatively easy unless the initial guess is good enough.

So, the most relevant part of the Math2.PowDecimal/Math2.SqrtDecimal code is the approach which I came up with to ensure good enough first guesses for the Newton-Raphson method. Bearing in mind that it calculates nth roots, that the perfect guess is the actual root and that the nth root of x has to be more or less consistent with a trend defined by the (n-1)th root of x and the (n+1)th root of x, I generated a considerable number of pairs x vs. nth root of x for a considerable number of different n values. Then, I looked for the underlying trends, summarised these conclusions and created equations replicating those behaviours within more or less big ranges. Note that all this part only deals with positive integers n and 10-divisible x.
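
To make the role of the first guess more tangible, here is a minimal sketch of the Newton-Raphson iteration for nth roots with the decimal type. It is not the PowDecimal/SqrtDecimal code: NewtonRaphsonSketch, NthRoot and IntPow are hypothetical names, the stopping tolerance and the iteration cap are my own assumptions for this example, and the guess has to be provided by the caller.

```csharp
using System;

// Hypothetical helper written for this example; it is not part of NumberParser.
public static class NewtonRaphsonSketch
{
    // Approximates the nth root of a positive value via
    // x(k+1) = ((n - 1) * x(k) + value / x(k)^(n - 1)) / n, starting from 'guess'.
    public static decimal NthRoot(decimal value, int n, decimal guess, int maxIterations = 200)
    {
        // -1 is just the invalid-input marker of this sketch.
        if (value <= 0m || n < 1 || guess <= 0m) return -1m;

        decimal current = guess;
        for (int i = 0; i < maxIterations; i++)
        {
            decimal next = ((n - 1) * current + value / IntPow(current, n - 1)) / n;

            // Stop when the last decimal positions are no longer changing.
            if (Math.Abs(next - current) <= 1e-27m) return next;
            current = next;
        }

        // The cap avoids the (practically-speaking) infinite loops of bad guesses.
        return current;
    }

    // Integer power by repeated multiplication (it can overflow with big inputs).
    private static decimal IntPow(decimal x, int exponent)
    {
        decimal result = 1m;
        for (int i = 0; i < exponent; i++) result *= x;
        return result;
    }
}
```

For example, NewtonRaphsonSketch.NthRoot(27m, 3, 3m) stops right away because the guess is already the root, while a worse guess like 2m needs several iterations to get there. With big n values and poor guesses, the number of iterations grows quickly, which is why the quality of the first guess matters so much.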

Although the current approach is already reasonably quick and reliable, it is still a first version which I am expecting to further improve at a later point. That is the reason why this part of the code doesn't include too many comments: it is still work in progress. In any case, here you have a descriptive sample of it:

private static decimal GetSmallValueBase10Guess(decimal value, decimal n)
{
    decimal[] vals = new decimal[]
    {
        0.4605m, 0.5298m, 0.5704m, 0.5991m, 0.6215m
    };
    
    int index = (int)(value / 100m);
    decimal ratio = (value - index * 100m) / 100m;
    index--;
    
    decimal outVal = vals[index];
    if (ratio != 1m && index < 4)
    {
        outVal = vals[index] + ratio * (vals[index + 1] - vals[index]);
    }
      
    return 1m + outVal / Power10Decimal[GetIntegerLength(n) - 2];
}

private static decimal GetGenericBase10Guess(decimal value, decimal n)
{
    bool small = false;
    decimal value2 = GetInverseValue(value);
    if (value2 != -1m)
    {
        small = true;
        value = value2;
    }
    
    decimal outVal = 1m;
    int exponent = GetIntegerLength(n) - 1;
    
    if (value >= 500m)
    {
        decimal ratio = value / 500m;
        if (ratio >= 100m)
        {
            exponent--;
            if (ratio >= 1000m)
            {
                int length = GetIntegerLength(ratio);
                //length -> addition
                //4 -> 0.25
                //5 -> 0.5
                //6 -> 0.75
                //7 -> 1
                //8 -> 1.25
                //...
                decimal rem = length % 4;
                outVal = length / 4 + 0.25m * (rem + 1m);
            }
        }
        else if (ratio >= 10m) outVal *= 9m;
        else if (ratio >= 1m) outVal *= 5m;
    }
    
    return
    (
        !small ? 1m + outVal / Power10Decimal[exponent] :
        (1m - 1m / Power10Decimal[exponent]) + outVal / Power10Decimal[exponent + 1]
    );
}

I have written a much more detailed analysis of this implementation in https://varocarbas.com/fractional_exponentiation/ (PDF).

Using the code

NumberParser (inside the FlexibleParser namespace) provides a common framework to deal with all the .NET numeric types. It relies on the following four classes (NumberX):

Number only supports the decimal type.
NumberD can support any numeric type via dynamic.
NumberO can support different numeric types simultaneously.
NumberP can parse numbers from strings.

//1.23m (decimal).
Number number = new Number(1.23m);

//123 (int).
NumberD numberD = new NumberD(123);

//1.23 (decimal). Others: 1 (int) and ' ' (char).
NumberO numberO = new NumberO(1.23m, new Type[] { typeof(int), typeof(char) });

//1 (long).
NumberP numberP = new NumberP("1.23", new ParseConfig(typeof(long)));

Common features

All the NumberX classes have various characteristics in common.

Defined according to the fields Value (decimal or dynamic) and BaseTenExponent (int). All of them support ranges beyond [-1, 1] * 10^2147483647.
Most common arithmetic and comparison operator support.
Errors managed internally and no exceptions thrown.
Numerous instantiating alternatives. Implicitly convertible between each other and to related types.

//12.3*10^456 (decimal).
Number number = new Number(12.3m, 456);

//123 (int).
NumberD numberD =
(
    new NumberD(123) < (NumberD)new Number(456) ?
    //123 (int).
    new NumberD(123.456, typeof(int)) :
    //123.456 (double).
    new NumberD(123.456)
);

//Error (ErrorTypesNumber.InvalidOperation) provoked when dividing by zero.
NumberO numberO = new NumberO(123m, OtherTypes.IntegerTypes) / 0m;

//1234*10^5678 (decimal).
NumberP numberP = (NumberP)"1234e5678";

Math2 class

This class includes all the NumberParser mathematical functionalities.

Custom functionalities

PowDecimal/SqrtDecimal whose decimal-based algorithms are more precise than the System.Math versions. The whole varocarbas.com Project 10 explains their underlying calculation approach.
RoundExact/TruncateExact can deal with multiple rounding/truncating scenarios not supported by the native methods.
GetPolynomialFit/ApplyPolynomialFit deal with second degree polynomial fits.
Factorial calculates the factorial of any integer number up to 100000.

//158250272872244.91791560253776 (decimal).
Number number = Math2.PowDecimal(123.45m, 6.789101112131415161718m);

//123000 (decimal).
Number number = Math2.RoundExact
(
    123456.789m, 3, RoundType.AlwaysToZero,
    RoundSeparator.BeforeDecimalSeparator
);

//30 (decimal).
NumberD numberD = Math2.ApplyPolynomialFit
(
    Math2.GetPolynomialFit
    (
        new NumberD[] { 1m, 2m, 4m }, new NumberD[] { 10m, 20m, 40m }
    )
    , 3
);

//3628800 (int).
NumberD numberD = Math2.Factorial(10);

Native methods

Math2 also includes NumberD-adapted versions of all the System.Math methods.

//158250289837968.16 (double). 
NumberD numberD = Math2.Pow(123.45, 6.789101112131415161718);

//4.8158362157911885 (double). 
NumberD numberD = Math2.Log(123.45m);

Further code samples

The test application includes a relevant number of descriptive code samples.

Points of interest

User-friendly format which makes it easy to deal with all the .NET numeric types without having to worry about conversions or range limitations.

Extension of the default mathematical support with a major focus on making the most of the high precision associated with decimal.

It can deal with numbers as big as required and manages all the errors internally.

Authorship

I, Alvaro Carballo Garcia, am the sole author of this article and all the referred NumberParser/FlexibleParser resources like code or documentation.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
