
The Hidden Side-effect of Enums and Values

7 Feb 2019 · CPOL · 5 min read · 20.6K views

Introduction

Recently, I encountered an issue with enums, which I wanted to share in case someone else encounters it along the way.

Disclaimer: So that there is minimal misinterpretation of this article: this is not a guide on how you should write enums. It is an article showing what the compiler allows to happen, and it is meant as a way to spot the issue if you ever encounter it in code written by someone else or in third-party systems.

So what are enums?

Enums are a list of named numeric constants that help us in a number of situations: for example, when something can have a tag or a property on it to distinguish it from another object, or when they represent different options for methods. Other times, they represent states of an object or relationships; for example, we can have an Employee enum which tells us whether a person is a manager, a vice-president, or the CEO.

But no matter what name we give an enum, in the background, it is just a number used to represent that state like this:

C#
enum Foo {
    Bar1,
    Bar2,
    Bar3,
}

So, in this case, we declared an enum called Foo which can have one of three values: Bar1, which implicitly has a value of 0; Bar2, which has a value of 1; and Bar3, which has a value of 2.
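We can verify the implicit numbering by casting the members to their underlying integer type, like this:

C#
Console.WriteLine((int)Foo.Bar1);
Console.WriteLine((int)Foo.Bar2);
Console.WriteLine((int)Foo.Bar3);

// Will output: 0, 1 and 2, one per line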

The issue lies in the background value of the enum and how we use it. For example, since we know that Bar2 has a value of 1, we can cast the number 1 to an enum of type Foo and we will get Bar2, like this:

C#
Console.WriteLine((Foo)1);

// Will output: Bar2

But since we are talking about numbers, enums can also be mapped to a specific value like this:

C#
enum Foo {
    Bar1 = 2,
    Bar2,
    Bar3 = 5,
}

Basically, in this case, Bar1 will have a value of 2, Bar2 will have a value of 3, and Bar3 will have a value of 5.
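We can check this the same way as before; an implicit value always continues from the member just above it, so Bar2 picks up from Bar1 = 2:

C#
Console.WriteLine((int)Foo.Bar1);
Console.WriteLine((int)Foo.Bar2);
Console.WriteLine((int)Foo.Bar3);

// Will output: 2, 3 and 5, one per line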

And now for the odd part and side-effect.

We can have an enum defined with two or more identifiers for the same value like so:

C#
enum Foo {
    Bar1 = 2,
    Bar2 = 2,
    Bar3,
}

Notice that Bar1 and Bar2 have the same value (it doesn't have to be 2). So if we now run the following command, the runtime does not know which identifier we are referring to, so it will give the middle identifier with that value:

C#
Console.WriteLine((Foo)2);

// Will output: Bar2 because it is the latest
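As a quick side note, the duplicate name does not disappear: reflection still reports every identifier, so we can list the name-to-value mapping ourselves. (The official documentation also warns that when members share a value, which name comes back from a value-to-name conversion is not guaranteed, so treat outputs like the one above as something we observe, not something we can rely on.)

C#
foreach (string name in Enum.GetNames(typeof(Foo)))
{
    object value = Enum.Parse(typeof(Foo), name);
    Console.WriteLine($"{name} = {(int)(Foo)value}");
}

// Will output: Bar1 = 2, Bar2 = 2 and Bar3 = 3, one per line
// (the relative order of the duplicate names is not guaranteed)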

What I mean by the middle is that the output will be the same even if we have an enum defined like this:

C#
enum Foo {
    Bar1 = 2,
    Bar2 = 2,
    Bar3 = 2,
}

So no matter how we run it, the output will still be Bar2. But if we have an enum defined like this:

C#
enum Foo {
    Bar1 = 2,
    Bar2 = 2,
    Bar3 = 2,
    Bar4 = 2,
    Bar5 = 2,
    Bar6 = 2,
    Bar7 = 2
}

Running the same command will give us Bar4 because it is the middle one. If there is an even number of identifiers, it will give us the middle one closer to the end: for two enum identifiers it gave me the second one; for three, also the second one; for four, the third one; likewise the third one for five identifiers and for six; and so on and so forth.

But what happens when we put another enum identifier with a lower value before Bar1, like this?

C#
enum Foo {
    Bar,
    Bar1 = 2,
    Bar2 = 2,
    Bar3,
}

Now if we run the output command, it will not show the middle identifier with that value; instead, it will show the one just before the middle, so in this case it will show Bar1. But let us take it a step further and look at this one:

C#
enum Foo {
    Bar,
    Bar1 = 2,
    Bar2 = 2,
    Bar3 = 2,
    Bar4 = 2,
    Bar5 = 2,
    Bar6 = 2,
    Bar7 = 2
}

Then the output for the value 2 will be Bar3. Even worse, we can add two more values before Bar1 (I shifted the value to 5 so that we don't overlap with the ones we're trying to check):

C#
enum Foo {
    Bar0,
    Bar00,
    Bar000,
    Bar1 = 5,
    Bar2 = 5,
    Bar3 = 5,
    Bar4 = 5,
    Bar5 = 5,
    Bar6 = 5,
    Bar7 = 5
}

Then if we run the command:

C#
Console.WriteLine((Foo)5);

// Will output: Bar2

So for every two identifiers added before that sequence, the result goes back one. I then tried something else and added two more identifiers after the sequence, and guess what: it went back to Bar3. With another two, it went to Bar4, and if you keep adding so many identifiers that the result should go past Bar7, it cycles around and shows Bar1 again.
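One plausible explanation for this whole pattern (an implementation detail I'm inferring, not documented behavior) is that the value-to-name conversion does a binary search over the array of underlying values, sorted ascending. A binary search stops as soon as the element it probes matches, so inside a run of duplicates it lands on the middle entry, and every identifier added before or after the run shifts where that probe lands. We can mimic the lookup for the enum with the Bar prefix like this:

C#
// Hypothetical simulation of the name lookup for:
// enum Foo { Bar, Bar1 = 2, Bar2 = 2, Bar3 = 2, Bar4 = 2, Bar5 = 2, Bar6 = 2, Bar7 = 2 }
int[] values = { 0, 2, 2, 2, 2, 2, 2, 2 };
string[] names = { "Bar", "Bar1", "Bar2", "Bar3", "Bar4", "Bar5", "Bar6", "Bar7" };

int index = Array.BinarySearch(values, 2);
Console.WriteLine(names[index]);

// Will output: Bar3, the same identifier the cast produced above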

This, I admit, baffled me a bit, because it means that converting values to enums becomes unpredictable whenever we have more than one identifier for a given value, especially since this isn't obvious: during the course of development, we may add to that enum without knowing that it affects us, and we might have outputs that depend on those identifiers.

Even though we won't see all that many cases with more than two identifiers per value, it is still something to take note of, because using enums by value is not that uncommon. At least three common usages come to mind: HTML drop-downs, which have number values behind them; Web API calls that use numbers to denote a certain enum value; and database persistence, where, for example, MongoDB will use the numerical value to store an enum. I'm sure there are many more cases that use such mechanisms.

Fortunately, a colleague of mine came up with a way to avoid this issue: save or send enum values as text and then parse them. That way, we know for sure that we are referring to the right identifier.
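A sketch of that approach: parsing by name is deterministic, so "Bar1" always yields Bar1's underlying value, even when several members share it.

C#
// Assuming: enum Foo { Bar1 = 2, Bar2 = 2, Bar3 }
Foo parsed = (Foo)Enum.Parse(typeof(Foo), "Bar1");
Console.WriteLine((int)parsed);

// Will output: 2

// For untrusted input, TryParse avoids the exception on unknown names:
if (Enum.TryParse("Bar3", out Foo safe))
    Console.WriteLine((int)safe);

// Will output: 3

One wrinkle remains: calling ToString() on the parsed value runs into the same ambiguity as before, so the name has to be captured at the point where it is known (the UI, the API contract), not regenerated from the number.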

I hope you found this as interesting and weird as I did, and if you know the reason why this happens, feel free to share and let me know, because I admit, my curiosity has been piqued.

Thank you and see you next time.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer
Romania
When asked, I always see myself as a .Net Developer because of my affinity for the Microsoft platform, though I do pride myself by constantly learning new languages, paradigms, methodologies, and topics. I try to learn as much as I can from a wide breadth of topics from automation to mobile platforms, from gaming technologies to application security.

If there is one thing I wish to impart, it is this: "Always respect your craft, your tests and your QA"

Comments and Discussions

 
Question: Bad idea to give two (or more) times the same value (Bertrand Gilliard, 11-Feb-19 5:18)
Question: Good point - would not happen though (Sammuel Miranda, 11-Feb-19 2:27)
Answer: Insight as to how this might happen (Vlad Neculai Vizitiu, 28-Jan-19 10:04)
News: [My vote of 2] Mathematically determined. (stixoffire, 22-Jan-19 2:15)
General: Re: [My vote of 2] Mathematically determined. (Heriberto Lugo, 9-Feb-19 20:44)
General: My vote of 1 (Imirzyan, 18-Nov-18 19:43)
General: Re: My vote of 1 (George Gomez, 7-Feb-19 12:25)
General: Re: My vote of 1 (Vlad Neculai Vizitiu, 7-Feb-19 12:34)
Suggestion: mapping an enum to a string is implementation-dependent (Philippe Verdy, 16-Nov-18 7:43)
General: Re: mapping an enum to a string is implementation-dependent (Chad3F, 16-Nov-18 13:52)
General: Re: mapping an enum to a string is implementation-dependent (Philippe Verdy, 16-Nov-18 15:48)
One way to define "safe" enums with two names for the same value, would just be to define them as

enum{a, b, c; const first = a, last = c}

which defines only "a", "b", "c" as "canonical" values (that have predictable names), and defines "first" and "last" only as aliases (whose names will not be returned when querying names from an enum value, which can only return "a", "b", or "c") that share the same value (i.e., the same ordinal); here the semicolon instead of the comma, or the const keyword, is enough to say that we are defining an alias.

This allows defining a safe arithmetic restricted only in {a,b,c} (whose result is restricted to that unique set or will cause a predictable overflow exception, for example a-1 or c+1 would unconditionally overflow, and a+1 would still give b, and b+1 would still give c even if it is also equal to the alias named "last").

But assigning random numeric values to constants declared in an enum causes various problems: we cannot safely define a "first" and "last" element, and not easily define an ordinal if we allow the declared enum constants to create "holes" between elements of the ordered set (i.e. they are assigned in non consecutive ranges), and so we cannot safely define any arithmetic on them as all members of these sets can overflow.

You can only define a *single* numeric constraint on one of the defined constants (for example the first one can be set to 0 or 1 or 1000, it does not matter, all other members are assigned to create a unique sequence of consecutive integers). So this declaration is safe:

enum{a=1000, b, c}

But not this one, even if there's no pair of defined constants that are given the same integer value:

enum{a=1, b=10, c}

because (a+1) is not part of the set but a is not the highest value (i.e. not the last one), and because (b-1) is also not part of the set but is also not the smallest value (i.e. not the first one).

A compiler however may infer default names such as "a", "(a+1)" (or just "2" in that last example) for the undefined value a+1, and so on up to "(a+9)" (or just "9"), just before "c"; it won't cause any overflow exception, and the defined set above would actually contain 11 distinct constants each one with a distinct name as well.

And then we could define valid restricted integer types like:

enum{min=-100, max=100}

containing 201 constants from -100 to 100 inclusively (using a modulo 201 arithmetic not requiring any overflow checks). So we could define:

typedef enum{min=-128, max=127} int8_t;

(such definition defining a strict set with 256 distinct values would precisely perform an arithmetic modulo 256); or:

typedef enum{zero=0, max=9} decimaldigit_t;

(such a definition, defining a strict set with 10 distinct values, would precisely perform an arithmetic modulo 10, whose constants are named "zero", "(zero+1)", ..., "(zero+8)", "max", or just named "zero", "1", ..., "8", "max": these names can safely be returned by a synthetic default static method generated by the compiler, which converts an enum value to a string showing the canonical names of defined constants, necessarily starting with a letter or underscore, or otherwise showing just their numeric values if no name is defined for other constants that are also part of the defined enum set).

Whether the compiler will generate an unchecked "modulo N" arithmetic or a checked "bounded" arithmetic could also be an option for the defined enum type, so that:

typedef enum {
zero=0, max=9
} catch(i) {
throw(new Error("decimal digit overflow %d", i));
} strictdecimaldigit_t;


would throw overflow exceptions if the result of an arithmetic causes out-of-range values, but the default "modulo N" arithmetic could be also changed, for example to add a carry:

typedef enum {
min=0, max=9;
} catch(int i) {
const int N = max - min + 1;
return (i - min) / N + (i - min) % N + min; // this value is checked again by the catcher!
} carryingdecimaldigit_t;


Note also that instead of defining this catcher, you may want to define a constructor (from an integer type) for the enum type. But the semantics are a bit different, and both may be used simultaneously in the definition of the enum type:

- If there's a constructor, the enum value returned by the constructor will be used, otherwise if there's a catcher defined, it will be used (see below), otherwise a default synthetic "modulo N" method will be used.

- When the constructor returns a value, its value is not returned immediately as is: if there's a catcher defined for the type, then the value is checked and if it falls out of range, then the catcher is invoked to fix it.

- When a catcher is invoked, its integer return value will be used to invoke the constructor if there's one (see above); otherwise it will be fixed by the default synthetic "modulo N" catcher. (In current implementations of enum types in C/C++/C#, this default synthetic "modulo N" catcher uses a value of N which is some power of 2, not clearly defined; usually it is 2^8 if the enum type is represented as a byte, so the value range is not completely restricted to the strict range going from the minimum to the maximum values defined in the enum, but to a wider unspecified range.)

The value of N is just sufficient to hold all the declared numeric values distinctly, but not minimal (when you declare, for example, an enum{a,b,c} with 3 distinct values, the compiler may use N=256 instead of N=3); this however allows faster code, because the "modulo N" checker actually does not generate any code at runtime: the compiler just silently truncates some unnecessary bits when storing values, without performing any actual check, so the results are OK for preserving distinction, but not good enough to create a safe arithmetic (this makes it impossible to define a safe "enumerator" to iterate over all constants actually defined in the enum type, and a "switch(enumvalue)" in the code of the enumerator-based loop should always include a "default" after listing cases only on defined enum constants, to handle other undefined/anonymous constants that are part of the declared enum type).

The compiler should check that the code handles these omitted cases properly, signaling missing "default" in "switch" (even if the arbitrarily chosen value N is minimal, for example in enum{a,b,c,d} and the compiler chooses N=4, storing only 2 bits per value, because other compilers may as well choose N=256, storing 8 bits per value).

Enum types should also modify the integer type promotion rules in expressions, for example:

- (-enum) or (+enum) returns a value of the same enum type (first, the enum is promoted to an int, then the expression is evaluated, then the value is passed through the declared enum constructor, and its declared "catcher")
- (enum + int) returns a value of the same enum type (same algorithm)
- (int + enum) silently promotes the enum to an int and evaluates the expression as an int without using any constructor or catcher.
- (enum >> int) returns a value of the same enum type.

Ideally the same promotion rules should be used between other distinct numeric types (char, short, int, long, long long, float, double, long double and signed variants) using inference on the left-most operand, so that:

- (int + long) is an int
- (long + int) is a long
- (int >> long) is an int
- (long >> int) is a long
- (int + float) is an int
- (float + int) is a float
- (char + unsigned char) is a char
- (unsigned char + char) is an unsigned char
- and so on...

This also means that binary arithmetic operators must NOT be commutative, when operands are not the same integer type, all would be strictly driven by the type of the left-most operand; but this would change the existing promotion rules in C/C++ for basic numeric types; this also changes the associativity and then requires a precise evaluation order, so that "a+b+c" must be evaluated only as ((a+b)+c) but not as (a+(b+c)): this associativity is possible only if operands are the same numeric type (i.e. with the same declared range and precision for its values).

And then we could as well define an enum type for non-integers (here based on declaration of numeric "double" or float values:

- typedef enum{min=0.0, max=1.0} drate_t;
- typedef enum{min=0.0f, max=1.0f} frate_t;

The following would be either invalid, or would promote the numeric values to the same numeric type:

- typedef enum{min=0, max=1.0} drate_t; // same as before: 0 is promoted to 0.0
- typedef enum{min=0, max=1.0f} frate_t; // same as before: 0 is promoted to 0.0f

Another interesting declaration:

typedef enum {
min = 0, max = 100.0f; const pi = 3.14f
} catch (int i) {
return (i <= min) ? min : (i >= max) ? max : (float)i;
} catch (float f) {
return (f <= min) ? min : (f >= max) ? max : (float)math.floor(f * 10.0f + 0.5f) / 10.0f;
} estimate_t;


This last declaration defines a strict enum type with exactly 1001 distinct numeric values {0.0f, 0.1f, 0.2f, ..., 99.9f, 100.0f} which are "capped" between min and max (no modulo N) and rounded.
The declaration of the "estimate_t::pi" constant (as an alias, not as an additional value of the set) actually gives it exactly the numeric value 3.1f (assigning numeric values to declared constants passes them through the declared constructor if there's one, or through the declared "catchers", both of which enforce the arithmetic rules).

----

Another interesting case:

typedef enum {'A', 'Z'} capital_t;

This would also be a valid declaration: you are not required to name distinctly the constants that are part of the declared numeric type. All that is needed is that any variable declared with that enum type (which is based on any basic numeric type of the language) must be able to store distinctly all the constants between the lower bound and upper bound of constants declared in the enum. Here it would declare a type large enough and precise enough to hold one of the 26 constants between 'A' and 'Z' inclusively.

So as well the declarations below would be valid:

typedef enum {'A', 'Z', 'A'} capital_t;
typedef enum {
char::min, char::max,
(unsigned char)::min, (unsigned char)::max,
(signed char)::min, (signed char)::max
} anychar_t;


The declared constant values don't need to be unique; the compiler determines itself the lower and upper bounds of the type, and the minimum precision needed to store the relevant differences and allocates enough bits, determining itself the basic numeric type to use for the values; and no constant need to be named explicitly.

Ideally, however, the compiler should automatically declare two constant names for the bounds, such as __min and __max, and possibly the cardinality of the set, such as __prec for the minimum precision (given in one of the basic numeric types, including long long or long double) as the estimate of the base-2 logarithm of the number of distinct values between these bounds, and __size (or just sizeof) for the actual precision stored (these precisions will be given in bits so that __prec <= __size, and (2^__size) is the value of "N" for the default ''modulo N'' catcher synthetically generated).

So for

typedef enum {'A', 'Z', 'A'} capital_t;,

we would have:

capital_t::__min == 'A' (which is a constant part of the declared type),
capital_t::__max == 'Z' (which is a constant part of the declared type),
__prec<capital_t> == math.log2(__max - __min + 1) (which is a constant in a basic floating point numeric type, roughly equal to 4.75488750216 here)
__size(capital_t) == 5, (which is a constant in a basic integer numeric type);
sizeof(capital_t) == 1 (assuming that a single "char" can hold all 5 bits needed to store distinct constants from 'A' to 'Z' and that sizeof(char) == 1 which generally means at least 8 bits;

As well we would have:

anychar_t::__min == (signed char)::min (which is a constant part of the declared type, generally -128),
anychar_t::__max == (unsigned char)::max (which is a constant part of the declared type, generally 255),
__prec<anychar_t> == math.log2(__max - __min + 1) (which is a constant in a basic floating point numeric type, generally roughly equal to 8.5849625007211561 here)
__size<anychar_t> == 9, (which is a constant in a basic integer numeric type, but this could be equal to 16 instead of 9);
sizeof(anychar_t) == 2 (assuming that sizeof(char)=1)

Note that __prec is given as a logarithm instead of giving the real __cardinality directly (the value of ''N'' described above), because the cardinality of the set may not be expressible, for all numeric types such as "long long" and "long double", as a constant of one of the basic numeric types without causing an overflow (notably for "long double", where __prec=80, ''N'' could be N=2^80 and its inverse exceeds the actual epsilon separating non-infinite and non-NaN values).

Other numeric type properties could also be inferred as additional constants (not values in the declared type itself), such as the number of distinct NaN values, the number of distinct infinite values, the number of distinct zero values, the number of distinct denormal values, and a type constant giving the inferred native numeric type:

__type<anychar_t> == short (if __size<anychar_t> == 9 or 16)
__type<enum{'a','z'}> == char (if __size<enum{'a','z'}> == 8)

Also, __prec does not directly determine the step that allows enumerating all distinct values in the defined type (e.g., for integers you can enumerate them by adding 1, but for floating point the additive step depends on the magnitude of each enumerated value, and there are special steps to enumerate negative and positive zeroes, denormal values, signaling and non-signaling NaNs, or positive and negative infinite values).

Also for this reason, the compiler should automatically declare default forward and default backward enumerators for the declared enum type, which you can instantiate from any enum value and then call to get the previous or next distinct value.

With all these, we no longer need any preprocessor defines to know the limits of any type (not even native numeric types). All numeric types, including native ones are declared explicitly as enum types in <stdtype>, so macros defined in <limits.h> are deprecated.

We can also view enums like a typesafe version of unions and also allow declaring an enum like this:

typedef enum{(value1), (value2), (value3), (value4)} generictype_t;

The idea here is not to define constants, but to create a type that can hold any of the sample values listed, values that are comparable (so that we can define a full order between them and know if they are equal), without having to list all possible values. For example:

typedef enum{100, 200, "x", "y"} generictype_t;

It is theoretically possible to create a type storing integers or strings if we also have a full order between them: here this is a type that will include either integers between 100 and 200 (these can only be these two), or strings between "x" and "y" (so including also "x0", "x1", "x11", "xy", "xyz"...). (Note: the set has no cardinality; we know the number of possible integers, but not the number of strings; we can only know the number of possible distinct pointers/references according to the limits of pointers, i.e., the pointer size in bits.)

The compiler will automatically infer a distinctive tag value when necessary and store that tag value if there's more than one tag. In this example, a tag=0 will be used for integers (0 to 1) and tag=1 will be used for strings (between "x" and "y").

It will generate the set of tags automatically using synthetic constructors like this:

generictype_t(int) : tag(0) {};
generictype_t(string) : tag(1) {};

(these constructors only specify the distinctive tag, not the value which is assigned automatically).

You assign a declared variable of that type normally, without having to specify the tag:
generictype_t x = 102;
and you can then query the tag of any value in that typed variable:
int t = tag<x>; (sets t = 0)
For this the declared type automatically builds a synthetic static method for that enum type...

Then you can also specify tags explicitly in the declaration of the enum (if two distinct values declared in the set have the same tag, they will be stored as a union with no way to distinguish them by the tag, only by their distinct values):

typedef enum{ 100: 0, 200: 0, 300: 1, 400: 2} t;

(this enum contains values between 100 to 200, or equal to 300, or to 400, in three subsets with tags 0, 1, or 2)

Each subset, i.e. each distinct tag value, has its own minimum and maximum bounds, its own size in bits, its own cardinality. The compiler has now 3 declared tags, and the set of tags is also an enum type declared implicitly.

This allows replacing unsafe type declarations like:

typedef enum {int_tag, double_tag, string_tag} tag_t;
typedef struct {
  tag_t tag;
  union {
    int int_val;
    double double_val;
    string string_val;
  };
} variant_t;


by:

typedef enum {<int>, 10, <double>, <string>} variant_t;

(the tag values are assigned automatically by the compiler: tag=0 for int values, tag=1 for double values, tag=2 for string values; here, instead of specifying exemplar constant values of each type, we just cite their typenames between angle brackets, but even if we add exemplar values like 10 in this example, as it matches the <int> type also declared, it does not add another tag value and the compiler can discard it; the order of declaration of members of the enum is significant if they are different types).

We could also declare the tag values ourselves:

typedef enum {<int>: 'I', <double>: 'D' , <string>: 'S'} variant_t;

and the distinctive tag values will have a char datatype. The enum declaration does not create a new type for tags; if needed types for tag values can be declared separately:

typedef enum {'I', 'D', 'S'} tag_t;
typedef enum: tag_t {<int>, <double>, <string>} variant_t;

(here the compiler assigns tags with values taken by enumerating the given "tag_t" type, instead of enumerating "int" by default).

or by using declared constant names given in the tag type:

typedef enum {Int: 'I', Double: 'D', String: 'S'} tag_t;
typedef enum {<int>: (tag_t::Int), <double>: (tag_t::Double), <string>: (tag_t::String)} variant_t;

Here the tags are given a char datatype, but it could also be a string, giving its distinctive name or description:

typedef enum {<int>: "this is an integer", <double>: "this is a floating point number", <string>: "this is a name"} variant_t;

When the tag type given is a string, it can be used by the synthetic default toString() method when showing the actual value like this:

variant_t::toString() {
return new string( tag<*this>, ':', value<*this>.toString() );
}


We can also select one of the tag subtypes:

variant_t<int> (because <int> is a member type declared in this enum)

to create explicit type conversion (typecast) of the value with some defined method if needed (the effect of that explicit method will be to generate a new enum value with the new tag value, for example converting an enum value with value type <double> into another enum value of the same enum type but with value type <int>).

The compiler can make a lot of type-safe inferences and generate the optimal storage, reducing the number of bits needed for storing each tag (or not storing it at all if the declared enum has only one tag) if we don't specify a specific type for the tag ourselves. In all cases, it will build the synthetic code for the static property tag<variant_t> itself...

No more need for any unsafe unions, including with complex datatypes within unions; no more need to name each member of the union; type inference determines the correct member and sets the tag value properly and implicitly when we set the actual value of an enum variable!

We can even imagine a language that predefines absolutely NO native datatype, all datatypes being declared by an enum declaration (starting by defining them '''only''' with constants supported by the language parser, like: false, true, nil, 10, 3.14, 1.23e45, 'A', "AAAA"...).

We can also imagine some new kinds of "tagged constants" recognized by the parser, like: 0t12.2'i' == 0t12.2('h'+1) to represent an imaginary number represented by a constant <double> value tagged by a <char> value, this constant having data type "double<'I'>" here, itself a subtype of "double<char>"; or 0t0x0a'i' == 0t10'i', which is a constant of type "int<'i'>", itself a subtype of "int<char>"...

Another alternative but equivalent syntax for tagged constants would be <'i'>12.2 == <'h'+1>12.2. Untagged constants like 12.2 are equivalent to <0>12.2 (the default tag constant is 0, the default tag type is an int, enumerated by default by a forward iterator starting from 0 with increment 1; this default enumerator is used when enum members are declared without a tag and a new tag is needed because they are not the same base type):

- enum{ a, b, c } is then equivalent to enum:int{<0>a, <0>b, <0>c} (these <0> tags don't need to be stored, they are implicit, not significant)
- enum{ <int>, <double>, <string> } is then equivalent to enum:int{ <0><int>, <1><double>, <2><string> } (these 3 distinct <0>, <1>, <2> tags are needed because we use types, and not constants, as members of the declared enum, even if every <int> instance can be compared as equal to an existing <double>, something that cannot be asserted for all of them, for example when <int> requires a 64-bit value, and <double> also requires 64-bit but not for the same precision and value range, so each <int> member will be stored differently from each <double> member, and a distinct tag value is needed; here also the order of declaration of members in the enum type is significant when value types are different; here also you can have constructors for the enum type, as well as catchers...). Here also the tag value can be a constant expression.

Every native numeric type, and every object like strings or arrays, or structs, classes, pointers, references, functions/methods, can also be a type member of an enum, and be used as a tag type. All native types can be declared in the language itself (so there is no more need to reserve keywords like "bool", "char", "short", "int", "long", "float", "double": they can all be declared using a typedef as an enum (or enum + catchers), with their minimum, maximum, precision, rounding modes, and other predefined named constants of these types, with these constant names scoped in their defining type). We now have fully defined semantics for all arithmetic operations, orderings, and comparisons. The preprocessor is no longer required at all (except possibly for #include, which may instead be better replaced by "require(package)").


----

Being able to define a type-safe arithmetic for enum types (at least the arithmetic giving the successor, i.e. constant+1), allows defining useful objects, notably iterators (that we could really name "enumerators") over the value range of enum types, which would in turn permit object-oriented constructs like:

for (i: enumerator<enum_t>) { ... }

which won't forget to handle any possible value of an enum type (so it won't generate bugs at runtime, like those occurring when using switch statements with a missing "default:" selector: the compiler would know whether the "default:" is missing according to the type of "i").

And the following declaration is also unsafe, if "a" is assigned by the compiler the integer value 0 (without taking into account the single constraint given to "b" which could instead be used by the compiler to assert that a=-1 and c=1):

enum{a, b=0, c}

These tricks inherited in C# from C and C++ are really bad, these generate unchecked conditions and unexpected bugs with possible overflows, silently generating values that are not part of the defined set.

For now, the only interest of enum types in C/C++/C# is not to restrict the set of values for strict type safety, but just:

- to define constants with possibly scoped names (qualifiable with the typename) and not depending on a preprocessor (whose scoping rules are only global, and severely depend on #included source reading order).

- to use the appropriate integer type (with the minimum bit-size) to store a single enum value.

But I consider that bitfields (using notations like ":1" in C/C++ declarations of structures) are much safer: at least we know precisely their value range, there are no aliases at all, and the arithmetic is precisely defined.

modified 17-Nov-18 17:33pm.

Praise: Nice Article (MrFunke3.14, 16-Nov-18 4:06)
General: An educational exercise (SirGrowns, 15-Nov-18 21:19)
General: My vote of 5 (Donmorcombe, 15-Nov-18 14:22)
General: Side effect of ToString (Member 8128073, 15-Nov-18 12:20)
Question: Does it matter (TrendyTim, 15-Nov-18 12:11)
General: My vote of 5 (dmjm-h, 15-Nov-18 11:35)
General: I would consider an enum like you're proposing to be a 'code smell' (Will Wayne, 15-Nov-18 11:09)
General: Re: I would consider an enum like you're proposing to be a 'code smell' (Philippe Verdy, 17-Nov-18 6:21)
Question: Answer in the form of a question... (Member 4603457, 15-Nov-18 10:41)
Suggestion: No dupes? (vbjay.net, 15-Nov-18 9:28)
General: Re: No dupes? (Member 13723669, 18-Nov-18 22:02)
General: Re: No dupes? (Heriberto Lugo, 9-Feb-19 20:46)
General: Re: No dupes? (Philippe Verdy, 1-Mar-19 9:57)
Question: That is bizarre (Marc Clifton, 15-Nov-18 0:28)
