Author Topic: Enum Objects  (Read 4195 times)

daniel.santos

  • Guest
Enum Objects
« on: 27 September 2009, 23:58:38 »
This is a continuation of a discussion between silnarm & myself from IRC.  I don't want to spend long because I want to finish up the merge aftermath. =)

First, I believe it may be possible to have Enum objects that work just like classic enums using template meta-programming.  I'm not skilled enough at this point to know for certain or to implement it.  Also, I don't know the status of MSVC++ 10 (i.e., 2008 Express) on how well TMP is supported and using TMP will restrict the number of compilers that can compile GAE.  The way I think this would work, is to add an operator int() that returns a value generated via TMP, so that it is available at compile-time.  This should allow it to be used in switch statements.  But I think there's a lot of power in switch statements and the ability for the compiler to warn about not having cases for all values, so I'm not certain I want to proceed all the way to object Enums without that.

Either way, we can leverage what I've already written to create base class for all such enums that will be able to:
  • Return a string representation of any enum value
  • Convert a string to an enum value (we do this a lot for reading in XML).

silnarm

  • Local Moderator
  • Behemoth
  • ********
  • Posts: 1,373
    • View Profile
Re: Enum Objects
« Reply #1 on: 29 September 2009, 23:20:05 »
operator int () !!! Brilliant!

Looks like we wont need to do any TMP, I tried out this:
Code: [Select]
// wrapped enum
#define WRAP_ENUM( Name, ... ) \
struct Name { \
enum Enum { __VA_ARGS__ }; \
Name( Enum val ) : value( val ) {} \
operator int() { return value; } \
private: \
Enum value; \
};

and it compiles on VC and it gives us just what we want!  ;D
I see no reason why it shouldn't work with gcc, if so... I think we can move on to lumping it in with STRINGY_ENUM  :o

Edit:
I should clarify what I'm after, in particular the above is not type-safe, but that doesn't bother me personally.
Also, it doesn't give a warning for missing cases, personally this doesn't particularly bother me either, as if the intent is to cover all cases, then you should default: throw runtime_error("Class::Func() passed unknown value");
The things I was going for:
1. Forced Scoping.  So that the names are appropriately scoped and we can drop the prefix on values.
2. Otherwise behave just like a regular enum.
3. Be as lightweight as possible, so that all the excess code should get optimised out of existence.
« Last Edit: 30 September 2009, 00:07:22 by silnarm »
Glest Advanced Engine - Code Monkey

Timeline | Downloads

silnarm

  • Local Moderator
  • Behemoth
  • ********
  • Posts: 1,373
    • View Profile
Re: Enum Objects
« Reply #2 on: 12 October 2009, 03:38:45 »
Ok... I've been busy.

I've integrated the (modified) WRAP_ENUM and STRINGY_ENUM.

We have type safety, and we can match a string and get an enum back :)

Code: [Select]
#define ENUMERATED_TYPE(Name,...)                       \
    struct Name {                                       \
        enum Enum { __VA_ARGS__, COUNT };               \
        Name(Enum val) : value(val) {}                  \
        operator Enum() { return value; }               \
    private:                                            \
        Enum value;                                     \
    };                                                  \
    STRINGY_ENUM_NAMES(Name, Name::COUNT, __VA_ARGS__);

The only difference to the WRAP_ENUMM is no int conversion operator, I'm only supplying a conversion operator to the actual enum (Enum).  That gives us type safety, and Enum is public anyway, so if you really need to cast to/from an int you can still do so, with Name::Enum.

In contrast to the STRINGY_ENUM macro, it doesn't take the 'count' value, now that all enum values are scoped, they will all just be COUNT, so this is added automagically.

The STRINGY_ENUM_NAMES macro has been modified a little, it takes the enum name as parameter now, and mangles it itself, I did this so I could pass the enum name on to the EnumNames constructor to get the 'namespace' printed with the values (optional).

Code: [Select]
#ifdef CONSTANTS_DEF
# define STRINGY_ENUM_NAMES(name, count, ...) EnumNames name##Names(#__VA_ARGS__, count, true, false, #name)
#else
# define STRINGY_ENUM_NAMES(name, count, ...) extern EnumNames name##Names
#endif

The EnumNames class I have only tweaked a bit, getName() was renamed operator[], and I added some bits for the optional value prefixes. And the coup de gras, match<>().  It's a little bit clumsy to use, but making it act 'properly' would mean templating the class, then we get a class for every enum... This way we only template the function, so we only get a proliferation of templated functions, and only for those that are actually used for matching strings from XML.
So it looks like this,
Fruit myFruit = FruitNames.match<Fruit>(myString);

The version below is still a bit messy, the scoped strings could just replace the valueList and names, I'll clean it up a bit more before I commit.

Here is the compete test project.
Code: [Select]
// constants.h
#pragma once

#include <string>
using namespace std;

#define ENUMERATED_TYPE(Name,...)                       \
    struct Name {                                       \
        enum Enum { __VA_ARGS__, COUNT };               \
        Name(Enum val) : value(val) {}                  \
        operator Enum() { return value; }               \
    private:                                            \
        Enum value;                                     \
    };                                                  \
    STRINGY_ENUM_NAMES(Name, Name::COUNT, __VA_ARGS__);

// =====================================================
// class EnumNames
// =====================================================

/** A utility class to provide a text description for enum values. */
class EnumNames {
#ifdef NO_ENUM_NAMES
public:
    EnumNames(const char *valueList, size_t count, bool copyStrings, bool lazy) {}
    const char *operator[](int i) {return "";}
template<typename EnumType> EnumType match(string &value) { return EnumType::COUNT; }
#else
private:
const char *valueList;
const char **names;
const char *qualifiedList;
const char **qualifiedNames;
size_t count;
bool copyStrings;

public:
/** Primary ctor. As it turns out, not specifying copyStrings as true on most modern system will
* result in some form of access violation (due to attempting to write to read-only memory. */
EnumNames(const char *valueList, size_t count, bool copyStrings, bool lazy, const char *enumName = NULL);
    ~EnumNames();
    const char *operator[](unsigned i) const {
if(!names) {
const_cast<EnumNames*>(this)->init();
}
return i < count ? qualifiedNames ? qualifiedNames[i] : names[i] : "invalid value";
}
/** Matches a string value to an enum value  */
template<typename EnumType>
EnumType match(const char *value) const {
if ( !names ) {
const_cast<EnumNames*>(this)->init();
}
for ( unsigned int i=0; i < count; ++i ) {
const char *ptr1 = names[i];
const char *ptr2 = value;
bool same = true;
if ( !*ptr1 || !*ptr2 ) {
continue;
}
while ( *ptr1 && *ptr2 ) {
if ( isalpha(*ptr1) && isalpha(*ptr2) ) {
if ( tolower(*ptr1) != tolower(*ptr2) ) {
same = false;
break;
}
} else if ( ! ( *ptr1 == '_' && (*ptr2 == ' ' || *ptr2 == '_') ) ) {
same = false;
break;
}
ptr1++;
ptr2++;
if ( ( !*ptr1 && *ptr2 && !isspace(*ptr2) ) || ( *ptr1 && !*ptr2 ) ) {
same = false;
}
}
if ( same ) {
return (EnumType::Enum)i;
}
}
return EnumType::COUNT;
}

private:
void init();
#endif
};

#ifdef CONSTANTS_DEF
# define STRINGY_ENUM_NAMES(name, count, ...) EnumNames name##Names(#__VA_ARGS__, count, true, false, #name)
#else
# define STRINGY_ENUM_NAMES(name, count, ...) extern EnumNames name##Names
#endif

ENUMERATED_TYPE( Vegetable, CARROT, BEAN, PEA, CAPSICUM, RADISH, CAULIFLOWER, BROCCOLI )
ENUMERATED_TYPE( Fruit, APPLE, BANANA, PEAR, PEACH, APRICOT, PLUM, PINE_APPLE )

Code: [Select]
// constants.cpp

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <assert.h>

#define CONSTANTS_DEF
#include "constants.h"

// =====================================================
// class EnumNames
// =====================================================

#ifndef NO_ENUM_NAMES
EnumNames::EnumNames(const char *valueList, size_t count, bool copyStrings, bool lazy, const char *enumName)
: valueList(valueList)
, qualifiedList(NULL)
, qualifiedNames(NULL)
, names(NULL)
, count(count)
, copyStrings(copyStrings) {
if(!lazy) {
init();
}
if ( enumName ) {
if ( lazy ) {
throw runtime_error("Qualified names and Lazy loading not simultaneously supported.");
}
qualifiedList = new const char[(strlen(enumName) + 2) * count + strlen(valueList) + 1];
qualifiedNames = new const char*[count];
char *tmp = strcpy(new char[strlen(valueList) + 1], valueList);
char *tok = strtok(tmp,",");
char *ptr = const_cast<char*>(qualifiedList);
int tokens = 0;
while ( tok ) {
qualifiedNames[tokens] = ptr;
tokens++;
while ( isspace(*tok) ) tok++;
ptr += sprintf(ptr, "%s::%s", enumName, tok);
*ptr++ = 0;
tok = strtok(NULL, ",");
}
delete [] tmp;
}
}

EnumNames::~EnumNames() {
if(names) {
delete[] names;
if(copyStrings) {
delete[] valueList;
}
}
if (qualifiedNames ) {
delete [] qualifiedNames;
delete [] qualifiedList;
}
}

void EnumNames::init() {
size_t curName = 0;
bool needStart = true;

assert(!names);
names = new const char *[count];
if(copyStrings) {
valueList = strcpy(new char[strlen(valueList) + 1], valueList);
}

for(char *p = const_cast<char*>(valueList); *p; ++p) {
if(isspace(*p)) { // I don't want to have any leading or trailing whitespace
*p = 0;
} else if(needStart) {
// do some basic sanity checking, even though the compiler should catch any such errors
assert(isalpha(*p) || *p == '_');
assert(curName < count);
names[curName++] = p;
needStart = false;
} else if(*p == ',') {
assert(!needStart);
needStart = true;
*p = 0;
}
    }
    assert(curName == count);
}
#endif

Code: [Select]
// main.cpp

#include <stdlib.h>
#include <time.h>

#include <iostream>

#include "constants.h"

Fruit getFruit() {
return (Fruit::Enum)(rand() % Fruit::COUNT);
}

void main( void ) {
const int numFruit = 5;
srand( (unsigned int)time( NULL ) );
int apples, bananas, pears, others;
apples = bananas = pears = others = 0;
for ( int i = numFruit; i; --i ) {
Fruit f = getFruit();
switch ( f ) {
case Fruit::APPLE: apples++; break;
case Fruit::BANANA: bananas++; break;
case Fruit::PEAR: pears++; break;
default: others++;
}
printf("Got a %s\n", FruitNames[f] );
}
printf( "Went to the market, got %d pretend apples, %d bananas, %d pears and %d other items\n",
apples, bananas, pears, others );
while ( true ) {
printf("\nInput fruit: ");
static char buffer[256];
cin.getline (buffer,256);

Fruit myFruit = FruitNames.match<Fruit>(buffer);
Vegetable myVegetable = VegetableNames.match<Vegetable>(buffer);

if ( myFruit != Fruit::COUNT ) {
printf("You selected %s\n", FruitNames[myFruit] );
}
else {
if ( myVegetable == Vegetable::COUNT ) {
printf("Your fruit could not be matched");
} else {
printf("That's no fruit, matched %s\n", VegetableNames[myVegetable] );
}
}
}
}

And some output :)
Code: [Select]
Got a Fruit::PEACH
Got a Fruit::APPLE
Got a Fruit::PINE_APPLE
Got a Fruit::BANANA
Got a Fruit::APPLE
Went to the market, got 2 pretend apples, 1 bananas, 0 pears and 2 other items

Input fruit: apple
You selected Fruit::APPLE

Input fruit: pine APPle
You selected Fruit::PINE_APPLE

Input fruit: broccoli
That's no fruit, matched Vegetable::BROCCOLI

Glest Advanced Engine - Code Monkey

Timeline | Downloads

silnarm

  • Local Moderator
  • Behemoth
  • ********
  • Posts: 1,373
    • View Profile
Re: Enum Objects
« Reply #3 on: 12 October 2009, 15:56:54 »
Excuse my triple post...

Slight amendment, we also need a const conversion operator, to keep all users happy,
operator Enum() const { return value; }

Finishing up. XmlBasedFlags::load()

So to finish up, we just need to modify XmlBasedFlags to use the EnumNames objects. Instead of load() taking as final parameter an array of character arrays, it accepts a reference to an EnumNames. Then the guts of the load method is replaced with this,
Code: [Select]
if (nodeName == childNodeName) {
flagName = node->getAttribute("value")->getRestrictedValue();
} else {
flagName = nodeName;
}
E result = flagNames.match<E>(flagName.c_str());
if ( result == E::COUNT ) {
throw runtime_error(string() + "Invalid " + childNodeName + ": " + flagName + ": " + dir);
} else {
this->set(result, true);
}

Applied and working in the path_finder branch  ;D

Glest Advanced Engine - Code Monkey

Timeline | Downloads

daniel.santos

  • Guest
Re: Enum Objects
« Reply #4 on: 13 October 2009, 21:28:20 »
Hell yes!!  I haven't read through this whole thing yet because I have my brain occupied with other stuff right now, but I'll probably be cherry-picking this into the network branch in the new few days.  Very awesome! :)

daniel.santos

  • Guest
Re: Enum Objects
« Reply #5 on: 15 October 2009, 09:02:30 »
OK, I've gotten a chance to analyze this closely, brilliant! :)  I like the implicit cast from the class (struct, whatever, as I'm sure you know, in C++ a struct is just a class who's members default to public) type to the enum type (the operator Enum()) and then the implicit cast from the enum type back to the class with the constructor.  This provides type safety and switch-compatibility, while preventing accidental type conversions (because an int type wont implicitly convert to an enum type, but it will go the other way).

You might want to remove copyStrings from EnumNames (and make it always copy).  A long time ago, I frequently took the liberty of changing data you aren't supposed to, like:
Code: [Select]
char s = "jack";
s[3] = 0;
This was probably on DOS, and win3x/9x, which let me do it (maybe even NT 3.x and 4).  It seems that all modern operating systems puke on you if you try this now, so there's no sense in having it as an option.

Finally, once this is further refined & tweaked, it may be worthy of a write-up or even presentation to the boost project or some such (I hope I'm not in fantasy land here :) ).

silnarm

  • Local Moderator
  • Behemoth
  • ********
  • Posts: 1,373
    • View Profile
Re: Enum Objects
« Reply #6 on: 16 October 2009, 03:45:44 »
Finally, once this is further refined & tweaked, it may be worthy of a write-up or even presentation to the boost project or some such (I hope I'm not in fantasy land here :) ).

Not a bad idea, we might want to check out the competition first ;) I did google about a bit before deciding this was a worthy pursuit, and didn't like much of what I found, but I haven't checked the Boost vault.

I've made some further refinements, a default constructor was lacking, I'm initialising to COUNT, but I've kind of given COUNT a double meaning by using it as an 'invalid' marker, perhaps we should automagically add INVALID=-1 as well?

The casting problem is not a problem anymore, I just needed to think in a more C++ way, just a few lines of code did the trick,
Code: [Select]
template<typename E>
E enum_cast(unsigned i) {
return i < E::COUNT ? (E::Enum)i : E::COUNT;
}

Which maintains type safety and allows for unsafe conversions in a safe manner :)

Code: [Select]
Fruit f;
//f = Vegetable::PEA; // error, incompatible types
f = enum_cast<Fruit>(Vegetable::PEA); // ok, f == Fruit::PEAR
printf("Fruit f = %s\n", FruitNames[f]);

//f = 3; // error, incompatible types
f = enum_cast<Fruit>(3); // Ok, f == Fruit::PEACH
printf("Fruit f = %s\n", FruitNames[f]);

f = enum_cast<Fruit>(7); // Ok, but f == Fruit::COUNT because 7 >= COUNT

I also decided to explicitly have or not have the EnumNames attached, the 'wrapped' enum I'm calling WRAPPED_ENUM (for now, atleast :) ), which can be used if you don't want string support, and I've revived STRINGY_ENUM as,
Code: [Select]
#define STRINGY_ENUM(Name,...) \
WRAPPED_ENUM(Name,__VA_ARGS__) \
STRINGY_ENUM_NAMES(Name, Name::COUNT, __VA_ARGS__);

EnumNames is now templated on the enum object, so the usage is much nicer, Fruit f = FruitNames.match("pear");  We're not using many for matching flags from XML, so the amount that need to be STRINGY isn't great, and for debugging, we can just make all WRAPPED enums STRINGY as well.

So, a couple of outstanding issues I can think of:
1. add INVALID to all enums ?
2. capitalisation and spacing policy for displayed names ?

NB: haven't committed any changes yet, I have the NodePool object in pieces, hopefully will have it back together soon, and will commit then :)
Glest Advanced Engine - Code Monkey

Timeline | Downloads

daniel.santos

  • Guest
Re: Enum Objects
« Reply #7 on: 19 October 2009, 20:33:36 »
I've made some further refinements, a default constructor was lacking, I'm initialising to COUNT, but I've kind of given COUNT a double meaning by using it as an 'invalid' marker, perhaps we should automagically add INVALID=-1 as well?
Not a bad idea.

Code: [Select]
template<typename E>
E enum_cast(unsigned i) {
return i < E::COUNT ? (E::Enum)i : E::COUNT;
}
gcc didn't like this because it couldn't tell if E::Enum was a type name or a symbol.  It works like this:
Code: [Select]
template<typename E>
E enum_cast(unsigned i) {
    return i < E::COUNT ? static_cast<typename E::Enum>(i) : E::COUNT;
}

I also decided to explicitly have or not have the EnumNames attached, the 'wrapped' enum I'm calling WRAPPED_ENUM (for now, atleast :) ), which can be used if you don't want string support, and I've revived STRINGY_ENUM as,
Code: [Select]
#define STRINGY_ENUM(Name,...) \
WRAPPED_ENUM(Name,__VA_ARGS__) \
STRINGY_ENUM_NAMES(Name, Name::COUNT, __VA_ARGS__);
Good idea! :)

EnumNames is now templated on the enum object, so the usage is much nicer, Fruit f = FruitNames.match("pear");  We're not using many for matching flags from XML, so the amount that need to be STRINGY isn't great, and for debugging, we can just make all WRAPPED enums STRINGY as well.
Yea, but now we'll generate a new copy of the code for each type, and that's bad.  The way to pretty-ify this is to do everything in unsigned int in EnumNames and then have a derived templatized class that does the typing:
Code: [Select]
class EnumNamesBase {
private:
    const char *valueList;
    const char **names;
    const char *qualifiedList;
    const char **qualifiedNames;
    size_t count;

public:
    EnumNamesBase(const char *valueList, size_t count, bool lazy, const char *enumName = NULL); // implemented in .cpp file
    ~EnumNamesBase(); // implemented in .cpp file

protected:
    const char *get(unsigned int i, size_t count) const { // using count as local instead of data member to compile out memory access (as long as a constant is passed for count)
        if(!names) {
            const_cast<EnumNames*>(this)->init();
        }
        return i < count ? qualifiedNames ? qualifiedNames[i] : names[i] : "invalid value";
    }

    unsigned int _match(const char *value) const; // larger function, so implemented in .cpp file

private:
    init(); // implemented in .cpp file
};

template<typename E>
class EnumNames : public EnumNamesBase {
public:
    EnumNames(const char *valueList, size_t count, bool lazy, const char *enumName = NULL) : EnumNames(valueList, count, lazy, enumName) {}
    ~EnumNames() {}

    const char *operator[](E e) const {return get(e, E::COUNT);} // passing E::COUNT here will inline the value in EnumNamesBase::get()
    E match(const char *value) const {return static_cast<E>(_match(value));} // this will inline a function call to the fairly large _match() function
};

So, a couple of outstanding issues I can think of:
1. add INVALID to all enums ?
2. capitalisation and spacing policy for displayed names ?

Will have to get back with you on the rest of these later, but gcc isn't doing the implicit conversion using the operator Enum() function for some reason. :(

daniel.santos

  • Guest
Re: Enum Objects
« Reply #8 on: 22 October 2009, 21:06:34 »
Follow up on these two items:
1. add INVALID to all enums ?
2. capitalisation and spacing policy for displayed names ?

1. Yes, I think an INVALID value on all enums would be good.  Something like static_cast<unsigned int>(-1)?  By the way, when I 1st saw Martiño's code, I thought all of this "static_cast" crap was a bunch of verbose source bloat.  I've since learned that the C++ cast operators have a lot of advantages over the C-style cast operators, mainly safety-related.  So it's probably best practice to use them instead.

2. Hmm,  I think that in general UPPER_CASE_UNDERSCORE_SEPARATOR is best practice.  But in XML files, the standard seems to be lower-case-hyphenated.  So maybe we can have a function to match each?  Actually, I saw that your function will manage both, so I'm thinking that's good enough.

PS: Wait, are C/C++ enums unsigned int by default?

hailstone

  • Local Moderator
  • Battle Machine
  • ********
  • Posts: 1,568
    • View Profile
Re: Enum Objects
« Reply #9 on: 22 October 2009, 22:01:27 »
According to Bjarne Stroustrup, "the range of enum is [0:2^k-1] where 2^k is the smallest power of two for which all enumerators are within the range. If there are negative enums, the range is [-2^k:2^k-1]. This defines the smalles bit-field capable of holding the enumerator values using the conventional two's complement representation". (The C++Programming Language 3rd ed.)
Glest Advanced Engine - Admin/Programmer
https://sourceforge.net/projects/glestae/

silnarm

  • Local Moderator
  • Behemoth
  • ********
  • Posts: 1,373
    • View Profile
Re: Enum Objects
« Reply #10 on: 30 October 2009, 23:05:19 »
gcc didn't like this because it couldn't tell if E::Enum was a type name or a symbol.  It works like this:
Code: [Select]
template<typename E>
E enum_cast(unsigned i) {
    return i < E::COUNT ? static_cast<typename E::Enum>(i) : E::COUNT;
}

Wierd, it shouldn't have been trying to compile the templated function until it was instantiated with a type, at which point the meaning should be obvious... but anyway, as long as it works :)

Quote from: daniel.santos
Yea, but now we'll generate a new copy of the code for each type, and that's bad.  The way to pretty-ify this is to do everything in unsigned int in EnumNames and then have a derived templatized class that does the typing:

Much better! Nice one, I was somewhat concerned over that, but figured by only making those that needed to be 'stringy' have an EnumNames class it wouldn't be that bad.  This is much nicer.

PS: Wait, are C/C++ enums unsigned int by default?

Good question, but I don't think static_cast<unsigned int>(-1) makes much sense anyway ;)  Unless you wanted to use 4294967295 as the invalid marker (2^32-1) [assuming 32 bit ints].  Let's just change all out unsigned ints into signed ones :)

Arrays are indexed lists from 0 to n. You are not supposed to run over an array using for..in. If you want a named array/hash/cake use a plain object.

Thanks for the bump spambot!
Glest Advanced Engine - Code Monkey

Timeline | Downloads