Extending C++ Enums

published: 30 November 2018

This article is a bit of an experiment. It describes my thought process while developing a small C++ feature, including failed attempts and all. Of course it’s cleaned up somewhat. The unedited reality would be a bit too messy to make a coherent article.

So … Let’s talk about enums in C++. They are a helpful, easy-to-use little feature of the language, especially since enum classes added more type safety and scoped names back in C++11. For the rest of the article I’ll use the following enum as an example:

enum class Season : int
{
    spring = 0,
    summer = 1,
    autumn = 2,
    winter = 3,
};

Most of the time I’m perfectly fine with what enums can do – until I need to serialize them or display a human-readable string for an enumerator in a UI. It’s painful because you can only query one piece of information about an enum at compile time: its underlying integral type. The number of enumerators? Their order? Their values? Nope. None of that.
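For reference, this is what that single query looks like. It is only a minimal sketch; the names SeasonUnderlying and autumn_value are made up for the illustration.

```cpp
#include <type_traits>

enum class Season : int
{
    spring = 0,
    summer = 1,
    autumn = 2,
    winter = 3,
};

// The underlying type is the one property the language lets us query.
// SeasonUnderlying is a hypothetical alias, not part of the article's API.
using SeasonUnderlying = std::underlying_type_t<Season>;
static_assert(std::is_same_v<SeasonUnderlying, int>);

// Getting at an enumerator's value requires an explicit cast.
constexpr SeasonUnderlying autumn_value = static_cast<SeasonUnderlying>(Season::autumn);
static_assert(autumn_value == 2);
```

Everything else (count, order, names) has to be supplied by hand or by generated code.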

The Sad State of the Art

A common solution is to encode some metadata into the enum itself. For example:

enum class Season : int
{
    min = 0,
    spring = min,
    summer,
    autumn,
    winter,
    max = winter,
    count = max + 1,
};

Now you can, for example, create arrays with count entries and at least somewhat rely on the compiler to warn about inconsistencies. Maintainability improves from nightmarish to somewhat okay. You are restricted to zero-based, consecutive enumerator values, but you have to be really sloppy to mess up when adding or deleting enumerators. While that’s certainly a good thing, lumping in those min, max and count things with the enumerators is a violation of the type system. It’s wrong and creates problems of its own.
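A small sketch of such a count-sized array. The table season_names and the helper season_name are assumptions made up for this example.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

enum class Season : int
{
    min = 0,
    spring = min,
    summer,
    autumn,
    winter,
    max = winter,
    count = max + 1,
};

// One entry per real enumerator; the size comes from the metadata enumerator.
constexpr std::array<std::string_view, static_cast<std::size_t>(Season::count)> season_names{
        "spring", "summer", "autumn", "winter"};

// Hypothetical helper: uses the enumerator value as an index into the table.
// Only meaningful for the four real seasons.
constexpr std::string_view season_name(Season s)
{
    return season_names[static_cast<std::size_t>(s)];
}

static_assert(season_names.size() == 4);
static_assert(season_name(Season::autumn) == "autumn");
```

Adding a fifth name without a fifth enumerator no longer compiles. The opposite mistake, adding an enumerator and forgetting the name, silently leaves an empty entry in the array, which is why the safety net is only partial.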

Essentially, an enum is a set of distinct constant values that share a semantic meaning. count et al. are not part of this set. In particular, they do not share the semantic meaning – in our example being a time of the year. Rather they are metadata that describes an enum type, and that’s how they should be modelled.

You might say: Alright, that may be true from a type system lawyer’s perspective. But if we’re a bit more pragmatic, modelling metadata as enumerators doesn’t really hurt, does it? Yes, it does. Consider this function:

Temperature calc_average_temp(Season s);

It’s true that min and max don’t hurt since they’re equivalent to real enumerators. Not so for count: the compiler accepts it as a Season, but it isn’t one. Suddenly a function that should be able to rely on every input value being valid no longer can. It’s not a brutal disadvantage, but still a potential source of errors. Consider the code bases you’re familiar with. How many functions do they have that take enum arguments? How many return enums? All of them are opportunities for screwing up.
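To make the problem tangible, here is a sketch of what such a function ends up looking like. Temperature is a stand-in alias, and the temperature values and the no_temperature marker are invented for the example.

```cpp
enum class Season : int
{
    min = 0,
    spring = min,
    summer,
    autumn,
    winter,
    max = winter,
    count = max + 1,
};

// Stand-in for the article's Temperature type (assumption).
using Temperature = double;

// Made-up marker meaning "that was not a season" (assumption).
constexpr Temperature no_temperature = -1000.0;

constexpr Temperature calc_average_temp(Season s) noexcept
{
    switch (s)
    {
        case Season::spring: return 9.0;
        case Season::summer: return 18.0;
        case Season::autumn: return 10.0;
        case Season::winter: return 1.0;
        case Season::count: break; // metadata, yet a perfectly legal argument
    }
    return no_temperature;
}

// min and max are harmless aliases of real enumerators ...
static_assert(calc_average_temp(Season::max) == calc_average_temp(Season::winter));
// ... but count compiles just as well and forces a special case.
static_assert(calc_average_temp(Season::count) == no_temperature);
```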

I for one would very much like to avoid the risk altogether. So let’s do that. First we must pull the metadata out of the enum. Then we can add conversion functions. Before we start exploring such an API, I have to clarify one more thing.

This is not going to be code you painstakingly write and maintain for every enum yourself. You have a code generator that does it for you. Or you live with a few restrictions and have some fun with the preprocessor and variadic macros. My concern in this article is not how this code gets generated, only that it is generated. That way some duplication and interdependencies don’t lead to a maintenance nightmare.
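As an illustration of the preprocessor route, here is a minimal sketch. It uses the classic X-macro technique rather than variadic macros, and the macro name SEASON_ENUMERATORS and the names table are assumptions of this example. It only supports zero-based, consecutive enumerators.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Single source of truth: the list of enumerators (hypothetical macro name).
#define SEASON_ENUMERATORS(X) \
    X(spring)                 \
    X(summer)                 \
    X(autumn)                 \
    X(winter)

enum class Season : int
{
#define X(name) name,
    SEASON_ENUMERATORS(X)
#undef X
};

namespace SeasonEnum
{
// Every list entry expands to "+1", so the sum is the number of enumerators.
#define X(name) +1
    constexpr std::size_t enumerator_count = 0 SEASON_ENUMERATORS(X);
#undef X

// Every list entry expands to "Season::name,".
#define X(name) Season::name,
    constexpr std::array<Season, enumerator_count> enumerators{SEASON_ENUMERATORS(X)};
#undef X

// Every list entry expands to its stringified name.
#define X(name) #name,
    constexpr std::array<std::string_view, enumerator_count> names{SEASON_ENUMERATORS(X)};
#undef X
}

static_assert(SeasonEnum::enumerator_count == 4);
static_assert(SeasonEnum::names[2] == "autumn");
```

Adding an enumerator now means touching exactly one line, which is the whole point of generating the rest.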

With that out of the way, let’s design an API!

First Attempts

What I want to answer are questions like: How many enumerators does this enum have? Are their values consecutive? Can I use the enumerators as flags in a bitfield? Also, as examples of a serialization implementation I want to convert to and from string and the underlying integral type as well as iterate over the enumerators.

Keep it simple, they say, and what could be simpler than defining the enum in its own header and adding a bunch of free-standing constants and functions?

// season_enum.hpp
enum class Season ...;

constexpr bool is_bitflags_enum = false;
constexpr std::size_t enumerator_count = 4;

std::optional<Season> from_string(const std::string& str) noexcept;

// An array is a simple implementation of the iteration requirement.
constexpr std::array<Season, enumerator_count> enumerators{
        Season::spring, Season::summer, Season::autumn, Season::winter};

// Rest of the API follows here ...

Most likely you already spotted the problem. As soon as you need two of those extended enums in a single translation unit, you get name clashes. You could simply prefix the names:

// ...
constexpr std::size_t season_enumerator_count = 4;
std::optional<Season> season_from_string(const std::string& str) noexcept;
// ...

But that’s verbose, inflexible and looks too much like a C API. We need a better way to scope those functions and constants.

Simple Solution: Extensions in a Namespace

Scoping names is easy in C++, right? That’s what namespaces are for. Obviously the namespace must have a different name than the enum. The best name is mostly a question of your stylistic preference, and your options are heavily influenced by your overall naming scheme. Long story short, I settled on calling the namespace SeasonEnum. Here is what the complete API looks like:

// season_enum.hpp
enum class Season ...;

namespace SeasonEnum {
    using UnderlyingType = std::underlying_type_t<Season>;

    constexpr std::size_t enumerator_count = 4;
    constexpr bool is_bitflags_enum = false;
    constexpr bool has_consecutive_enumerators = true;
    constexpr bool starts_at_zero = true;

    UnderlyingType to_underlying_type(Season s) noexcept;
    std::optional<Season> from_underlying_type(UnderlyingType i) noexcept;
    Season from_underlying_type(UnderlyingType i, Season fallback) noexcept;
    bool is_convertible_from(UnderlyingType i) noexcept;

    std::string to_string(Season s);
    std::optional<Season> from_string(const std::string& str) noexcept;
    Season from_string(const std::string& str, Season fallback) noexcept;
    bool is_convertible_from(const std::string& str) noexcept;

    constexpr std::array<Season, enumerator_count> enumerators{
            Season::spring, Season::summer, Season::autumn, Season::winter};
}

No more name clashes, and using SeasonEnum::whatever; can reduce verbosity where needed. As long as the extensions are only called by hand from other parts of the code, I consider this solution sufficient. Why complicate things without a good reason?
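Here is one way the string conversions of that namespace could be implemented. It is a sketch, not the code from the download: the names lookup table and the season_enum_usage function are assumptions added for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

enum class Season : int { spring = 0, summer = 1, autumn = 2, winter = 3 };

namespace SeasonEnum
{
    inline constexpr std::size_t enumerator_count = 4;

    inline constexpr std::array<Season, enumerator_count> enumerators{
            Season::spring, Season::summer, Season::autumn, Season::winter};

    // Assumed helper table, index-aligned with `enumerators`.
    inline constexpr std::array<std::string_view, enumerator_count> names{
            "spring", "summer", "autumn", "winter"};

    inline std::string to_string(Season s)
    {
        // Indexing by value works because the enumerators are consecutive
        // and start at zero. Assumes s is one of the four enumerators.
        return std::string(names[static_cast<std::size_t>(s)]);
    }

    inline std::optional<Season> from_string(const std::string& str) noexcept
    {
        for (std::size_t i = 0; i < enumerator_count; ++i)
        {
            if (std::string_view{str} == names[i]) { return enumerators[i]; }
        }
        return std::nullopt;
    }

    inline Season from_string(const std::string& str, Season fallback) noexcept
    {
        return from_string(str).value_or(fallback);
    }
}

// Usage, wrapped in a function so the sketch compiles on its own.
inline void season_enum_usage()
{
    using SeasonEnum::from_string; // the using-declaration mentioned above

    assert(SeasonEnum::to_string(Season::summer) == "summer");
    assert(from_string("autumn") == Season::autumn);
    assert(from_string("monsoon", Season::winter) == Season::winter);
}
```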

Fancy Solution: Templated Extension Structs

One reason why the namespaced API from above might not be suitable is template metaprogramming. From the compiler’s point of view, enum class Season and namespace SeasonEnum have no connection whatsoever. Given only the enum type, the template system has no way to find the matching namespace.

I asked myself: How do you select a bunch of functionality based on a type? For classes that’s specialisation of class templates. And for functions or function templates it’s overloading. In any case we need to get rid of the namespace.

If at first you don’t succeed …

My first idea was to do it like the STL does, i.e. lose the namespace or use a single enum_ext namespace, make everything templates and specialize or overload. But this leads to an unfortunate inconsistency. Consider the following two function templates:

template<typename EnumType>
std::optional<EnumType> from_string(const std::string& str) { ... }

template<typename EnumType>
EnumType from_string(const std::string& str, EnumType fallback) { ... }

For the first one the compiler cannot deduce the template parameter. It must be specified at the call site. For the second one the template parameter can be deduced. Accordingly you call these functions differently despite their almost identical purpose:

auto s1 = from_string<Season>("summer");
auto s2 = from_string("summer", Season::winter);

This alone wouldn’t be a show stopper. But especially the first function reads quite awkwardly. After all, we want to construct a season from a string. Having the Season there in the middle doesn’t flow very well. What about string_to_enum<Season>("summer")? Hm, not really … just a different kind of awkward. Now the string input is in a weird place. Remember the namespaced solution where we’d write SeasonEnum::from_string("summer"). Doesn’t that have a much nicer flow? All in all I wasn’t convinced by this attempt and decided to abandon it.
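For completeness, this is roughly how the abandoned attempt could be implemented. It is a sketch under the assumption that each enum supplies an explicit specialization of the first template; from_string_usage is a made-up helper that shows the two call styles side by side.

```cpp
#include <cassert>
#include <optional>
#include <string>

enum class Season : int { spring = 0, summer = 1, autumn = 2, winter = 3 };

// Primary template: declaration only, every enum provides a specialization.
template<typename EnumType>
std::optional<EnumType> from_string(const std::string& str);

// The fallback overload can be written once for all enums.
template<typename EnumType>
EnumType from_string(const std::string& str, EnumType fallback)
{
    return from_string<EnumType>(str).value_or(fallback);
}

// Specialization for Season (would be generated code).
template<>
inline std::optional<Season> from_string<Season>(const std::string& str)
{
    if (str == "spring") { return Season::spring; }
    if (str == "summer") { return Season::summer; }
    if (str == "autumn") { return Season::autumn; }
    if (str == "winter") { return Season::winter; }
    return std::nullopt;
}

inline void from_string_usage()
{
    // The enum type has to be spelled out here ...
    const auto s1 = from_string<Season>("summer");
    // ... but is deduced from the fallback argument here.
    const auto s2 = from_string("summer", Season::winter);
    const auto s3 = from_string("monsoon", Season::winter);

    assert(s1 == Season::summer);
    assert(s2 == Season::summer);
    assert(s3 == Season::winter);
}
```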

… try, try again!

My next idea was a templated wrapper struct as a namespace substitute. It contains all the constants and functions as static members. Such a struct can be specialized for each enum type. Using it would look almost like the namespaced solution, e.g.: EnumExt<Season>::from_string("summer").

The basic structure is as follows:

// enum_extensions.hpp

// base template, necessary for specialisation
// never used directly: can be a declaration only
template<typename>
struct EnumExt;

namespace enum_ext
{
    template<typename Derived, typename EnumType>
    struct Conversion
    {
        // Everything is static. Instantiating doesn’t make sense, so disable it.
        Conversion() = delete;
        ~Conversion() = delete;

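        // The derived class is still incomplete here, so derive the type from EnumType.
        using UnderlyingType = std::underlying_type_t<EnumType>;

        static UnderlyingType to_underlying_type(EnumType e) noexcept;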

        // ...
    };

    struct IterationTag {};
}

// season_enum.hpp
enum class Season ...;

template<>
struct EnumExt<Season>
    : enum_ext::Conversion<EnumExt<Season>, Season>
    , enum_ext::IterationTag
{
    using EnumType = Season;
    using UnderlyingType = std::underlying_type_t<Season>;

    static std::string to_string(Season s);

    // ...
};

// an alias for convenience
using SeasonEnum = EnumExt<Season>;

To be able to specialize, a base template is required. That’s the EnumExt struct declaration at the beginning. We cannot put the common functionality into it because each specialization completely replaces the base template. Instead we use enum_ext::Conversion as a CRTP base class. If you’re not familiar with that acronym, read up on the curiously recurring template pattern. EnumExt<Season> specializes the base template for our concrete enum type and pulls in the common functionality through inheritance. Note that everything happens at compile time. There are no abstract base classes, no runtime polymorphism and no runtime cost.
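A compilable sketch of the CRTP mechanics, reduced to two conversion functions. The function bodies are one possible implementation and not necessarily what the download contains. The deleted constructor and destructor are left out for brevity.

```cpp
#include <array>
#include <optional>
#include <type_traits>

template<typename>
struct EnumExt;

namespace enum_ext
{
    template<typename Derived, typename EnumType>
    struct Conversion
    {
        using UnderlyingType = std::underlying_type_t<EnumType>;

        static constexpr UnderlyingType to_underlying_type(EnumType e) noexcept
        {
            return static_cast<UnderlyingType>(e);
        }

        // Common code that relies on data supplied by the derived class.
        // The body is instantiated only when the function is used, and by
        // then Derived is complete, so Derived::enumerators is accessible.
        static constexpr std::optional<EnumType> from_underlying_type(UnderlyingType i) noexcept
        {
            for (EnumType e : Derived::enumerators)
            {
                if (to_underlying_type(e) == i) { return e; }
            }
            return std::nullopt;
        }
    };
}

enum class Season : int { spring = 0, summer = 1, autumn = 2, winter = 3 };

template<>
struct EnumExt<Season> : enum_ext::Conversion<EnumExt<Season>, Season>
{
    static constexpr std::array<Season, 4> enumerators{
            Season::spring, Season::summer, Season::autumn, Season::winter};
};

using SeasonEnum = EnumExt<Season>;

// Everything is resolved at compile time.
static_assert(SeasonEnum::to_underlying_type(Season::autumn) == 2);
static_assert(SeasonEnum::from_underlying_type(3) == Season::winter);
static_assert(!SeasonEnum::from_underlying_type(42).has_value());
```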

Also note that the Conversion base class has two template parameters. Why do we have to pass in the enum type on its own? EnumExt<Season> already specifies it as its own template parameter, and it defines the EnumType alias. I haven’t researched this in depth (it got an entry on my to-research list), but as far as I can tell the reason is this: C++ doesn’t provide a way to access template parameters from outside a template. That’s why most class templates make their template parameters available with a using. But Conversion is instantiated while the compiler processes the base class list of EnumExt<Season>. At that point EnumExt<Season> is not a complete type yet and its EnumType alias isn’t declared, so Conversion cannot use it in the signatures of its members. Passing the enum type as a second parameter is the simplest way around that. Since this is supposed to be generated code anyway, I won’t lose any sleep over the duplication.
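The following sketch isolates that effect. All names in it (Base, SeasonExt, first_enumerator, first) are hypothetical and exist only for this demonstration.

```cpp
#include <type_traits>

template<typename Derived>
struct Base
{
    // Would not compile: member declarations are instantiated together with
    // Base<Derived>, i.e. while Derived is still an incomplete type.
    // static typename Derived::EnumType first();

    // Compiles: the return type is deduced from the body, and the body is
    // instantiated only when first() is used, after Derived is complete.
    static constexpr auto first() noexcept
    {
        typename Derived::EnumType result = Derived::first_enumerator;
        return result;
    }
};

enum class Season : int { spring, summer, autumn, winter };

struct SeasonExt : Base<SeasonExt>
{
    using EnumType = Season;
    static constexpr Season first_enumerator = Season::spring;
};

static_assert(std::is_same_v<decltype(SeasonExt::first()), Season>);
static_assert(SeasonExt::first() == Season::spring);
```

So the alias is reachable from inside function bodies, just not from the declarations. Passing the enum type explicitly avoids having to think about the difference.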

The last interesting detail is the enum_ext::IterationTag base class. It indicates the presence of the EnumExt<Season>::enumerators member and does not do anything on its own. Checking for the existence of a certain struct member is possible, but it takes SFINAE machinery such as std::void_t. Checking for a base class is a one-liner with std::is_base_of. This makes it possible to split different pieces of functionality into different base classes and query the supported operations for a particular enum at compile time.
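A sketch of such a compile-time query. The trait supports_iteration, the function iterable_enumerator_count and the second enum Direction are assumptions invented for this example.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

template<typename>
struct EnumExt;

namespace enum_ext
{
    struct IterationTag {};

    // Hypothetical convenience trait built on std::is_base_of.
    template<typename EnumType>
    inline constexpr bool supports_iteration =
            std::is_base_of_v<IterationTag, EnumExt<EnumType>>;

    // Hypothetical generic function: touches `enumerators` only if the tag
    // promises that the member exists, otherwise reports zero.
    template<typename EnumType>
    constexpr std::size_t iterable_enumerator_count() noexcept
    {
        if constexpr (supports_iteration<EnumType>)
        {
            return EnumExt<EnumType>::enumerators.size();
        }
        else
        {
            return 0;
        }
    }
}

enum class Season : int { spring, summer, autumn, winter };
enum class Direction : int { left, right }; // made-up enum without iteration support

template<>
struct EnumExt<Season> : enum_ext::IterationTag
{
    static constexpr std::array<Season, 4> enumerators{
            Season::spring, Season::summer, Season::autumn, Season::winter};
};

template<>
struct EnumExt<Direction>
{
};

static_assert(enum_ext::supports_iteration<Season>);
static_assert(!enum_ext::supports_iteration<Direction>);
static_assert(enum_ext::iterable_enumerator_count<Season>() == 4);
static_assert(enum_ext::iterable_enumerator_count<Direction>() == 0);
```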

I’m not a metaprogramming expert and I didn’t want to get lost in too many details. This attempt does a good enough job as the fancy solution. Actually it quite nicely combines the clarity of the simple namespaced approach with metaprogramming compatibility. With an alias even the old SeasonEnum:: prefix is back. I see only one disadvantage: using SeasonEnum::whatever is not possible any more because it would require a namespace instead of a struct.

Conclusion

We’ve reached the end of this experiment that stretched over about a week in the real world. I had a lot of fun mulling over a C++ feature without any urgency. Consciously keeping track of how the two solutions developed was an intriguing experience as well. You should all try it some time! Now it’s going to be interesting to come back to this article in half a year and see how it holds up.

And finally, even though the details of the solutions were not the main focus, I put together a small project with a full implementation of both the namespaced and the templated solution:

Download enum_extensions.zip