Sunday, September 8, 2024

Zero cost abstractions are cool

There is a language design principle in C++ that you should only pay for what you use; that is, the language should not do "extra work" which is not needed for what you are trying to do. Developers often extrapolate this into writing "simple" code with only relatively primitive data structures, to avoid possible runtime penalties from more advanced structures and methodologies. However, in many cases you can get a lot of benefit from so-called "zero cost abstractions", which are designed to be "free" at runtime (they are not entirely zero cost; I'll cover the nuance in an addendum). These are cool, imho, and I'm going to give an example of why I think so.

Consider a very simple function result paradigm, somewhat ubiquitous in code from less experienced developers, of returning a boolean to indicate success or failure:

bool DoSomething(); 

This works, obviously, and fairly unambiguously represents the logical result of the operation (by convention, technically). However, it also has some limitations: what if I want to represent more nuanced results, or pass back some indication of why the function failed? These are the trade-offs usually accepted in exchange for the ubiquity of a standard result type.

Processors pass return values back by way of a register, and the function/result paradigm is ubiquitous enough that this is well supported by all modern processors. Register sizes depend on the architecture, but will be at least 32 bits on any modern processor (and 64 bits or more on almost all new processors). So when passing back a result code, returning 64 bits of data costs exactly the same at runtime as returning the single bit in the example above. We can therefore rewrite our result paradigm as this, without inducing any additional runtime overhead:

uint64_t DoSomething();

Now we have an issue, though: we have broken the ubiquity of the original example. Is zero success now (as would be more common in C++ for integer result types)? What do other values mean? Does each function define this differently (eg: a custom enum of potential result values per-function)? While we have added flexibility, we have impacted ubiquity, and potentially introduced complexity which might negate any other value gained. This is clearly not an unambiguous win.
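
For instance, each of these (purely hypothetical, for illustration) functions might adopt a different convention for the same uint64_t return type:

#include <cstdint> // for uint64_t

uint64_t OpenConfigFile();   // 0 means success, any nonzero value is an error code
uint64_t CountPendingJobs(); // returns a count, so 0 is a perfectly valid "success"
enum class ParseStatus : uint64_t { Ok = 0, Truncated = 1, BadHeader = 2 };
uint64_t ParseHeader();      // returns a ParseStatus cast back to uint64_t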

However, we can do better. We can, for example, replace the raw numeric type with a class type which encapsulates it (and is the same size, so it can still be passed via a single register). Eg:

class Result
{
    int64_t m_nValue; // negative values indicate failure, non-negative values indicate success

public:
    explicit Result(int64_t nValue) : m_nValue(nValue) {}

    bool isSuccess() const { return m_nValue >= 0; }
    bool isFailure() const { return m_nValue < 0; }
};

Now we can restore ubiquity in usage: callers can use ".isSuccess()" and/or ".isFailure()" to determine whether the result was success or failure, without needing to know the implementation details. Even better: this removes the lingering ambiguity from the first example too, as we now have methods which clearly spell out intent in readable language. Also, importantly, this has zero runtime overhead: an optimizing compiler will inline these methods to be assembly-equivalent to manual checks.

Result DoSomething();

//...
auto result = DoSomething();
if (result.isFailure())
{
    return result;
}

This paradigm can be extended as well, of course. Now that we have a well-defined type for result values, we could (for example) define some of the bits as holding an indicative value for why an operation failed, and then add inline methods to extract and return those codes. For example, one common paradigm from Microsoft (HRESULT) uses the lower 16 bits to hold the Win32 error code, while the higher bits encode the error disposition and the component area which generated the error. The same approach can express nuance in success values; for example, an operation which "succeeded", but which had no effect because preconditions were not satisfied.
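
As a sketch of what that extension might look like, here is the Result class from above with a few added accessors. The bit layout is an assumption that loosely mirrors the HRESULT convention, and the method names beyond isSuccess/isFailure are my own, not any standard API:

#include <cstdint>

class Result
{
    int64_t m_nValue;

public:
    explicit Result(int64_t nValue) : m_nValue(nValue) {}

    bool isSuccess() const { return m_nValue >= 0; }
    bool isFailure() const { return m_nValue < 0; }

    // Assumed layout, loosely HRESULT-like: the low 16 bits carry the
    // underlying error code, and the next 16 bits identify the component
    // (facility) which produced the result.
    uint16_t errorCode() const { return static_cast<uint16_t>(static_cast<uint64_t>(m_nValue) & 0xFFFF); }
    uint16_t facility() const { return static_cast<uint16_t>((static_cast<uint64_t>(m_nValue) >> 16) & 0xFFFF); }

    // One possible convention for nuanced success: any positive value means
    // the operation "succeeded" but had no effect (analogous in spirit to S_FALSE).
    bool succeededWithNoEffect() const { return m_nValue > 0; }
};

Everything here still fits in a single register and inlines away, so the added expressiveness remains free at runtime.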

Moreover, if used fairly ubiquitously, this can be used to easily propagate unexpected error results up a call stack, as suggested above. One could, if inclined, add macro-based handling to establish a standard paradigm of checking for and propagating unexpected errors, and with the addition of logging in the macro, the code could also effectively generate a call stack in those cases. That compares fairly favorably to typical exception usage, for example, both in utility and in runtime overhead.
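
A minimal sketch of such a macro, assuming the Result type and DoSomething from above; RETURN_IF_FAILED is a hypothetical name, and the fprintf call is just a stand-in for whatever logging facility you already use:

#include <cstdio>

#define RETURN_IF_FAILED(expr)                                                        \
    do                                                                                \
    {                                                                                 \
        const Result _res = (expr);                                                   \
        if (_res.isFailure())                                                         \
        {                                                                             \
            /* one log line per propagation point effectively records the */          \
            /* call stack of the failure as it unwinds                    */          \
            std::fprintf(stderr, "'%s' failed at %s:%d\n", #expr, __FILE__, __LINE__); \
            return _res;                                                              \
        }                                                                             \
    } while (false)

// Usage: unexpected failures propagate up automatically, one log line per frame.
Result DoSomethingElse()
{
    RETURN_IF_FAILED(DoSomething());
    return Result(0); // success
}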

So, in summary, zero cost abstractions are cool, and hopefully the above has convinced you to consider that standard result paradigms are cool too. I am a fan of both, personally.

Addendum: zero cost at runtime

There is an important qualification to add here: "zero cost" applies to runtime, but not necessarily to compile time. Adding abstractions implies some amount of compilation overhead, and with some paradigms this can be non-trivial (eg: heavy template usage). While the standard result paradigm above is basically free and highly recommended, it's always worth keeping compilation time in mind as well, particularly when using templates which may have a large number of instantiations. The more you know.


