The Iron Code


Traits, the holy grail for common behavior?

written on 2022-07-18 18:02

Where this comes from

The duplication problem

When solving problems in any programming language, there comes a point where you encounter the need for shared behavior. Almost every language offers some intrinsic solution to this problem. It gets easier if your language is object oriented, features some inheritance and you own the class. But often we want to work with built-in data types or don't own the class ourselves. Still, we want some mechanism to capture common behavior and reduce code duplication.

Solutions from other languages

Dynamically typed languages simply allow you to define a function where the argument can be almost anything, see the example Python snippet below:

def add(a,b):
    return a+b

def caller():
    c_int = add(1,2)
    c_float = add(1.0, 2.0)
    c_string = add('some', 'string')

Wow... Now that's a generic solution. Statically typed languages can't do that. Fortunately, the C++ guys came up with something called templates. Just write a function with a template parameter and the compiler will do the rest:

template <typename T>
T add(T a, T b) {
  return a+b;
}

auto const c_int = add(1,2);
auto const c_float = add(1.0, 2.0);

Neat! Almost as compact as the Python version and still strictly typed. So we're done, the article is finished, just use C++ and templates or Python.

Unfortunate reality

So in recent times, Python started to add type annotations to functions, and these can even be checked by tools such as mypy. Someone finally noticed that maintaining large codebases in Python is hell on earth. Because every parameter can be anything, details of internal behavior leak into the client code extremely fast (Hyrum's law on steroids). If we change the implementation of a function and didn't foresee that some user calls it with some weird type that we didn't cover with a test, we will break his code. If he didn't test his client code properly, we will find out in production or he will. It doesn't matter which - bad outcome.

So we are fine with just using templates and C++? Well, yes and no. First, let's quickly recap how this template thingy works. You call the template function with an argument of a certain type. The compiler will try to instantiate a version of the template code with your argument type and build it. This process is called monomorphization and leads to very performant code, since the result is no different from manually writing each function. With the variadic templates and perfect forwarding introduced in C++11, the template magic could be used for common behavior of otherwise unrelated classes in an even broader way. Imagine a template class which forwards all given c'tor arguments (variadic template arguments) to a templated member - boom: you have created a class that can hold all of your RAII scopes! You can even put them all inside a vector even though they never inherited from a common base.

So far so good, but how does this play together with other features of the language, like function overloading and default parameters, and is substitution failure an error? (Spoiler: it's not, see SFINAE.) Well, the answers quickly become complicated, and a nice write-up can be found in Scott Meyers' book Effective Modern C++. One of the key points is that this template system has gotten so out of hand that there's a lot of template metaprogramming to be found in a lot of codebases that probably shouldn't exist outside of a systems-level library (and, to be bold: probably just shouldn't exist at all). Template code is ridiculously hard to read and understand, and maintaining it becomes expensive very fast. Moreover, compile times go through the roof with it, and people just seem to abuse it wherever they can.

With C++20 we finally get something called concepts, which let us narrow down what kind of classes you can stuff into template code. Before the C++ mob comes to my door now: this was all possible with std::enable_if constructions beforehand, it was just even more ridiculous. But even with concepts the problem remains that templates are kind of an odd thing that doesn't really fit into the language design, almost like a second-level preprocessor step. In use, they feel kind of "bolted on" to an already existing language that never foresaw the need for this. Finally, even concepts, the designated solution to the harshness of knowing nothing about the incoming type, feel that way, since their validity check is: 'does the given snippet compile with type T? If yes, T is of concept bla'. A compiler checking whether a code snippet compiles in order to group functionality is astonishing. Given the constraint of 'never breaking compatibility', concepts are an improvement, but maybe that's exactly the problem.

One last remark: did I even mention that in C++ we are only talking about "the code compiles"? There's no guarantee that the function we are calling in the template code does what we expect it to do for the class someone provided.

How does Rust do it?

With no compatibility constraints, Rust started over from scratch - cool, but how does it look? What they came up with really blew my mind. There are structs, which can have methods - so far so good. But then there are these things called traits. They can be used to define common behavior. The fact that a trait has to be given a name also gives it semantic meaning.
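To tie this back to the add examples above, here is a minimal sketch of the same function in Rust. The name add is carried over from the Python and C++ snippets; the bound uses std::ops::Add, the standard library trait behind the + operator. Take it as an illustration of a named contract, not as the only way to write it.

```rust
use std::ops::Add;

// Generic over any type T that implements `Add` and yields a T again.
// The trait bound is the named contract: without it, `a + b` does not compile.
fn add<T: Add<Output = T>>(a: T, b: T) -> T {
    a + b
}

fn main() {
    let c_int = add(1, 2);
    let c_float = add(1.0, 2.0);
    println!("{c_int} {c_float}");
    // Unlike the Python version, add("some", "string") is rejected at
    // compile time, because `&str` does not implement `Add`.
}
```

The function body is checked against the bound once, at its definition, instead of once per instantiation as with an unconstrained C++ template.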

Traits as interfaces

A trait comprises methods. When implementing a trait for a struct, you have to implement those methods for your struct. At first glance, this may look like a version of inheritance (it looked that way to me) and yes: it is solving the same problem here. Nice examples are the data models in SixtyFPS, a cool GUI framework, or the iterators in Rust's standard library itself. But there's a lot more to traits than just that.
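As a minimal sketch of what implementing a trait looks like: the trait Shape and the struct Square below are hypothetical names made up for this illustration, they are not taken from SixtyFPS or the standard library. The sketch also uses a provided (default) method, a detail the text above doesn't go into.

```rust
// Hypothetical trait for illustration: anything that has an area.
trait Shape {
    // Required method: every implementor has to provide it.
    fn area(&self) -> f64;

    // Provided method: implementors get this default unless they override it.
    fn describe(&self) -> String {
        format!("shape with area {:.2}", self.area())
    }
}

// Hypothetical struct that opts into the trait.
struct Square {
    side: f64,
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

fn main() {
    let s = Square { side: 2.0 };
    println!("{}", s.describe());
}
```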

Traits as adapters

The first thing is that you can implement a trait if you 'own' either the struct or the trait (the orphan rule). This means that you can enhance external structs with your own 'adapter-like' traits to make them fit into your client code. This removes the need for the hacky C++ template wrappers for common behavior.
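A sketch of the adapter idea, assuming a hypothetical trait Loud that lives in our own crate. Because we own the trait, the orphan rule lets us implement it for types we don't own, here str and i32 from the standard library.

```rust
// Hypothetical trait owned by our crate.
trait Loud {
    fn shout(&self) -> String;
}

// We do not own `str`, but we own `Loud`, so this impl is allowed.
impl Loud for str {
    fn shout(&self) -> String {
        format!("{}!", self.to_uppercase())
    }
}

// Same for the built-in integer type `i32`.
impl Loud for i32 {
    fn shout(&self) -> String {
        format!("{}!", self)
    }
}

// The reverse is rejected: implementing a foreign trait such as
// `std::fmt::Display` for a foreign type such as `Vec<i32>` does not compile.

fn main() {
    let n: i32 = 42;
    println!("{}", "hello".shout());
    println!("{}", n.shout());
}
```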

But wait, there's even more: what are interfaces in Rust anyway?

Traits as interfaces revisited

Let's go back to Rust interfaces one more time. Here, like in C++, we have two possibilities for shared-behavior interfaces: a monomorphized one in the form of generics, and another one using objects on the heap in Boxes (which are basically similar to std::unique_ptr). Let's check the latter first, because it's close to what C++ inheritance does. We could define a trait called Animal which forces us to implement a method called fn num_legs(&self) -> usize;. If we implement this for the structs Cat and Dog, we can easily have a vector like Vec<Box<dyn Animal>> to hold entries of Cat or Dog or whatever other struct implements our trait. For example:

struct Cat;
struct Dog;
impl Animal for Cat {...}
impl Animal for Dog {...}
//...
let container: Vec<Box<dyn Animal>> = vec![Box::new(Cat{}), Box::new(Dog{})];

Some constraints on the trait have to be met for this pattern to be allowed: it has to be 'object safe' (newer Rust documentation calls this 'dyn compatible'). If you search the Rust documentation for this keyword, you will find the exact constraints. Pretty neat!
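The snippet above leaves out the trait definition and the impl bodies. A complete version could look like the following sketch. The leg counts, the extra struct Bird and the helper total_legs are assumptions added for illustration.

```rust
trait Animal {
    fn num_legs(&self) -> usize;
}

struct Cat;
struct Dog;
struct Bird; // hypothetical, only here so the leg counts differ

impl Animal for Cat {
    fn num_legs(&self) -> usize { 4 }
}

impl Animal for Dog {
    fn num_legs(&self) -> usize { 4 }
}

impl Animal for Bird {
    fn num_legs(&self) -> usize { 2 }
}

// Hypothetical helper: each call to num_legs goes through the vtable
// of the boxed trait object (dynamic dispatch).
fn total_legs(animals: &[Box<dyn Animal>]) -> usize {
    animals.iter().map(|a| a.num_legs()).sum()
}

fn main() {
    let container: Vec<Box<dyn Animal>> =
        vec![Box::new(Cat {}), Box::new(Dog {}), Box::new(Bird {})];
    println!("total legs: {}", total_legs(&container));
}
```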

In most scenarios you can also get away with using generics, Rust's version of templates. Sample code to process an Animal could look like this (there are a few other ways to write it):

pub fn process_animal(a: impl Animal) -> usize {
  a.num_legs()
}

This squeezes out maximum performance, since it's monomorphized too, while the generic parameter is constrained to exactly the aspect the function needs. Side note: this leads to working auto-completion in generic Rust code. If a method is missing on your generic parameter, you are probably missing a trait bound.
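For completeness, a sketch of the other spellings of the same function: an explicit type parameter with a bound, and a where clause. Animal and Cat are redefined here so the sketch stands alone, and the names process_generic and process_where are made up for this example.

```rust
pub trait Animal {
    fn num_legs(&self) -> usize;
}

pub struct Cat;

impl Animal for Cat {
    fn num_legs(&self) -> usize { 4 }
}

// Form from the text: `impl Trait` in argument position.
pub fn process_animal(a: impl Animal) -> usize {
    a.num_legs()
}

// Explicit type parameter with a trait bound.
pub fn process_generic<T: Animal>(a: T) -> usize {
    a.num_legs()
}

// Same bound, moved into a `where` clause.
pub fn process_where<T>(a: T) -> usize
where
    T: Animal,
{
    a.num_legs()
}

fn main() {
    println!("{}", process_animal(Cat));
    println!("{}", process_generic(Cat));
    println!("{}", process_where(Cat));
}
```

All three are compiled once per concrete type they are called with. Inside the bodies, only the methods of Animal are available on a, which is what makes auto-completion work.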

Now the interesting part is: the Crab-Overlords used the same concept for templates/generics as for vtable-based common behavior abstraction. For a C++ guy like me it looks like heaven and heresy at the same time! And, coming back to the semantic meaning of trait names: it is dead easy to get the use cases right, since I have to supply something that is an Animal or close enough to emulate one. Moreover, it's nearly impossible to misuse by accident; you'd have to abuse it intentionally. Finally, this concept is compact while being explicit, and it produces a strong contract in all cases about which methods are guaranteed to be available.

Are traits the holy grail for common behavior? Well, they are certainly good, but I can't just call them that yet. I have too little insight into Golang's design choices and would love to evaluate that more. For now, compared to C++, traits are for me the superior choice by far.

Thanks for reading! Please let me know about feedback or mistakes :)

Tagged:
General
Programming