OOPs, I'm so dead!
Okay, okay, I know what you're thinking: what's this data-oriented design zealot doing, opening their mouth about object-oriented programming? Don't they know that OOP is alive, well, and keeps spawning more successful projects than any other programming paradigm ever did, like Shub-Niggurath the Black Goat of the Woods with a Thousand Young spawns progeny? And don't they know that basically everything in programming is actually OOP, including their favourite ML and DOD features?
So, let's start off with some clarifications:
- I'm not announcing the death of object-oriented programming, I just liked the title. Yes, it's effectively click-bait and I don't apologise because I like it and because it is an explicit reference.
- I don't think object-oriented programming is the worst thing in the world, and I use things like classes regularly in the TypeScript I write as a means of putting food on the table.
- I am about to complain about one particular programming language feature that, according to the Internet commentariat, is purely an implementation detail and has nothing to do with object-oriented programming. So, if you see yourself or your favourite programming language in the picture being drawn by this blog post, remember then that this is purely a coincidence and no object-oriented languages are being harmed by this blog post – only implementation details.
The thing we're talking about is of course structural inheritance. "What is structural inheritance and how is it different from class inheritance in this programming language?", I hear you cry. Let's define the term by counter-example first: interface inheritance or typeclassing is a form of inheritance where a "subclass" only inherits a set of behaviours from the "superclass"; structural inheritance is then a form of inheritance where both behaviours and the form of a superclass are both inherited.
Because we're talking about an implementation detail, let's try to make this very concrete: structural inheritance (as I am talking about it) means inheriting the memory layout of the superclass, such that a pointer to a subclass instance can be directly used as a pointer to a superclass instance: reading inbounds offsets from such a pointer will read exactly the same (kind of) data regardless of if the pointer points to a subclass or superclass instance.
This form of inheritance is what I will rail against today. Now, you've been warned but I'll reiterate for safety: if you see yourself or your favourite programming language in this definition, it is merely a coincidence. In particular, languages that are not at all pointed to here include JavaScript1 and C++2. Now, let's grind that ax!
Protego!
Structural inheritance isn't all bad: it is exceedingly convenient, easy to understand, and works great for many things. A commonly cited example is GUIs and how excellent inheritance is for modeling them, with structural inheritance being implied by this praise.
A (hopefully uncontroversial) case for structural inheritance's superiority in a
GUI programming setting might come from defining the position of an element in a
GUI. The element needs to be drawn somewhere, so left, top, width, and
height properties of one form or another are simply required for all elements.
It then makes sense to put these properties in the base class of the GUI
library.
class BaseElement {
left: u32;
top: u32;
width: u32;
height: u32;
}
class ButtonElement extends BaseElement {
toggled: boolean;
}
class TextElement extends BaseElement {
value: string;
}
In terms of memory layout, this sort of structural inheritance hierarchy means that classes look something like this in memory:
const BaseLayout = [u32, u32, u32, u32];
const ButtonLayout = [...BaseLayout, boolean];
const TextLayout = [...BaseLayout, string];
The base class fields are placed at the start of the subclass' memory layout; note that the fields of the subclass are at the very end, despite probably being the most important fields of the class. It isn't a given, but it can be slower to load far-off fields.
This all looks pretty fine. Let's keep going then.
Reducto!
If we think about a text element in a GUI, it's fairly common that it will only take up the amount of space that it requires, up to some maximum at which point it gets cut off with maybe an ellipsis added at the end. Now, how would we do that in our GUI here?
The first obvious step is to introduce an extra wrapper around a TextElement:
that wrapping element defines the maximum space that the text can take, after
which ellipsis appears. Our TextElement itself then probably still has a
user-definable position somehow, but its width and height values become
dependent on the actual text value. Depending on the details of the system, at
this point the actual width and height properties might become unused as it
may be cheap enough to simply calculate the resultant size of the text from the
string dynamically. Even if that is not the case, the properties at the very
least become merely caches of calculated values: they are not the source of
truth that they once were.
Letting that simmer, what if we introduce a flexible box or grid layout element into the GUI? The properties of our base element class might become entirely irrelevant when placed in such a layout, yet they remain an integral part of the memory layout of each instance. That is memory being used to store no information whatsoever, just pure waste.
This isn't a deal-breaker of course: we can introduce a further base class
(maybe call it a Node?) that doesn't have these properties at all and can then
inherit from that to avoid the properties when we know they're not needed. This
is all just engineering and trade-offs, nothing fundamental that couldn't be
fixed. But at least we found an issue in the way we built our inheritance
hierarchy, if nothing else.
The Curse of Structural Inheritance
Reusing the original GUI example, let's say our GUI gets an
BinaryElement extends BaseElement subclass that displays 32 bits in 0s and 1s
using a monospace font: calculating the width of this element is trivial, simply
multiply the monospace font's character width by 32. Now, it is clear that the
width property from our BaseElement is going to be unnecessary: we really
don't need to store the result of this calculation in memory, and we'd much
rather reuse that width property's memory to store the 32 bits that the
BinaryElement is displaying. Yet, this cannot be done.
Our BinaryElement is clearly an element in terms of interface inheritance: it
has ways to display some data in some location on the screen. But it is also
better defined, or has a more limited use-case, than a general element is: its
contents is known to be a binary number of 32 bits, and its size is known to be
a static value (modulo the monospace font). Yet, no matter how hard we try, we
cannot make the BinaryElement smaller than a BaseElement is; rather it must
be bigger to fit the 32 bits it displays.
Now, it would be fair to say that the inheritance hierarchy here is badly designed, wrong, doesn't actually make sense, and has nothing to do with object-oriented programming in the first place anyway. Also, wasting the 8 bytes of the width and height properties is meaningless in the grand scheme of things and thus the larger waste of time is thinking about this whole issue at a all. And maybe it is, but that is not a universal truth: this is engineering, and it is all about trade-offs. In a real-world inheritance hierarchy you'll probably find that much more memory is being wasted, and the effect of it is not insigificant. At some point you'll find that the cost of not thinking about this issue is too high to bear.
This is the Curse of Structural Inheritance that I now offer for you to understand. Whenever you look at a structural inheritance hierarchy, you will see the effects of the curse: you will see the subclasses carrying around all of the superclasses data fields at the start of their memory layout, while subclass fields are relegated to the very end of the instance memory. You'll see the subclass often barely even touching the superclass fields, as many of its behaviours have been overridden in such a way as to make the generic superclass fields' data redundant. You will see all this, yet be unable to do anything to help it. It is already too late to help it.
Structural inheritance in action
Okay, maybe structural inheritance has a downside when modeling behaviourally
related but otherwise unconnected objects. But what if the data is very clearly
linked, surely then structural inheritance is great? This is the question I was
asked recently: specifically, I was asked why I had not written two classes,
GraphReadWrite and GraphReadOnly, using inheritance but had instead opted to
create two entirely separate classes that largely shared the same internal field
structure and copy-pasted a good bit of methods on top! Surely this is a case
for inheritance?
The read-write class I wrote is used to create a graph step by step, and once
the creation is done it is "frozen" to produce a read-only instance: these
classes could alternatively be called a GraphBuilder and Graph, but our
GraphReadWrite still internally contains a graph. So while the graph inside of
GraphReadWrite may be incomplete, it is still a graph with nodes and edges and
methods to read them just like the final GraphReadOnly has, not a builder with
just add methods and a final build step that does all the work. So surely it
is stupid to duplicate code across the two classes? I had to pause to really
consider this question, but I did indeed find that I had an answer. Let's
approach it by trying to write out the inheritance hierarchy.
Oh, and do stop me if you know the answer already: it seems that trying to model read-write and read-only class variants through inheritance has already been tried in enough places so as to see that it is a fool's errand. I of course had not found this out yet and so had to find my own answer.
So let's start by doing the logical thing and inheriting
GraphReadOnly extends GraphReadWrite. This seems clear and simple, we have a
superclass that includes all the fields and methods for storing a graph and
making changes to it, and our subclass simply makes those fields readonly and
overrides all the methods of the superclass to throw exceptions if called.
Except that the subclass cannot re-interpret the superclass fields, not even by
changing them from read-write to read-only: it can only pretend they are
read-only but it cannot actually change their functionality. Furthermore, a
subclass is a valid instance of a superclass, so any superclass methods can be
called on a subclass instance: this means that the read-write methods of
GraphReadWrite can still be called on a GraphReadOnly. The read-only
instance we have is not actually all that read-only, we're just trying to
pretend it is.
And I'm sure you saw the effects of the Curse of Structural Inheritance as well:
our superclass has fields "for storing a graph and making changes to it".
There is extra data in the GraphReadWrite superclass that is unnecessary when
we know that the graph is read-only (specifically, this was a couple of hash
maps for efficient insertion). The GraphReadOnly is paying the cost of
read-write features it never wants to use, while also having no actual
guarantees that it truly is read-only. Obviously this is no way to build a
reliable system, so we cannot go this route.
So, what we have to do is the opposite: GraphReadWrite extends GraphReadOnly.
Now everything works wonderfully, the GraphReadOnly superclass only includes
those fields that are needed to store a graph and none of the fields needed to
mutate it efficiently, while the subclass gets to introduce those extra fields
as needed (never mind that it still only has to add them at the end of the
memory layout despite them probably being the first thing accessed when a write
API is called). GraphReadOnly also gets to define its fields as read-only
while GraphReadWrite then mutates them inside its own methods. ... Wait,
what?! Oh right! Now the subclass has to break the read-only invariants of the
superclass to do its work: depending on the language this might appear as
const-casting or // @ts-expect-error comments. Any methods defined on the
superclass that rely on the read-only nature of these fields for correctness are
liable to bug out if any re-entrant code paths appear with subclass instances.
The language fights the implementation, and possibly the best way to resolve the
Gordian Knot is to simply give up on defining the fields as read-only in the
first place. At that point you are left with a GraphReadOnly class that is
read-only in name, but with no actual guarantees given. The entire point of the
class has arguably been lost in our attempt at using inheritance to model
read-only and read-write variants of the same data.
A third option would be to define a common base class between GraphReadOnly
and GraphReadWrite and have both classes inherit from that: depending on the
language this could remove some issues but I at cannot directly recall a feature
of inheriting a superclass as read-only. In effect, the common base class would
still need to be defined as read-write and GraphReadOnly would again have to
live with being read-only in name only.
At the end of the day, structural inheritance is not a tool that I like to use all that much. When in doubt, I prefer interface inheritance and when I need structural sharing, I tend to opt for either structural composition (including a struct in your own definition; this doesn't exist in JavaScript), indirect composition (including a pointer to a struct; this is equivalent to storing an object inside an object in JavaScript), or a manual mix of the two.
Maybe consider doing the same the next time you see the Curse of Structural Inheritance lingering in your code base?
Footnotes
-
While JavaScript's prototypal inheritance is more akin to interface inheritance, the
classkeyword and field definitions especially but also traditional constructor functions that assign properties tothisare a form of structural inheritance. ↩ -
The grand old man of structural inheritance and object-oriented programming as we mostly know and love it today! My expertise isn't here, so I should shut up but I'll just point out that non-abstract C++ classes are explicitly a form of structural inheritance. ↩