Introducing doop
As great as Immutable.js is, especially with a TypeScript declaration included in the package, the Record class leaves me a little disappointed.
In an ordinary class with public properties we're used to being able to say:
const a = new Animal();
a.hasTail = true;
a.legs = 2;
const tail = a.hasTail ? "a" : "no";
const description = `Has ${a.legs} legs and ${tail} tail.`;
That is, each property is a single named feature that can be used to set and get its value. But immutability makes things a little more complicated, because rather than changing a property of an object, we instead create a whole new object that has the same values on all its properties except for the one we want to change. It's just a convenient version of "clone and update". This is how it has to be with immutable data. You can't change an object, but you can easily make a new object that is modified to your requirements.
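To make "clone and update" concrete, here is a minimal sketch using a plain object and the spread operator. The AnimalData interface and the withLegs function are hypothetical names invented for this illustration; they are not part of Immutable.js or doop.

```typescript
// Hypothetical plain-data shape, used only for this illustration.
interface AnimalData {
    readonly hasTail: boolean;
    readonly legs: number;
}

// "Clone and update": copy every property, then override just one.
function withLegs(animal: AnimalData, legs: number): AnimalData {
    return { ...animal, legs };
}

const original: AnimalData = { hasTail: true, legs: 2 };
const modified = withLegs(original, 4);

// `original` still has 2 legs; `modified` is a separate object with 4 legs
// and the same hasTail value.
```

Writing a helper like this by hand for every property of every class is exactly the chore we'd like to avoid.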
Why is this hard to achieve in a statically typed way? This thread gives a nice quick background. In a nutshell, you use TypeScript because you want to statically declare the structure of your data. Immutable.js provides a class called Record that lets you define class-like data types, but at runtime rather than compile time. You can overlay TypeScript interface declarations onto them at compile time, but it's a bit messy. Inheritance is troublesome, and there's a stubborn set method that takes the property name as a string, so there's nothing stopping you at compile-time from specifying the wrong property name or the wrong type of value.
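The following sketch shows why a string-keyed set is a problem. LooseRecord is a hypothetical stand-in written for this example; it is not Immutable.js's Record and makes no claim about how that class is typed, it only mimics the "name as a string" shape described above.

```typescript
// Hypothetical stand-in for a record with a string-keyed set method.
// It is not Immutable.js's Record; it only mimics the loosely typed shape.
class LooseRecord {
    private readonly values: { [name: string]: unknown };

    constructor(values: { [name: string]: unknown }) {
        this.values = values;
    }

    get(name: string): unknown {
        return this.values[name];
    }

    // Returns a modified copy; the name is just a string, the value just unknown.
    set(name: string, value: unknown): LooseRecord {
        return new LooseRecord({ ...this.values, [name]: value });
    }
}

const animal = new LooseRecord({ hasTail: true, legs: 2 });
const typo = animal.set("leggs", 4);          // misspelt name: compiles
const wrongType = animal.set("legs", "four"); // wrong value type: compiles
```

Both mistakes get past the compiler and only show up at runtime, if they show up at all.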
The most complex suggestion in that thread is to use code generation to automatically generate a complete statically typed immutable class, from a simpler declaration in a TS-like syntax. This is certainly an option, but seems like a defeat for something so fundamental to programming as declaring the data structures we're going to use in memory.
Really this kind of class declaration should be second nature. If we're going to adopt immutable data as an approach, we're going to be flinging these things around like there's no tomorrow.
So I wanted to see if something simpler could be done using the built-in metaprogramming capabilities in TypeScript, namely decorators. And it can! And it's not as ugly as it might be! And there's a nice hack hiding under it!
How it looks
This is how to declare an immutable class with some properties and one method that reads the properties.
import { doop } from "../doop";
@doop
class Animal {
    @doop
    get hasTail() { return doop<boolean, this>(); }
    @doop
    get legs() { return doop<number, this>(); }
    @doop
    get food() { return doop<string, this>(); }
    constructor() {
        this.hasTail(true).legs(2);
    }
    describe() {
        const tail = this.hasTail() ? "a" : "no";
        return `Has ${this.legs()} legs, ${tail} tail and likes to eat ${this.food()}.`;
    }
}
The library doop exposes a single named feature, doop, and you can see it being used in three ways in the above snippet:
- As a class decorator, right above the Animal class: this allows it to "finish off" the class definition when the code is loaded into the JS engine.
- As a property decorator, above each property: this inserts a function that implements both get and set functionality.
- As a helper function, called inside each property getter.
Although not visible in that snippet, there's also a generic interface, Doop, returned by the helper function, and hence supported by each property:
interface Doop<V, O> {
(): V;
(newValue: V): O;
}
That's a function object with two overloads. So to get the value of a property (as you can see happening in the describe method) you call it like a function with no arguments:
if (a.hasTail()) { ...
It's a little annoying that you can't just say:
if (a.hasTail) { ...
But that would rule out being able to "set" (make a modified clone) through the same named feature on the object. If the type of hasTail were boolean, we'd be stuck.
There's a particular pattern you follow to create a property in an immutable class. You have to define it as a getter function (using the get prefix), and return the result of calling doop as a helper function, which is where you get to specify the type of the property. Note: you only need to define a getter; doop provides getting and pseudo-setting (i.e. cloning) via the same property, with static type checking.
See how the constructor is able to call its properties to supply them with initial values. This looks a lot like mutation, doesn't it? Well, it is. But it's okay because we're in the constructor. doop won't let this happen on properties of a class that has finished being constructed and therefore is in danger of being seen to change (NB. you can leak a reference to your unfinished object out of your constructor by passing this as an argument to some outside function… so don't do that).
And in the describe method (which is just here as an example, not part of any mandatory pattern) you can see how we retrieve the values by calling properties as if they were methods, this time passing no parameters.
But what's not demonstrated in this example is "setting" a value in an already-constructed object. It looks like this:
const a = new Animal();
expect(a.legs()).toEqual(2); // jasmine spec-style assertion
// create a modified clone
const b = a.legs(4);
expect(b.legs()).toEqual(4);
// original object is unaffected
expect(a.legs()).toEqual(2);
Inheritance is supported; a derived class can add more properties, and in its constructor (after calling super()) it can mutate the base class's properties. The runtime performance of a derived class should be identical to that of an equivalent class that declares all the properties itself.
One thing to be wary of is adding ordinary instance properties to a doop class. It would be difficult to effectively block this happening, and in any case there may occasionally be a good reason to do it, as long as you understand one basic limitation of it: ordinary instance properties belong to an instance. When you call a property to set its value, you are returned a new instance, and there is no magic that automatically copies or initialises any instance fields. Only the other doop properties will have the same values as in the original instance. Any plain instance fields in the new instance will have the value undefined.
For simplicity's sake, just make sure in a doop class that all data is stored in doop properties.
Implementation
The implementation of the cloning is basically the one described here so it's super-fast.
I mentioned there's a hack involved, and it's this: I needed a way to generate, from a single declaration in the library user's code, something that can perform two completely different operations: a simple get and a pseudo-set that returns a new instance. That means I need each property to be an object with two functions. But if I do that literally, then a get and a set would look like this:
// A bit ugly
const v = a.legs.get();
const a2 = a.legs.set(4);
I don't like the verbosity, for starters. But there's a worse problem caused by legs being an extra object in the middle. Think about how this works in JS. Inside the get function this would point to legs, which is just some helper object stored in a property defined on the prototype used by all instances of the Animal class. It's not associated with an instance. It doesn't know what instance we're trying to get a value from. I could fix this by creating a duplicate legs object as an instance property on every Animal instance, and then giving it a back-reference to the owning Animal, but that would entirely defeat the whole point of the really fast implementation, which uses a secret array so it can be rapidly cloned, whereas copying object properties is very much slower.
Or I could make legs, as a property getter, allocate a new middle object on the fly and pass through the this reference. So every time you so much as looked at a property, you'd be allocating an object that needs to be garbage collected. Modern GCs are amazing, but still, let's not invent work for them.
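For illustration, the rejected "middle object on the fly" option might look roughly like the sketch below. MiddleObjectAnimal and its legCount field are hypothetical names made up for this example, and the storage is a plain field rather than anything doop really does.

```typescript
// Hypothetical class showing the rejected design, not doop's implementation.
class MiddleObjectAnimal {
    private readonly legCount: number;

    constructor(legCount: number = 2) {
        this.legCount = legCount;
    }

    // Every read of `legs` allocates a fresh helper object.
    // The arrow functions capture `this`, so they know the owning instance.
    get legs() {
        return {
            get: (): number => this.legCount,
            set: (value: number): MiddleObjectAnimal => new MiddleObjectAnimal(value),
        };
    }
}

const pet = new MiddleObjectAnimal();
const legCountNow = pet.legs.get();       // reads the value via the helper
const fourLegged = pet.legs.set(4);       // returns a modified copy
const freshEachTime = pet.legs !== pet.legs; // two reads, two different helpers
```

The this problem is solved, but every property access now creates an object (plus two closures) just to be thrown away.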
So what if instead of properties, I made the user declare a function with two overloads for getting and setting? That solves the this problem, but greatly increases the boilerplate code overhead. The user would actually have to write two declarations for the overloads (stating the "property" type twice) and a third for the implementation:
// Ugh
@doop
legs(): number;
legs(v: number): this;
legs(v?: number): any { }
The function body would be empty because the doop decorator replaces it with a working version. But it's just a big splurge of noise so it's not good enough. And yet it's the best usage syntax available. Ho hum.
Lateral thinking to the rescue: in TypeScript we can declare the interface to a function with two overloads. Here it is again:
export interface Doop<V, O> {
(): V;
(newValue: V): O;
}
Note that O is the type of the object that owns the property, as that's what the "setter" overload has to return a new instance of.
Using a getter in the actual doop library looks like this:
const l: number = a.legs();
There are at least two possible interpretations of a.legs():
- legs is a function that returns the number we want.
- legs is a property backed by a getter function, that returns a function with at least one overload (): number, which when called returns the number we want.
To explain the second one more carefully: the part that says a.legs will actually call the getter function, which returns a second function, so a.legs() would actually make two calls. The returned function would need to be created on-the-fly so it has access to the relevant this, so this is very much like the GC-heavy option I described earlier.
But it's not possible to tell which it is from the syntax. And that's quite good. Because if we tell the TypeScript compiler that we're declaring a getter function that returns a function, it will generate JavaScript such as a.legs(). But at runtime, we can use the simple implementation where legs is just a function. The doop decorator can make that switcheroo, and we get the best of both worlds: a one-liner declaration of a property getter, and a minimal overhead implementation.
Well, it seemed nifty to me when I realised it would work!
So this is what the doop property decorator does: the user has declared a property, and all we care about is its name. All properties are the same at runtime: just a function that can be called to either get or clone-mutate.
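Here is a minimal sketch of that switcheroo, written without decorators so it runs with no special compiler flags. It is not doop's actual source: replaceGetter, HasValues, SketchAnimal, the public values array and the explicit slot index are all assumptions made for this example. A real property decorator would receive the prototype and the property name in much the same way.

```typescript
interface Doop<V, O> {
    (): V;
    (newValue: V): O;
}

// Hypothetical storage shape: all values live in one array per instance.
interface HasValues {
    values: unknown[];
}

// Hypothetical stand-in for the property decorator: swaps the declared
// getter for a plain function stored on the prototype.
function replaceGetter(proto: object, name: string, index: number): void {
    Object.defineProperty(proto, name, {
        configurable: true,
        writable: true,
        value: function (this: HasValues, ...args: unknown[]): unknown {
            if (args.length === 0) {
                return this.values[index]; // get
            }
            // clone-mutate: same prototype, copied array, one slot changed
            const clone = Object.create(Object.getPrototypeOf(this)) as HasValues;
            clone.values = this.values.slice();
            clone.values[index] = args[0];
            return clone;
        },
    });
}

class SketchAnimal implements HasValues {
    values: unknown[] = [2];

    // The compiler sees a getter that returns a function...
    get legs(): Doop<number, SketchAnimal> {
        throw new Error("replaced before first use");
    }
}

// ...but at runtime `legs` becomes an ordinary function.
replaceGetter(SketchAnimal.prototype, "legs", 0);

const a = new SketchAnimal();
const two: number = a.legs(); // one call, no helper object allocated
const b = a.legs(4);          // modified clone; `a` is untouched
```

The call sites type-check against the getter declaration, yet the getter body never runs, which is the whole trick.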