json-mobx - Like React, but for Data (Part 2)
This is a follow-on to MobX - Like React, but for Data, in which I noted the parallels between MobX and React.
- A
computed
"renders" a "view" of some data, and automatically updates when the source data changes. Like a React component, except generalised to cover any data, not just Virtual DOM. - An
observable
is like thesetState
facility in stateful React components, except that its automatic ability to notifycomputed
(andautorun
) observers works by spooky "action at a distance" and so doesn't have to take place inside one component.
But this still leaves one major feature of React unaddressed, and that is reconciliation. What is this about, and how can it be useful in a more general way in MobX?
What problem does reconciliation solve?
Suppose you have some source data, which may change at any time, and you want to maintain a transformed variant of it. Whenever the source data changes, you need to update your variant. Stupid example: the source data is a list of numbers, and you want a list of strings containing those numbers:
[1, 2, 3] --> ["1", "2", "3"]
By far the easiest way to do this is to throw away the old list of strings and recreate it from scratch, every time the number list changes. This is what HTML templating systems have long done, it's what your React component's render
does, and it's what computed
does.
The alternative, which is to identify the specific changes that have occurred in the input and poke corresponding specific changes into your output. This is hard - not in the above trival example, of course, but when the relationship between the input and output data is more complicated, it becomes almost a certainty that you'll get something wrong and your output data will drift out of step with the input. Much easier to throw away and rebuild.
By the way, it's tempting to label these two approaches as "functional-immutable" and "imperative-mutable". That would be a misunderstanding. The throw-away-and-rebuild approach does automatically avoid modifying old data (it discards it entirely), but the alternative can - and very often is - used in functional-immutable systems, to avoid cloning and discarding large complex data structures just because one little bit has changed. In fact it's exactly what Redux does! Unfortunately, the resulting combination of responsibilities is even worse: you not only have to correctly translate changes in input to changes in output, but you then have to "ripple up" to the root as well. Like trying to tie a knot in a tightrope while you're balancing on it. So this distinction isn't relevant here: when we talk about updating some data, we could mean actually mutating or functional-style non-destructive updating.
Okay, so what are the drawbacks of pure throw-away-and-rebuild? Well, we're creating and throwing away a lot of objects. Could be a problem (always measure before worrying too much though). More pressingly, what if the output objects we build have extra degrees of freedom? By which I mean they have their own bits of state that can change between changes to the input?
Drawing from the well of modern classics, let's say my input is a list of string describing tasks to be done. The source data says what tasks exist. I want a list of TodoItem
objects:
["Bake bread", [new TodoItem("Bake bread"),
"Impeach Trump", --> new TodoItem("Impeach Trump"),
"Cancel Brexit"] new TodoItem("Cancel Brexit")]
The interesting thing about TodoItem
s is that they have a completed
boolean, initially false
. The input strings do not. At any time, I want to be able to toggle the completed
boolean on any task, but also at any time the list of underlying strings may change.
When that happens, I don't want to lose the selection state of any surviving TodoItem
s. But that's what would happen if I used throw-away-and-rebuild. I would get a new set of TodoItem
s in which completed
is false
.
This is precisely the problem faced by React, and is far more important than concerns about performance, etc. A component can render complex stateful DOM elements, such as <video>
, alongside simple things like a <span>
showing the current playback time of the video clip. Each render
must not throw away and discard the video player's state, or your video would keep going back to the start every time the playback time ticked forward one second.
The solution, which gives us the best of all worlds for zero effort, is reconciliation. Well, I say zero effort: once a general reconciliation system has been implemented by some library (such as React), we applications authors don't have to lift a finger.
And reconciliation means that we (the application authors) can write code that feels like throw-away-and-rebuild, "rendering" a specification of the desired state, but then the library takes care of making the minimum number of spot adjustments to the current state in order to bring it into line with our newly rendered specifications, taking care not to destroy associated data. In this case, we'd render the fact that we want a specific set of TodoItem
s, and the library would add and delete actual TodoItem
objects from the list accordingly.
Approaches to serialization
Now we're going to take what appears to be a right-turn into another subject entirely. But it's not!
Serialization refers to any mechanism for turning a set of objects in memory into a flat stream suitable for storing (or sending over a wire) and later reassembling into another set of objects in memory. By far the most popular pattern for doing this has the following API:
const serializedStuff = api.save(myObject);
const newObject = api.load(serializedStuff);
So it's a way of cloning an object, via an intermediate representation that can be moved between processes. By "round-tripping" your object via some serializedStuff
, you get a separate object that is a duplicate of the original.
The MobX project itself hosts a fine implementation of this pattern.
This API earns "coolness" brownie points from the mere fact that it is pure. No objects are mutated in the making of this motion picture. And this does make it nice and simple. It also allows the type of the root object to vary freely.
But consider an alternative API:
const serializedStuff = api.save(myObject);
const newObject = new MyClass();
api.load(newObject, serializedStuff);
This API accepts an object that it will mutate to make it match serializedStuff
. In the example above the outcome is the same because I clone a new "blank" object to serve as the target. I can also deserialize back into the same object:
const oldVersion = api.save(myObject);
myObject.changeStuff();
api.load(myObject, oldVersion);
So I can roll back an object to a previous state. The load
implementation can be sophisticated. Where it's dealing with a collection, it can minimize the changes it makes to only those required to honour the specification in oldVersion
. Does this sound familiar?
Yes, the load
function is performing reconciliation. The save
function is rendering a "virtual" object model (playing the roll of a render
method), and the load
function is reconciling myObject
with that virtual object model.
So this kind of serialization mechanism is the remaining piece of the React feature set.
By the way, sophisticated serialization systems deal with things like proper graphs where two objects can refer to a third object and it takes care of serializing the third object only once. We're not going to worry about that here; we're only interested in simple tree-shaped object models.
Mobx-state-tree
Another extremely interesting project-in-progress called mobx-state-tree is currently incubating in the MobX project. It is very different from MobX itself in that it mandates a much more restricted structure and even a style for constructing objects, and also its own kind of runtime-type metadata system. So it's quite a big thing to bite into. But it has some very interesting features, particularly JSON patches.
In this context the striking thing about it is this pair of linchpin functions in its API:
getSnapshot(model)
: returns a snapshot representing the current state of the modelapplySnapshot(model, snapshot)
: updates the state of the model and all its descendants to the state represented by the snapshot
Yep, applySnapshot
is the second kind of load
function. Does it perform reconciliation? e.g. what does applySnapshot
do to an array?
@action applySnapshot(node: Node, target, snapshot): void {
// TODO: make a smart merge here, try to reuse instances..
target.replace(snapshot)
}
So yes, it looks like reconciliation is the eventual intention, even if it's not there yet.
In the introductory talk, at 44:40, Michel Weststrate makes a fantastic point: give a class a @computed
getter called json
that returns its JSON representation, and then apply this concept consistently across all the classes and collections that comprise your state tree, and then suddenly you have a very efficient way of grabbing an immutable snapshot whenever you need one. Subsequent requests for the JSON of an object that hasn't changed will return the same JSON structure without recomputing anything.
This is fantastic. It means we can have the convenience of mutating the state tree wherever that makes sense (it very often does in a UI) but if we want regular immutable snapshots of the state then we can get them, and they are effectively costless, sharing unchanged parts with previous snapshots.
And it means we already have the implementation of our api.save(obj)
- it's basically just: obj => obj.json
. Okay, onward to load
.
Computed properties can also have setters!
Oh yes. Maybe not such a well known feature, but totally legit. Extending the example from Michel's slide:
class Todo {
@observable id = 0;
@observable text = "";
@observable completed = false;
@computed get json() {
return {
id: this.id,
text: this.text,
completed: this.completed
};
}
set json(data: any) {
this.id = data.id;
this.text = data.text;
this.completed = data.completed;
}
}
Now this works both ways. And so load(obj, data)
is basically obj.json = data
.
The only downer with this is that it's going to be a pain writing classes where we have to list our properties three times: in the declaration, in the get json
and in the set json
. What a drag. So we could really use a decorator:
class Todo {
@json @observable id = 0;
@json @observable text = "";
@json @observable completed = false;
}
This simply builds a json
property, backed by a MobX computed
, that does the equivalent of the above hand-written example. Now we're really cooking.
The drawback of this (and this is a general problem with decorators anyway) is that TypeScript doesn't know that Todo
has a json
property. So for convenience we still need our load
/save
API, to perform a simple runtime check for the validity of the operation. I'll call them json.load
and json.save
to clarify that they operate on the json
property of objects. I'm not hugely troubled by strict type safety in serialization formats because, theoretically cool as it is, usually we evolve the format over time anyway.
Anyway, this establishes a general pattern. Wherever we need to finely control how an object is represented in JSON - or how it reconciles itself with some JSON instructions - we can just give it a custom @computed json
property. So with this one concept, any kind of extension or reusable utility can be added. This means we don't actually need a separate type metadata system. It's a lot thinner and more lightweight than the full mobx-state-tree implementation - less ambitious - but it means I can start using it right now and know that I'll never get stuck.
Okay, so clearly I've put this on github and npm.
Arrays and identifiers
Initially I included a Collection
class that wrapped an array and implemented json
to serialize it. This worked, but added some verbosity in regular usage. John Wright suggested building in support for arrays, and this made a lot of sense.
This is very much a MobX
-based library (json
is always @computed
), and clearly we need arrays to be observable in order to correctly update the JSON representation whenever the array changes. Therefore we extend observable arrays by adding a json
property to them. To construct such an enhanced array, we provide json.array
:
class TodoStore {
@json todos = json.array(() => new Todo());
}
What's the parameter for? It's so that the array knows how to construct a new item whenever it needs to, during reconciliation. In fact there's a shorter version for where the item can be new-constructed with no parameters:
class TodoStore {
@json todos = json.arrayOf(Todo);
}
In React the reconciliation process relies on each item in a list having a key
prop that is unique-valued within the list. It's exactly the same here, except that the unique valued property is called <id>
. Another difference is that you don't have to assign it. The library takes care of associating a unique (short) number identifier to your objects, and then saving it in the JSON representation so that it can be matched up in future reconciliations.
This means we can go ahead and rip the id
property out of the Todo
class. We don't need one. Reconciliation will work fine, regardless.
On the other hand, we might want to retain it and use a meaningful ID (say, from a database system) or a natural key from the data. If so, we can tell json.array
to use our ID property instead:
class TodoStore {
@json todos = json.arrayOf(Todo, "id");
}
The second parameter's type is keyof T
(a feature added in TypeScript 2.1), so it has to be the name of a property in the array's item type.
Undo (oh, and don't forget Redo)
The oft-claimed unique advantage of Redux, and indeed anything based on immutable data, is that your app gets an undo feature "for free". The truth of this depends on whether you regard the boilerplate overhead of Redux as "free".
But anyway, now we have effortless reconciliation and efficient snapshots, we can achieve precisely the same thing. Also because MobX already has built-in transactional updating, we can do things like batch up several operations into a single Undo/Redo step.
This is such a win for this pattern that I've included an Undo
class in the library itself. The implementation is very short but it has a subtlety.
You construct it passing the root model object of your state tree.
The enabled
property is false
to begin with. An autorun
is created in the constructor that therefore runs immediately. On this first run it just captures the current state snapshot and sets enabled
to true
.
This means that when the state tree changes, the handler runs again and this time enabled
is true, and it pushes the previously captured state to the undoStack
.
The enabled
property comes into play again when we need to execute the undo
or redo
actions (which are probably tied to some buttons in the UI). These are mutually symmetrical with respect to the two stacks: you pop from one stack to get the new current state, and you push the old current state onto the other stack. But before loading the new current state into the model, you set enabled
to be false, to avoid interfering with the undoStack
.
So there's your optimal, super easy undo/redo system, and it really is "for free". Just put @json
on the stuff you want to record.
What about polymorphism?
I've written up how this works in the README, but it should by now come as no surprise that we solve the problem of mixed object types appearing on the same array by defining a @computed json
property that takes care of it in a very simple way.
Have fun!
Those links again:
And for a working example, take a look at: