Time reversible events
The current state of a system might be represented by the contents of a database table. The table could have many columns of various data types, but to simplify we'll say there is only a single integer column, so our table is just a list of integers. (Each integer could in reality be a foreign key into another table holding immutable and distinct tuples, each describing a frozen configuration of a more interesting entity such as a person, a medical record and so on. So we can make this simplification without loss of generality.)
We can describe the history of the table's state by creating another table in which the rows represent events. Of course in a database we need to think about atomic transactions; some kinds of change may not make sense by themselves and must always occur atomically as part of a transaction along with certain related changes. Therefore an event always belongs to a batch, and a batch may contain multiple events. Batches occur in a definite order, so we can number them (that is, the primary key of a batch is a sequence number). A batch is also the ideal place to record the clock time that the events in the batch were applied.
Aside from the batch_id, an event has just two columns:
Language Smackdown: Java vs. C#
A pithy quote:
There are only two kinds of languages: the ones people complain about and the ones nobody uses.
Now you might say that's exactly what the creator of C++ would say to cover his tracks. But the point is that Java and C# are languages that are 20 to 25 years old, widely used (maybe 15 million users between them), and are both cursed with toxic corporate associations. When Java first came along it was cool, if a programming language ever could be. But this was because the only Java code in the wild was neat little animations and things like that. As soon as it became widely used for boring line-of-business apps, it began to be thought of as the new COBOL.
Domesday '86 Reloaded (Reloaded)
TL;DR: I built this yesterday.
Back in the mid-1980s my primary (same thing as elementary) school suddenly told us we had to come up with content for a digital encyclopedia of the UK, conceptualised as a successor to William I's audit of his freshly conquered territory carried out exactly 900 years earlier. Yes, like the crowd-sourced Internet utopia envisaged in the mid-2000s, "Web 2.0", in which ordinary people are both the content providers and consumers, and will become empowered, and definitely won't turn into Nazis and storm the citadels of democracy. All the good parts of that somehow travelled back in time to 1985 (Marty). Except it wasn't really grass-roots, it was astro-turf: the BBC conspired with the schools to make it happen, in a top-down patrician Lord Reith type of way.
Even so, to my childhood mind, it's like I'm Ford Prefect doing field research for The Guide and unfortunately I've been stranded on this pathetic little planet for slightly longer than expected and I'm not really from Guildford after all (or indeed ever).
The Blob Lottery
The simplest, cheapest and fastest form of storage in the cloud is the blob. It's very bare-bones, making no attempt to compete with higher-level storage offerings that help you by making your data searchable every which way. It's little more than a remote file system. But if you can put up with those limitations, you can save $$$.
Today I'm going to consider the question: if we have a dataset that we want to store in the cloud, how far should we go in breaking it down into pieces? However we decide to organise the data (indexed, sorted or just however-it-comes), there are good reasons to want to break it into pieces. Regardless of any other choices we might make, I want to see what impact this "granularity" decision will have.
My particular use case involves a dataset of many millions of items, of which thousands are updated during a nightly "processing run". A naive first guess is that I should cut the data into small enough pieces so that each of these nightly batch updates is required to read and write a minimal subset. The fewer raw bytes I have to transfer over the network, the faster my process should go, right?
Abstraction is a Thing
When aliens finally pay us a visit and they start floating around our cities, struggling to pronounce greetings from a phrasebook, we will no doubt say to one another, "Apparently aliens are a thing now." When we recognise something has started happening all the time, we call it "a thing". Or we might remind our friend in a tone of heavy irony, after they accidentally walk into a lamppost, "Yup, lampposts are still a thing."
Of course, deep down every "thing" is just subatomic particles and forces. There is nothing else. Except of course there is! It's a frustrating thing about casual pop science explanations that they stray into that kind of obsessive reductionism. Things don't stop existing just because you found out what they're made of. No one seriously stops referring to chairs and tables when they learn about atoms. I can't put it better than Steven Pinker:
Good reductionism (also called hierarchical reductionism) consists not of replacing one field of knowledge with another but of connecting or unifying them. The building blocks used by one field are put under the microscope of another.
Unfortunate Bifurcations
Although this is going to seem like a series of picky complaints about C#, really it's about how any language has to evolve, and is a compromise between past and future, and the whole thing is quite difficult.
Also some speculation on what the future of language interoperability will be.
The kind of problem I'm going to pick on is where languages separate two concepts and treat them differently, making a virtue of the differences, but then it becomes a pain dealing with them generically. The language designers seem to be saying "You shouldn't need to treat these two things the same; they're fundamentally different. You're doing it all wrong!" And yet…
Two Cheers for SQL
What is there to say about this old stain on the technology landscape? Settle in…
SQL is not "cool". It probably never has been. On the one hand there are the technologies we hate, and on the other the technologies no one uses.
Having spent a few years going back and forth on the merits of SQL, I'm in a weird place. I now think it is both underrated and overrated, and not merely because other people are too extreme in their opinions. I genuinely think SQL is both a fine idea and a terrible idea at the same time. There is a way of using it that makes sense, and many other ways that don't.
Factory Injection in C#
Update - There was a nasty bug in the original version of this post! Where I previously registered the factories with AddSingleton, I now use AddTransient, and I call out the reason for this below.
The modern C# ecosystem (based on dotnet core, due to become .NET 5) enjoys a standard dependency injection system that, despite its minimalism, is pretty much all you need.
In some ways the ideal dependency injection system is nothing at all: isolate your components by writing an interface/class pair, and make each class accept interfaces to give it access to whatever services it needs. Very often the sole reason for the existence of the interface to go with each class is so that it can be mocked out in unit tests for classes that depend on it. (It's worth noting that in languages based on dynamically typed runtimes there is typically no need to do this - it's especially irksome to see this pattern being imported unnecessarily into TypeScript, where every class is already an interface.)
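To make the TypeScript aside concrete, here is a minimal sketch. The SystemClock and Greeter classes are made up for illustration. Because TypeScript compares types by shape, a class with only public members can be stubbed in a test by any object of the same shape, with no separate interface declared.

```typescript
// Hypothetical service: note there is no separate ISystemClock interface.
class SystemClock {
  hour(): number {
    return new Date().getHours();
  }
}

// The dependency is declared using the class name as a type.
class Greeter {
  constructor(private readonly clock: SystemClock) {}

  greet(who: string): string {
    return (this.clock.hour() < 12 ? "Good morning, " : "Hello, ") + who;
  }
}

// In a test, any object with the same public shape is accepted.
const fakeClock: SystemClock = { hour: () => 9 };
const morningGreeting = new Greeter(fakeClock).greet("Ada");
```

One caveat: this only holds while the class has no private or protected members. Once it does, TypeScript stops accepting look-alike objects, and an explicit interface starts to earn its keep.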
Hangfire - A Tale of Several Queues
If you've used Hangfire you know it's a really quick and easy way to give your app a queue of durable background jobs, with automatic retrying and a very nifty dashboard to let you see what's happening right now. Jobs can trigger further jobs and so a complex series of processing stages can be decoupled and spill out into a queue of little units of work.
You can set up one database (such as Redis) to store the state of all your jobs, and then multiple identical workers can attach to that database and munch through the jobs, taking them through the lifecycle:
[Enqueued] -> [Processing] -> [Finished]
How Does Auth work?
Abstract: Authentication is figuring out who someone is, and authorization is concerned with what they are allowed to do (or any other useful information about them). The basic approach is straightforward, but it becomes more useful and interesting when you consider many separate services that all need to collectively accept requests from the same users.
From Ember to React, Part 2: Baby, Bathwater, Routing, etc.
Abstract: Last time, which was too long ago, I explained why Ember is terrible and must be burnt to the ground. This time I'll begin to explain why it's not actually all terrible and we should run back into the burning building to rescue the good parts. This will lead us to answer the question: can React Router be used with MobX?
From Ember to React, Part 1: Why Not Ember?
Abstract: We just replaced our entire Ember codebase with a new one written in React, TypeScript and MobX. It was a pretty engrossing couple of weeks. THIS IS OUR STORY.
json-mobx - Like React, but for Data (Part 2)
This is a follow-on to MobX - Like React, but for Data, in which I noted the parallels between MobX and React.
- A computed "renders" a "view" of some data, and automatically updates when the source data changes. Like a React component, except generalised to cover any data, not just Virtual DOM.
- An observable is like the setState facility in stateful React components, except that its automatic ability to notify computed (and autorun) observers works by spooky "action at a distance" and so doesn't have to take place inside one component.
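The two ideas can be sketched with toy stand-ins. ToyObservable and ToyComputed are invented names, and this is emphatically not how MobX is implemented (there is no unsubscribing, batching or autorun here). It only shows the shape of the trick: reading an observable while a computed is being evaluated records the dependency, and writing to it later marks that computed as stale.

```typescript
// Toy stand-ins for the two ideas above; this is NOT how MobX is implemented.
type Subscriber = () => void;

// The invalidation callback of the computed currently being evaluated, if any.
let currentInvalidate: Subscriber | null = null;

class ToyObservable<T> {
  private subscribers = new Set<Subscriber>();
  constructor(private current: T) {}

  get(): T {
    // Reading inside a computed registers that computed as a dependent.
    if (currentInvalidate) this.subscribers.add(currentInvalidate);
    return this.current;
  }

  set(next: T): void {
    this.current = next;
    // Notify dependents, wherever in the program they happen to live.
    this.subscribers.forEach(notify => notify());
  }
}

class ToyComputed<T> {
  private cached!: T;
  private stale = true;
  constructor(private readonly derive: () => T) {}

  // Arrow function so it can be stored as a subscriber without losing `this`.
  invalidate = (): void => { this.stale = true; };

  get(): T {
    if (this.stale) {
      const previous = currentInvalidate;
      currentInvalidate = this.invalidate;
      try { this.cached = this.derive(); } finally { currentInvalidate = previous; }
      this.stale = false;
    }
    return this.cached;
  }
}

const price = new ToyObservable(10);
const quantity = new ToyObservable(3);
const total = new ToyComputed(() => price.get() * quantity.get());

const totalBefore = total.get();   // 30
quantity.set(4);                   // marks `total` as stale
const totalAfter = total.get();    // 40, recomputed on demand
```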
But this still leaves one major feature of React unaddressed, and that is reconciliation. What is this about, and how can it be useful in a more general way in MobX?
Redux in Pieces
Last July I noted down my thoughts on Redux with some hints of the concerns that eventually led to Immuto.
I've since rediscovered my love of observable and computed via MobX, which is like the good parts of Knockout.js made even better by a very careful, thoughtful implementation.
Even so, this is not the same thing as abandoning immutability and purity. There's nothing stopping you using those techniques within a system of observables. Indeed bidi-mobx abstracts away all mutation and allows entire UIs to be declared from pure expressions. The data transformation is carried out by objects called adaptors that contain pairs of pure functions between View and Model representations. Only the user gets to do mutation!
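As a rough sketch of what such a pair of pure functions could look like: the Adaptor interface, its render and parse method names, and the wholeNumber example below are all assumptions made for illustration, and the real bidi-mobx API may well differ.

```typescript
// Hypothetical shape for an adaptor; bidi-mobx's real API may differ.
interface Adaptor<View, Model> {
  render(model: Model): View;         // Model -> View, pure
  parse(view: View): Model | Error;   // View -> Model, pure; may reject bad input
}

// Example: a text box (View = string) editing a whole number (Model = number).
const wholeNumber: Adaptor<string, number> = {
  render: model => String(model),
  parse: view => {
    const trimmed = view.trim();
    return /^-?\d+$/.test(trimmed)
      ? Number(trimmed)
      : new Error(`"${view}" is not a whole number`);
  },
};

const shown = wholeNumber.render(42);       // "42"
const accepted = wholeNumber.parse(" 17 "); // 17
const rejected = wholeNumber.parse("abc");  // an Error value, nothing is thrown
```

Neither function touches any state, so the only thing that ever changes is the text the user types.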
Box 'em! - Property references for TypeScript
This concerns quite an abstract, simple building block, but it is a neat tool for use with React and MobX. In MobX there's a utility observable.box (docs). But I don't want to use that to create all my properties and have to put .get() after every read access. I want to use the cool @observable decorator and just fetch my properties directly, and assign new values with =. What I need is a way to box a property. Oh, and it better be statically type checked in TypeScript.
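A minimal sketch of the general idea, leaving MobX out entirely. BoxedValue and boxProperty are hypothetical names rather than the API of the library this post introduces. The point is only that keyof lets the compiler check both the property name and the value type.

```typescript
// A boxed property: a small object that reads and writes one property of a target.
interface BoxedValue<T> {
  get(): T;
  set(value: T): void;
}

// `boxProperty` is a hypothetical helper name, not the API of the library in the post.
function boxProperty<T, K extends keyof T>(target: T, key: K): BoxedValue<T[K]> {
  return {
    get: () => target[key],
    set: value => { target[key] = value; },
  };
}

class Person {
  firstName = "Homer";
  age = 39;
}

const homer = new Person();
const ageBox = boxProperty(homer, "age");   // inferred as BoxedValue<number>
ageBox.set(ageBox.get() + 1);               // homer.age is now 40
// boxProperty(homer, "agee");              // compile error: no such property
// ageBox.set("forty");                     // compile error: string is not a number
```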
For the overall idea, see the project page, or just look at the takeaway:
TypeScript - What's up with this?
JavaScript's this keyword is horrible. The value it assumes inside a function depends on precisely how the function is called:
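A small sketch of three ways to call the same function, using a made-up ClickCounter class. It assumes the code runs as ordinary modern JavaScript, where class bodies are always strict mode.

```typescript
class ClickCounter {
  clicks = 0;
  increment(): number {
    this.clicks += 1;      // relies on `this` being the counter
    return this.clicks;
  }
}

const clickCounter = new ClickCounter();

// 1. Called as a method: `this` is the object before the dot.
const viaMethod = clickCounter.increment();          // 1

// 2. Copied into a variable and called bare: `this` is undefined
//    (class bodies are always strict mode), so the body throws.
const detached = clickCounter.increment;
let detachedFailed = false;
try {
  detached();
} catch {
  detachedFailed = true;
}

// 3. Explicitly bound: `this` is fixed no matter how it is called.
const bound = clickCounter.increment.bind(clickCounter);
const viaBound = bound();                            // 2
```

Note that the compiler is perfectly happy with the second call. Nothing in the type of increment records that it needs a particular this.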
MobX - Like React, but for Data
Catching up on blogged opinions about MobX and where it fits in (especially in relation to Redux), I see much confusion. There is a suspicion of it arising from fear of mutability. It has none of the frameworky ceremony of Redux, and that seems to cause anxiety in some.
Even its defenders seem a little apologetic, like MobX is okay despite the heresy of allowing data to be mutable and object-oriented. The great Basarat even humorously welcomed me to the dark side!
I'm fine with being on the edgy team. You'll usually find me in my leather jacket and shades, posing on my parked Harley Davidson and chewing on a matchstick, intimidating the townspeople. Why? I don't have to explain myself to you, lady.
Eventless - XAML Flavoured
About four years ago, being so taken with the data modeling approach used in Knockout.js, I wanted to recreate it for C#. At the time I wasn't actively using C# so I never got to really use it and left it alone.
But in the last year and a half I've written a few view models for a WPF application. The first time I did it I couldn't believe how primitive and laborious it was in comparison. So I started idly messing with Eventless in my spare time - mostly deleting stuff - to make it XAML-friendly.
Just like Knockout, and now MobX, it makes the process delightfully simple. You just declare stuff and it works!
Immuto - Epilogue
It's been a couple of months since I had a scrap of time to do anything with Immuto - I've been up to my knees in WPF/C# instead (working for a living).
This break has given me a new perspective (aside from the obvious one that WPF is yucky). The executive summary is that I don't see myself ever using Immuto seriously. The way I look at it now is almost as a satire on the rigid idea of "single reducer function for the whole application state". It wasn't intended that way! I was genuinely into it and was expecting to use it in my job. But now it looks very different. And as Immuto is just a flavour of Redux, it's a broader comment on Redux itself.
What do I mean by a satire? I mean it's like I was trying to show the absurdity of something by pretending to take it seriously. (Except I was taking it seriously.) My dad told me a story from around 1969 when he went to a conference. The latest hot debate topic at the time was Goto Considered Harmful, and some speaker put some source code on the overhead projector and invited the room to critique it. Hands went up and all the suggestions were to get rid of the GOTOs, of course. So as a group they began editing the code to try and get rid of the GOTOs and be good Structured Programmers, and the structure of the program became more and more absurd and unreadable as the exercise progressed.
Immuto - Radical Unification
Immuto continues to evolve rapidly. To ensure that I comply with Semantic Versioning, in which major version 0 implies an unstable API, I've been making major breaking changes every day or so.
The major shift since the first write-up is left-to-right cursor composition. Example - here's the signature of a function that gets a book from a shelf:
Immuto - Working with React (An Example)
UPDATE - I'm in the move-fast-and-break-things phase so a couple of details in here are already out of date. In particular, properties are now unified with cursors. See the various repos for details.
In Immuto - Strongly Typed Redux Composition I introduced the Immuto library by coyly describing a wish-list of features, as if I hadn't already written the darn thing. Shucks!
What I didn't do was show how to make a working UI in React, using Immuto to define all the actions and the structure of the store. The missing piece is another package:
Immuto - Strongly Typed Redux Composition
What's good about Redux, I once asked, and I answered with a few things. Like React, it is one of those rare outbreaks of sanity that happen now and then. Read the docs, they're easy.
There's very little to the library (which is a good thing), because the main thing it implements is the store, which in its basic form is a very simple idea. I noted before how it says very little about composition patterns. I want ways of plugging reducers together, but with complete static type safety, so that it is not possible to dispatch the wrong kind of action, or an action whose data is not of the right type.
One composition feature is combineReducers, which from a static typing perspective leaves us nowhere to go. Sometimes this happens because TypeScript is lacking some capability, but sometimes it's just because the library has done something undesirable and I think that's the case here, for reasons I will now go into at great length.
TypeScript - What is a class?
In TypeScript, a class is really two types and a function. One type is the type of the function itself, and the other type is the type of the object returned when you call the function. Try this:
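A minimal sketch of the idea, using a made-up Book class:

```typescript
class Book {
  constructor(public title: string) {}
}

// 1. At runtime, the class is just a (constructor) function.
const kind = typeof Book;                       // "function"

// 2. `Book` in a type position is the type of the objects it creates.
const novel: Book = new Book("Emma");

// 3. `typeof Book` in a type position is the type of the function itself.
const BookMaker: typeof Book = Book;
const another = new BookMaker("Persuasion");

// The instance type can also be satisfied without calling the constructor,
// because TypeScript compares shapes rather than names.
const lookalike: Book = { title: "Not made by new" };
```

The last line type checks, but lookalike instanceof Book is false at runtime, which is a hint that the instance type and the constructor function really are separate things.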
TypeScript and runtime typing - EPISODE II
Prompted by a revealing comment from Anders Hejlsberg.
Something wonderful happened between typescript@beta and typescript@rc (i.e. just in time for version 2.0).
Way, way back in TypeScript 1.8 (February 2016!) we gained the ability to use string literals as types:
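For anyone who hasn't met them, a minimal sketch of a string literal type. The Alignment type and the pad function are made up for illustration.

```typescript
// A type whose only legal values are these three exact strings.
type Alignment = "left" | "center" | "right";

function pad(text: string, width: number, align: Alignment): string {
  const gap = Math.max(0, width - text.length);
  switch (align) {
    case "left":   return text + " ".repeat(gap);
    case "right":  return " ".repeat(gap) + text;
    case "center": {
      const before = Math.floor(gap / 2);
      return " ".repeat(before) + text + " ".repeat(gap - before);
    }
  }
}

const padded = pad("hi", 6, "right");   // "    hi"
// pad("hi", 6, "middle");              // compile error: not assignable to Alignment
```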
TypeScript and runtime typing
Prompted by this question on Reddit.
I'd want to declare a type that points to class extending another class. Please note, a CLASS not INSTANCE. I've tried something like this:
type EventClass = class extends Event;
type Listener = (data: class extends Event) => void;
and later on:
private handlers: Map<EventClass,Listener[]>;
But unfortunately this syntax does not work. How I can declare a type that points to CLASS extending another CLASS?
You want a runtime value that specifies a type of event, so you can use it as the key in a Map.
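One way this could be sketched is to use the class itself, which is an ordinary runtime value, as the key. AppEvent, UserLoggedIn and EventBus are made-up names standing in for the question's Event hierarchy, and this is just one possible shape for an answer.

```typescript
// Hypothetical base class standing in for the question's `Event`.
class AppEvent {}
class UserLoggedIn extends AppEvent {
  constructor(public userName: string) { super(); }
}

// "A class extending AppEvent" = a constructor function producing some E.
type EventClass<E extends AppEvent> = new (...args: any[]) => E;
type AppListener<E extends AppEvent> = (data: E) => void;

class EventBus {
  // The class itself (a runtime value) is the Map key.
  private handlers = new Map<EventClass<AppEvent>, AppListener<any>[]>();

  on<E extends AppEvent>(type: EventClass<E>, listener: AppListener<E>): void {
    const existing = this.handlers.get(type) ?? [];
    this.handlers.set(type, [...existing, listener]);
  }

  emit(data: AppEvent): void {
    // `constructor` recovers the class a given instance was made from.
    const type = data.constructor as EventClass<AppEvent>;
    (this.handlers.get(type) ?? []).forEach(listener => listener(data));
  }
}

const bus = new EventBus();
const seen: string[] = [];
bus.on(UserLoggedIn, e => seen.push(e.userName));   // `e` is inferred as UserLoggedIn
bus.emit(new UserLoggedIn("ada"));
```

This sketch only dispatches on the exact class of the emitted object. Listeners registered for a base class are not called for its subclasses.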
What's good about Redux
Redux is based on a series of really simple what-if questions:
- What if all the data in your app was immutable?
- Okay, now it's stuck. But what if there was only a single solitary mutable variable holding the complete state for your entire app? To change any bit of state, you just assign a slightly different immutable tree to that variable.
- And what if the only way to mutate the state was to create a POJO describing a high-level action, and dispatch it through a single giant processing system, describing the change to make?
A number of interesting advantages follow from sticking to this discipline. It's ideal for ReactJS. You can log everything your users do and replay it, stuff like that. You can store the state snapshots, or just the actions, or both. You can recover your app by loading in an old snapshot and then playing the recent actions to bring it up to date. If you want to know the complete story of how your application ended up in the state it's in now, you've got it. And aside from these nice capabilities, it's worth remembering a lot of bugs arise from fiddling with mutable state at the wrong time. Who needs that?
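The three what-ifs fit in a few lines. This is a toy sketch with invented names (counterReducer, dispatchAction, actionLog), not the Redux API, but it shows the single mutable variable, the plain-object actions and the replay trick.

```typescript
// Actions are plain objects describing what happened.
type CounterAction =
  | { type: "incremented"; by: number }
  | { type: "reset" };

interface CounterState { readonly count: number; }

// A pure function: old state + action -> new state, never mutating the old one.
function counterReducer(state: CounterState, action: CounterAction): CounterState {
  switch (action.type) {
    case "incremented": return { count: state.count + action.by };
    case "reset":       return { count: 0 };
  }
}

const initialState: CounterState = { count: 0 };

// The single mutable variable holding the whole application state.
let appState = initialState;
const actionLog: CounterAction[] = [];

function dispatchAction(action: CounterAction): void {
  actionLog.push(action);                      // keep the full story
  appState = counterReducer(appState, action); // swap in a new immutable tree
}

dispatchAction({ type: "incremented", by: 2 });
dispatchAction({ type: "incremented", by: 3 });
dispatchAction({ type: "reset" });
dispatchAction({ type: "incremented", by: 7 });

// Replaying the log from the initial snapshot reproduces the current state.
const replayed = actionLog.reduce(counterReducer, initialState);
```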
TypeScript multicast functions
Just as in JavaScript, C# functions are first class entities - you can pass them around in variables. There are two ways that C# differs from JavaScript.
- a method's this reference is automatically bound to the object it belongs to. In JS a "method" is just an object property that happens to contain a function. If copied into a separate variable and then called, there may or may not be a problem depending on whether the function internally refers to this.
- a function value (known as a "delegate") has operators +, -, +=, -= that allow it to be combined with other compatible functions to create a new single function that, when invoked, causes the constituent functions to be invoked.
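The second point can be approximated in TypeScript with a pair of helpers. combineHandlers and removeHandler are hypothetical names, loosely mirroring the + and - operators on C# delegates. They are simplified: removal drops every occurrence of a handler, and return values are ignored.

```typescript
// Combine several same-shaped callbacks into one function that calls them all in order.
type Handler<A extends unknown[]> = (...args: A) => void;

interface Multicast<A extends unknown[]> {
  (...args: A): void;
  readonly targets: ReadonlyArray<Handler<A>>;
}

// Rough analogue of `a + b` on delegates.
function combineHandlers<A extends unknown[]>(...handlers: Handler<A>[]): Multicast<A> {
  const invokeAll = (...args: A): void => handlers.forEach(h => h(...args));
  return Object.assign(invokeAll, { targets: handlers });
}

// Rough analogue of `a - b`: a new function without the given handler.
function removeHandler<A extends unknown[]>(from: Multicast<A>, handler: Handler<A>): Multicast<A> {
  return combineHandlers(...from.targets.filter(h => h !== handler));
}

const calls: string[] = [];
const logUpper = (s: string) => { calls.push(s.toUpperCase()); };
const logLength = (s: string) => { calls.push(String(s.length)); };

const both = combineHandlers(logUpper, logLength);
both("abc");                                       // calls both, in order
const onlyLength = removeHandler(both, logUpper);
onlyLength("hello");                               // calls logLength only
```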
Introducing doop
As great as Immutable.js is, especially with a TypeScript declaration included in the package, the Record class leaves me a little disappointed.
In an ordinary class with public properties we're used to being able to say:
TypeScript is not really a superset of JavaScript and that is a Good Thing
Questions:
- What does it mean for a programming language to be a superset of another programming language?
- What's a programming language?
- What's a program?
In this discussion, a program, regardless of language, is a stream of characters.
A new kind of managed lvalue pointer
It's already the evening and I haven't yet added anything to the C# compiler today, so here goes!
Properties have special support in C#, but they are not "first class". You can't get a reference to a property and pass it around as a value. Methods are much better served in this regard: delegates are a way to treat a method as a value. But they are just objects with an Invoke method.
So all we need is an interface with a Value property. Objects supporting that interface can represent a single property that can be passed around like any other value:
Using pointer syntax as a shorthand for IEnumerable
Another quickie extension to C#. In the current language, a type declaration T? is shorthand for Nullable<T>.
But equally important in modern C# programs are sequences of values, so a similar shorthand for IEnumerable<T> would be ideal.
The asterisk symbol is underused (you can suffix a type with asterisk to make a pointer, but only in unsafe contexts), and this was the choice made by the intriguing research language Cω that influenced LINQ, so let's copy that:
Adding crazily powerful operator overloading to C# 6
I'm going to show you how to enable a new kind of operator overloading by adding exactly four (4) lines of code to a single file in the C# 6 compiler preview. Yes, I was surprised too!
After seeing the video of Anders Hejlsberg showing how easy it is to hack the new open source C# compiler, I had to give it a try.
My aim was (I assumed) a lot more ambitious and crazy than his demo. I thought it would take ages to figure out. But it was still tempting to aim high and actually implement a substantial new feature, because there are a few I've been wondering about over the years.
Introducing Carota
This project was a lot of fun, but it has a big roadblock that has to be overcome by any text-based project: internationalisation. Hence I don't see it being generally useful outside of Western-language-only projects without a lot more work.
I'm developing a rich text editor from scratch in JavaScript, atop the HTML5 canvas. It's called Carota (Latin for carrot, which sounds like "caret", and I like carrots).
Here is the demo page, which is very self-explanatory, in that it presents a bunch of information about the editor, inside the editor itself, so you can fiddle with it and instantly see how it persists the text in JSON. As you can see, it's quite far along. In fact I suspect it is already good enough for every way I currently make use of rich text in browser applications. If your browser is old, it will not work. (Hint: IE8 is way old.)