30 July 2023

Frameworks like React and Vue.js revolutionised UI programming by incorporating reactive programming, which allows writing dynamic UI code in an automatic, declarative way. However, the lineage of ideas that inspired them can get pretty muddled. In Evan You’s 2016 dotJS talk, he “instantly regretted” using the word reactivity in his title because of the “endless confusion of what reactive actually means.”

Definition policing is boring, but in this case there’s some genuinely interesting stuff that gets lost in the confusion. When you trace back to older ideas like functional reactive programming (FRP) from 1996, you see that most reactive systems in use today, even those that explicitly call themselves FRP, lack the important abstractions that would prevent buggy, inconsistent code.

Reactivity at its simplest looks something like a spreadsheet. A cell is a reactive value. Cells are either plain containers which you manually change, or they are formulae like SUM(A1:A9) which combine other cells together in an expression. Formulae automatically recompute when the cells they depend on change.

A system like this is easy to implement and has great properties. First, formulae have no side-effects, so when a cell changes, the order in which its dependent formulae are updated doesn’t matter, because recomputing one formula can’t affect anything outside it. Because of this, it’s also “glitch-free”: the whole sheet is essentially updated in one transaction; no intermediate, partial states have any effect.
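To make the spreadsheet model concrete, here is a minimal sketch in Python. The names `Cell` and `Formula` are made up for illustration, and it assumes the simplest possible strategy: formulae are pure functions that are re-evaluated whenever they are read, rather than being pushed updates. A real implementation would likely cache results, but the observable behaviour is the same.

```python
from typing import Callable


class Cell:
    """A plain container cell: holds a value that is changed by hand."""

    def __init__(self, value: float) -> None:
        self.value = value

    def get(self) -> float:
        return self.value


class Formula:
    """A derived cell: a pure function of other cells, evaluated on read."""

    def __init__(self, fn: Callable[..., float], *inputs) -> None:
        self.fn = fn
        self.inputs = inputs

    def get(self) -> float:
        # No side-effects here, so evaluation order cannot matter.
        return self.fn(*(cell.get() for cell in self.inputs))


a1 = Cell(1)
a2 = Cell(2)
total = Formula(lambda x, y: x + y, a1, a2)
doubled = Formula(lambda t: t * 2, total)

print(doubled.get())  # 6
a1.value = 10
print(doubled.get())  # 24
```

Because nothing is stored between reads, there is no moment at which `doubled` can be seen disagreeing with `a1` and `a2`.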

However, outside frameworks, the reactivity primitive programmers are more likely to have encountered is the observable, popularised by the ReactiveX libraries. Observables are streams which can be connected together and subscribed to. Crucially, they aren’t like spreadsheets; they are far looser. In spreadsheets, each cell has a current value. Observables don’t have current values; they’re streams.

We can represent spreadsheets with observables, though, which most libraries do. We need to interpret a stream as a stream of updates to the value, or of samples of a value over time. Plain cells are just plain observables which we send values through. And to make formulae, we can use the CombineLatest ReactiveX operator, which continually collects the latest values from all its input observables, runs them through a function, and emits the result.
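As a rough sketch of that encoding, here is a toy `Observable` with a hand-rolled `combine_latest`. This is not the ReactiveX API, only an approximation of what the operator does, and all the names are hypothetical.

```python
from typing import Any, Callable, List

_MISSING = object()  # marks an input that has not emitted yet


class Observable:
    """A bare-bones stream: it has subscribers but no current value."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Any], None]] = []

    def subscribe(self, fn: Callable[[Any], None]) -> None:
        self._subscribers.append(fn)

    def emit(self, value: Any) -> None:
        for fn in list(self._subscribers):
            fn(value)


def combine_latest(fn: Callable[..., Any], *sources: Observable) -> Observable:
    """Emit fn(latest values) whenever any source emits, once all have emitted."""
    out = Observable()
    latest = [_MISSING] * len(sources)

    def make_handler(index: int) -> Callable[[Any], None]:
        def handler(value: Any) -> None:
            latest[index] = value
            if all(v is not _MISSING for v in latest):
                out.emit(fn(*latest))
        return handler

    for index, source in enumerate(sources):
        source.subscribe(make_handler(index))
    return out


a1 = Observable()
a2 = Observable()
total = combine_latest(lambda x, y: x + y, a1, a2)

seen = []
total.subscribe(seen.append)
a1.emit(1)    # nothing emitted yet: a2 has no value
a2.emit(2)    # emits 3
a1.emit(10)   # emits 12
print(seen)   # [3, 12]
```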

Problem solved, right?

Wrong! By building the system around a stream of updates, we’ve thrown out the guarantees we got from a spreadsheet.

First, updates can cause side-effects. Observables can be subscribed to with arbitrary functions, which could interfere with each other: this means that the order they run in is important and can lead to different results. Picture two subscriptions which send values through the same observable: their running order now affects the order of events going through the system.

Second, because of this, the system is no longer glitch-free. Suppose we have a formula with two inputs, like A1 + A2. If both A1 and A2 change in response to the same thing, their updates still happen one at a time, so the formula observable emits two separate updates. The first of these formula updates is an intermediate state which shouldn’t be seen; only one of A1 or A2 has its new value, even though both were supposed to change at the same time. Subscriptions downstream of the formula might do things based on this invalid state.
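The glitch is easy to reproduce with a toy push-based observable. In this sketch (hypothetical names again, not any particular library), A1 and A2 are both derived from a single `source`, so they are "supposed" to change together:

```python
class Observable:
    """A bare-bones stream: subscribers only, no current value."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, fn):
        self._subscribers.append(fn)

    def emit(self, value):
        for fn in list(self._subscribers):
            fn(value)


def mapped(source, fn):
    """Derive a new observable by applying fn to each value of source."""
    out = Observable()
    source.subscribe(lambda v: out.emit(fn(v)))
    return out


def combine_latest(fn, left, right):
    """Emit fn(left, right) whenever either input emits, once both have."""
    out = Observable()
    latest = {}

    def update(key, value):
        latest[key] = value
        if len(latest) == 2:
            out.emit(fn(latest["left"], latest["right"]))

    left.subscribe(lambda v: update("left", v))
    right.subscribe(lambda v: update("right", v))
    return out


source = Observable()
a1 = mapped(source, lambda x: x)        # A1 = source
a2 = mapped(source, lambda x: x * 10)   # A2 = source * 10
total = combine_latest(lambda x, y: x + y, a1, a2)

seen = []
total.subscribe(seen.append)
source.emit(1)
source.emit(2)
print(seen)  # [11, 12, 22]
```

The 12 in the middle is the glitch: it pairs the new A1 (2) with the stale A2 (10), a state the spreadsheet would never have shown.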

There are a variety of hacks libraries use to work around this, but none really plug the hole. We could, for example, only collect subscriptions as we update, then batch run them all at the end. But usually the update process in observables libraries is built on subscriptions; so now we need two kinds, an instant subscribe and a deferred subscribe, and maybe manual batching control, and it just turns into a bit of a mess. Users have to remember to make their subscriptions follow specific rules to avoid bugs (see React’s useEffect or MobX’s Reactions).
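For illustration, the "collect now, run at the end" workaround might look something like the sketch below. `Batcher`, `batch` and `schedule` are invented names, and the reactive state is reduced to a plain dict to keep the example short.

```python
from contextlib import contextmanager


class Batcher:
    """Defers callbacks until the outermost batch finishes."""

    def __init__(self):
        self._depth = 0
        self._pending = []  # deferred callbacks, in first-scheduled order

    @contextmanager
    def batch(self):
        self._depth += 1
        try:
            yield
        finally:
            self._depth -= 1
            if self._depth == 0:
                self.flush()

    def schedule(self, callback):
        # Each callback runs at most once per batch.
        if callback not in self._pending:
            self._pending.append(callback)
        if self._depth == 0:
            self.flush()

    def flush(self):
        pending, self._pending = self._pending, []
        for callback in pending:
            callback()


state = {"a1": 1, "a2": 10}
seen = []


def report():
    # A "deferred subscription": it reads the state only when flushed.
    seen.append(state["a1"] + state["a2"])


batcher = Batcher()
with batcher.batch():
    state["a1"] = 2
    batcher.schedule(report)
    state["a2"] = 20
    batcher.schedule(report)

print(seen)  # [22]
```

The intermediate sum of 12 is never reported, but only because every write happened inside `batch()` and `report` was registered as deferred. Forget either rule and the glitch is back, which is exactly the bookkeeping burden described above.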

Even if you do patch over glitching, you’ve still got one unsolvable problem remaining: sample-rate dependence. Remember earlier we said we were representing values as streams of samples? This isn’t just sleight of hand; it has important consequences for what we can do with a stream.

Being sample-rate independent means that, if we sample more or less frequently, it shouldn’t change the meaning of anything in the system. Of course the actual samples will be different, but the idea is that sampling more frequently should converge us closer and closer to the true, continuously-varying value. For example, a formula which counts the number of input samples is not sample-rate independent. Neither is a formula which reduces/folds/accumulates samples (consider counting as a special case of reducing).
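A small numeric sketch shows the difference. Here we sample the hypothetical signal f(t) = t over one second at three rates and compare a sample count against a crude integral (a left Riemann sum); the helper names are made up for this example.

```python
def sample(signal, duration, rate):
    """Sample signal(t) at `rate` samples per second for `duration` seconds."""
    n = int(duration * rate)
    dt = 1.0 / rate
    return [signal(i * dt) for i in range(n)], dt


def count(samples):
    """Sample-rate dependent: the answer is just how often we looked."""
    return len(samples)


def integral(samples, dt):
    """Sample-rate independent in the limit: a left Riemann sum."""
    return sum(s * dt for s in samples)


def signal(t):
    return t


results = {}
for rate in (10, 100, 1000):
    samples, dt = sample(signal, 1.0, rate)
    results[rate] = (count(samples), integral(samples, dt))
    print(rate, results[rate])
```

The counts come out as 10, 100 and 1000, which tells us nothing about the signal. The integrals come out at roughly 0.45, 0.495 and 0.4995, closing in on the true value of 0.5 as the rate goes up.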

Unfortunately, very few reactivity libraries put the appropriate guardrails on, which lets users write meaningless code. Bacon.js, for example, has a type representing a reactive value, but lets you reduce it and subscribe to it.

To fully avoid these problems, a library needs to do two things:

  • Properly distinguish two types: continuous values or “signals”, and discrete event streams. Neither can be a subclass of the other; signals are not just events which save their last values. Signals can be used in formulae, but events can’t. Events can be counted, but signal samples can’t.

  • Disallow or discourage arbitrary subscriptions. Use other techniques to respond to changes, such as more-advanced primitives like the integral or impulseIntegral.
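As a rough sketch of the first point, here are two deliberately unrelated Python types. This is not Yampa's API or anyone else's; `Signal`, `Event`, `lift2`, `count_until` and `hold` are names chosen for illustration, and time is modelled as a plain float.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class Signal:
    """A value defined at every time t. Usable in formulae, not countable."""
    at: Callable[[float], Any]

    def map(self, fn: Callable[[Any], Any]) -> "Signal":
        return Signal(lambda t: fn(self.at(t)))


def lift2(fn: Callable[[Any, Any], Any], x: Signal, y: Signal) -> Signal:
    """A two-input formula: both inputs are read at the same instant."""
    return Signal(lambda t: fn(x.at(t), y.at(t)))


@dataclass(frozen=True)
class Event:
    """Discrete occurrences as (time, value) pairs, sorted by time."""
    occurrences: Tuple[Tuple[float, Any], ...]

    def count_until(self, t: float) -> int:
        # Counting is meaningful here: occurrences are not samples.
        return sum(1 for time, _ in self.occurrences if time <= t)

    def hold(self, initial: Any) -> Signal:
        """Explicit conversion: a step signal holding the latest occurrence."""
        def at(t: float) -> Any:
            value = initial
            for time, v in self.occurrences:
                if time <= t:
                    value = v
            return value
        return Signal(at)


clicks = Event(((1.0, "a"), (2.5, "b")))
clock = Signal(lambda t: t)
last_click = clicks.hold("none")
label = lift2(lambda t, c: f"{t:.1f}:{c}", clock, last_click)

print(clicks.count_until(2.0))  # 1
print(label.at(3.0))            # 3.0:b
```

`Signal` has no way to count or fold its samples, and `Event` cannot be dropped into `lift2` without first saying what its value is between occurrences (here via `hold`). Neither type exposes a subscribe method, so how often the outside world samples a signal cannot leak into the program's meaning.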

For an example, check out Yampa for Haskell. Unfortunately though, these ideas don’t seem to have made it big in the Web space, or even in non-functional languages. Could they be a useful improvement? I’m experimenting with the idea in Python at refs.py and refs_gl.py.