prophet
@prophet

A lot of people think that functional programming is mostly about avoiding mutation at all costs.
Persistent data structures are great, and there is definitely some truth to this view, but it just doesn't hold up in reality.

Many data structures fundamentally require some form of mutation (e.g. union-find), and even something as simple as a persistent sequential data structure that allows both fast appends and fast random access is orders of magnitude more complicated, and still slower, than a naive dynamic mutable array.

So really, functional languages need to allow mutation in some way if they don't want nearly every program to suffer from completely unnecessary overhead in terms of both time and implementation complexity.

Existing languages have a few options here but I don't think any of these are good enough.

Option 1: Give up

This is the most common and obvious option: just allow programmers to use mutable data structures like they would in an imperative language.

While this has the advantage of being relatively straightforward and quite flexible, I really don't think it is the way to go.
The fundamental problem with this approach is that arguably the main reason to use a functional language in the first place is that you don't want to have to watch out for accidental side effects and accidental mutation.

But if your language has unrestricted mutable data structures, the desire not to worry about mutation outweighs their benefits, so they will typically be used only as a last resort, mostly in really simple scenarios where it is easy to manually verify that no mutation leaks outside some interface boundary (a function, a module, etc.).

For example, let's say you're iterating over some structure and collecting your results in a sequence. The most efficient data structure to use here would be a mutable dynamic array, and in an imperative language that's what pretty much everyone would use.

But if you asked an OCaml programmer, they would almost certainly use a linked list instead. This isn't because they're stupid or don't care about performance, but because the additional mental overhead of worrying about mutation outweighs the benefits of using a more efficient data structure.
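To make the contrast concrete, here is a small Haskell sketch of both styles (the names and the use of the vector package are my own). The list version is what most functional programmers would reach for; the array version fills a mutable buffer and freezes it, using the ST mechanism that Option 2 below discusses.

    import Control.Monad.ST (runST)
    import qualified Data.Vector as V
    import qualified Data.Vector.Mutable as MV

    -- The idiomatic functional version: build a (linked) list.
    collectList :: (Int -> Int) -> Int -> [Int]
    collectList f n = [f i | i <- [0 .. n - 1]]

    -- The "efficient" version: fill a mutable array, then freeze it.
    collectArray :: (Int -> Int) -> Int -> V.Vector Int
    collectArray f n = runST $ do
      buf <- MV.new n
      mapM_ (\i -> MV.write buf i (f i)) [0 .. n - 1]
      V.freeze buf

(When the final size is known up front, V.generate already packages this pattern; the interesting case is when it isn't and the buffer has to grow.)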

And that's a language failure! A good programming language shouldn't make you write worse code just because it's too limited to express the better alternative nicely.

Option 1.1: Give up but only in IO

If your language has some way of tracking side effects, you can give programmers full access to mutable data structures but only in a context where they can already use unrestricted side effects.

Regardless of which other option a language chooses, I think having some form of unrestricted mutation for the cases where you really do need the flexibility is important, and in my opinion, this is the best way to achieve that.

That said, because using mutation is now reflected in the function's type and infects all call sites, the problems from the previous section are massively exacerbated, so this option is only suitable as a last resort or for programs that already mostly run in IO.
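A minimal Haskell sketch of how the type gets infected (the function is hypothetical; assumes the vector package): as soon as the counter below uses a real mutable table, IO appears in its type, and every caller has to run in IO as well.

    import Control.Monad (forM_)
    import qualified Data.Vector.Mutable as MV

    -- Frequency counting with a mutable table, assuming inputs in [0, 255].
    -- The IO in the result type is now visible to, and required of,
    -- every call site.
    histogram :: [Int] -> IO (MV.IOVector Int)
    histogram xs = do
      counts <- MV.replicate 256 0
      forM_ xs $ \x -> MV.modify counts (+ 1) x
      pure counts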

Option 2: Locally sourced mutation only

One way to make mutation compatible with functional programming is to make sure that it is limited in scope and that no mutable state ever escapes a region. This is how Haskell's ST monad and Koka's st effect work. If no mutable state can leak, the mutation becomes an implementation detail, and from outside the region where mutation is allowed, the computation behaves exactly as if it were completely pure.
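For reference, a minimal sketch of the Haskell version: the extra type parameter of ST is what prevents the mutable reference from escaping runST, so from the outside sumST is an ordinary pure function.

    import Control.Monad.ST (runST)
    import Data.STRef (modifySTRef', newSTRef, readSTRef)

    -- The mutation is confined to the ST region; callers cannot tell
    -- that an STRef was ever involved.
    sumST :: [Int] -> Int
    sumST xs = runST $ do
      acc <- newSTRef 0
      mapM_ (\x -> modifySTRef' acc (+ x)) xs
      readSTRef acc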

If you know me, you might know that I really like this approach in principle.

But I almost never use it in practice.

Part of the reason might just be the Haskell implementation, which has very little dedicated language support (as opposed to something like Flix, which presumably does a lot more inference). But in any case, annotating lifetimes is tedious enough that if you have something mutable inside a lot of structure, it's often easier to write the module that uses it in IO and manually make sure that nothing leaks.
And if you only need mutation locally inside a function, using ST makes your code fundamentally more imperative in a way that really forces you to change your programming style.
This isn't great either and doesn't exactly help with readability, so the mental overhead is rarely worth it.

Option 3: You can only read this section once

Another way to use mutation without compromising on purity is to make sure you're never reusing a value after you have modified it. This is called linearity[1].
In this case, there is no way to observe whether a data structure is persistent or mutable because to find out you would have to look at it after you've modified it!

So now all your linear data structures can have APIs that look exactly like their persistent counterparts (aside from some linearity bookkeeping), but internally they will mutate the data structure in place.
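As an interface sketch, here is roughly what that looks like with Haskell's LinearTypes extension (signatures modelled on linear-base's Data.Array.Mutable.Linear; the bodies are elided and the names are mine):

    {-# LANGUAGE LinearTypes #-}

    data Array a     -- abstract; backed by a real mutable array internally
    data Ur a = Ur a -- an "unrestricted" result, as in linear-base

    -- The '%1' arrows mean each operation consumes the array, so no caller
    -- can ever look at the pre-modification version. That is what makes it
    -- safe for 'set' to overwrite the underlying memory in place.
    set :: Int -> a -> Array a %1 -> Array a
    set = undefined -- a real implementation writes in place

    get :: Int -> Array a %1 -> (Ur a, Array a)
    get = undefined -- reads the element and hands the array back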

In theory, this sounds amazing, and it is probably the best option presented here (though currently also the least well-supported in practice), but it still has problems.

One issue is that it's not always easy to write linear code. For example, while needing exclusive access to write to a linear data structure is quite natural, you'll typically want to be able to share it for reading. There are nice solutions to this, such as Rust's shared XOR mutable references[2], but once you go down that path, you add some complexity and lose the ability to write code as if your data structures were fully persistent.

But what's much more problematic in my opinion is that linearity is extremely infectious.

You can add linearity as an opt-in feature (like in current Haskell) such that you only need to worry about it if you need it. Unfortunately, this makes it impossible to use any standard functions like map on linear values and either makes linearity nearly useless or inevitably creates a parallel, incomplete universe of functions that also work on linear values.

Alternatively, you can make linearity the default and give every standard function a type that respects linearity. Now linearity works great and is really easy to use! However, everyone needs to worry about linearity all the time, since e.g. changing a function from using an argument linearly to non-linearly is a breaking change. This also means that, to be compatible with the rest of the ecosystem, every polymorphic function needs to be written as if its arguments were linear.
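The difference is visible directly in the types. With opt-in linearity, Prelude's map :: (a -> b) -> [a] -> [b] simply cannot be applied to a list you only hold linearly, so you need a second map; with linearity by default, that second version is the only one, and everyone has to write against it. A sketch (essentially the definition linear-base uses):

    {-# LANGUAGE LinearTypes #-}

    -- A map that respects linearity: the list is consumed exactly once,
    -- element by element. Relaxing the '%1' to a plain arrow later would
    -- be a breaking change for callers that rely on linearity.
    linearMap :: (a %1 -> b) -> [a] %1 -> [b]
    linearMap _ []       = []
    linearMap f (x : xs) = f x : linearMap f xs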

The same problem occurs with tracked effects although I think it's a lot easier to justify there since effects are something you'll want to think about in a functional language anyway.

If you go down this path you might even run into this issue with linearity and effects at the same time.

Option 4: Functional farming

After the last section, you might have had an idea: if explicitly tracking linearity is difficult, but most programs that would benefit from it are effectively linear already, why don't we implicitly track linearity at runtime and just perform a copy whenever there is more than one reference?
Well, that's exactly what Koka's Functional But In-Place (FBIP) and Swift's Copy-on-Write (CoW) data structures do!

The advantage of this is that it's pretty much invisible from the surface. This time, your program actually looks exactly as if it were just using persistent data structures, so you won't run into any of the virality issues you have with linearity.
Koka can even infer a lot of this for you and perform in-place updates on seemingly immutable data structures.

This might sound a little dangerous, since accidentally holding on to a reference could turn a linear-time algorithm quadratic, but it's actually a lot less problematic than it sounds!
If there are two references to a data structure and one is modified, it will create its own unique copy and let go of the previous one, so any further modification to either reference can be performed in-place again.
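A toy model of that uniqueness check in Haskell (all names are mine, and a real implementation such as Koka's or Swift's consults the runtime's actual reference counts rather than an explicit, non-atomic field like this):

    import Data.IORef
    import qualified Data.Vector.Mutable as MV

    -- A copy-on-write cell: an explicit owner count plus a mutable buffer.
    data Cow a = Cow (IORef Int) (MV.IOVector a)

    write :: Cow a -> Int -> a -> IO (Cow a)
    write (Cow owners buf) i x = do
      n <- readIORef owners
      if n == 1
        then do
          MV.write buf i x -- we are unique: mutate in place
          pure (Cow owners buf)
        else do
          modifyIORef' owners (subtract 1) -- drop our share of the old buffer
          buf' <- MV.clone buf             -- shared: split off a private copy
          MV.write buf' i x
          owners' <- newIORef 1
          pure (Cow owners' buf')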

Unfortunately, this approach has a different, much more fundamental flaw that makes it unsuitable for most programming languages: it relies on reference counting.

If you want to be able to tell that a data structure is used more than once at runtime, you need some way to keep track of references. A tracing garbage collector just doesn't give you this sort of information.

There are some nice efforts to make reference counting more competitive for this kind of workload, but at least for now, even these struggle to keep up with tracing garbage collectors, even when factoring in automatic reuse analysis.

Reference counting might work for some languages, especially ones that don't care much about parallelism, but it's just not a realistic default.

So, what do we do about this?

Honestly, I don't know.

Linearity is the most promising option out of the ones presented, but unless we find an answer to the virality problem, I doubt we will ever see it in more than a few semi-popular functional programming languages.

I think there is room for a fundamentally new approach here; I'm just not sure what it would look like.

Alternatively, some way to integrate regions more nicely with regular functional programming would probably help a ton.

A capable solution should also be compatible with existing state management solutions. It shouldn't be easier to use a State monad/effect over a persistent data structure than to use a proper mutable data structure.
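To make that concrete: in Haskell today, the path of least resistance looks like the sketch below (assuming mtl and containers), a State effect over a persistent Map where every update rebuilds an O(log n) path instead of touching a real mutable table. A good solution should make the genuinely mutable version at least this convenient.

    import Control.Monad.State.Strict (State, modify')
    import qualified Data.Map.Strict as M

    -- Counting occurrences via State over a persistent Map: easy to write
    -- and compose, but each update copies a path through the tree.
    bump :: String -> State (M.Map String Int) ()
    bump k = modify' (M.insertWith (+) k 1)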


  [1] There is technically a difference between linear types, which make you use every value exactly once, and affine types, which make you use values at most once (like in Rust), but the distinction doesn't really matter for our purposes.

  [2] This slogan has always bothered me a little because it's not actually XOR: you can have values without any references to them! It's really just "not (shared and mutable)", but I guess that doesn't roll off the tongue as nicely.



in reply to @prophet's post:

it feels like the artificial separation between the camps (functional vs not) means there's so much work that's just not being done on things (like this) that could be, because the hard borders have been drawn in ways where often people are not mutually intelligible until they've done a lot of both :/

I blame Haskell and the sudden fixation on absolute purity that manifested just as the VC-startup set decided it was the next secret sauce, to the point of literally redefining what "functional programming" meant in the first place.

I think that fixation has forced a lot of focus on solving for "how do we make Haskell do real work" instead of "how do we make programming in general more predictable and functional", and so the latter fight got lost to "well we bolted lambdas into Java somehow, so it's functional now".

I am genuinely bitter about this, and I don't care to hide it.

most of what people attribute to functional programming seems to be like, "good coding style warped through the lens of Haskell" but everyone gets distracted by the Haskell part and keeps inventing epicycles instead of adapting it to any given language, to be honest.

Clojure deals with this in a couple of different ways.

  1. The persistent data structures are all versions of hash array mapped tries, so they get copy-on-write semantics by default (structural sharing) with the JVM's best-in-class garbage collector.
  2. There's Transients, "mutable" versions of the built-in persistent data structures (and a mechanism for defining one's own). The transient versions require you to use a "bang" version of the modification functions (conj vs conj!, etc.), which means it's very explicit and the transient version can't be leaked. The transient versions do modifications in-place without any copy-on-write behavior, making them significantly faster.
  3. You can always drop down to the host language. If you want a mutable hashmap, just use one directly. You can convert to a PersistentHashMap once you're done, or don't and continue to use the seamless interop code.

I think transients are really nice.

  1. This is the flip side of the "Option 1: Give Up" coin, where you sacrifice space (and probably time) by not using mutability at all
  2. Transients are "Option 2: Locally sourced mutation only"
  3. This is "Option 1: Give Up" :)

This reads like you're trying to refute something I've said, but I'm not sure what.

edit: I have other things to say, but I want to double-check your intent before I say them so we don't misunderstand each other.

Personally I read "handles this" as you saying Clojure solves this problem, but then listing 2 things tantamount to giving up on the problem (not solving it), which is what the reply is trying to refute. The reply to point 2 is a helpful clarification

There's no problem to be solved, it's an open discussion of trade-offs. All of the options described by prophet have trade-offs, none of them are silver bullets. I felt it relevant to share how Clojure deals with the problem of "changing immutable data is slow, allowing mutation is risky" as a further point of discussion.

Thanks for giving a possible read, that's helpful.

How analogous is this to borrow checking? And if it is similar, is there anything to learn or borrow from Rust? (From a distance it seems very similar, but I haven't yet properly learned/used Rust so idk if it's obviously different, literally the same, or anywhere in between.)

One of the things I notice in a lot of these discussions is that the set of algorithms that require mutation seems to be pretty limited, so I'm wondering:

Would it be possible to provide a pure interface to those algorithms? Because if you could, then you wouldn't need language support for general mutability; instead, you'd just need language support for those algorithms.

For example, unification is one of those algorithms commonly cited as requiring mutation, but if you have a logic programming language (or functional logic programming language) then you don't need to implement unification because it's built into the language via the language's native support for unification.
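For illustration, here is a minimal pure interface to union-find in Haskell (my own sketch, with no ranks or path compression, so it lacks the near-constant amortised complexity that the mutable version is famous for):

    import qualified Data.IntMap.Strict as IM

    -- A persistent union-find: parent pointers stored in an IntMap.
    newtype UF = UF (IM.IntMap Int)

    empty :: UF
    empty = UF IM.empty

    -- Follow parent pointers up to the representative.
    find :: UF -> Int -> Int
    find uf@(UF parents) x =
      case IM.lookup x parents of
        Nothing -> x
        Just p  -> find uf p

    -- Merge the classes of a and b by pointing one root at the other.
    union :: UF -> Int -> Int -> UF
    union uf@(UF parents) a b
      | ra == rb  = uf
      | otherwise = UF (IM.insert ra rb parents)
      where
        ra = find uf a
        rb = find uf b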

a lot of this goes over my head tbh, but the most obvious solution to me appears to be: if you need everything to be pure, then why don't you make mutation itself pure? that is, the evolution of state over time is fixed in a timeless context