• she/her

Principal engineer at Mercury. I've authored the Dhall configuration language, the Haskell for all blog, and countless packages and keynote presentations.

I'm a midwife to the hidden beauty in everything.

💖 @wiredaemon



Today was an extremely productive day at work because I got incremental Haskell builds working using Nix/Nixpkgs. I still need to polish up what I have a little bit, but this is pretty close to completion.

The background for this is that there were two main approaches we were considering at work:

  1. Nixpkgs support for incremental builds
  2. ghc-nix

I initially tried the second approach (ghc-nix) since it seemed promising and generalized better, which I covered in previous posts.

However, the performance bottleneck ended up being a real problem, so I switched to approach 1 (Nixpkgs support) and I got that working today.

This consisted of two branches, the first of which is a branch that adds the Nixpkgs support for incremental builds:

https://github.com/NixOS/nixpkgs/compare/master...MercuryTechnologies:nixpkgs:gabriella/incremental

This one was fairly easy: I just cleaned up what Harry Garrood and @leftpaddotpy had already implemented.

If you haven't read the post I mentioned above, the basic way it works is that you create two builds of your Haskell package:

  • An older full build
  • A newer incremental build (that uses the build products from the older build as a starting point)

The change to Nixpkgs adds a new .dist output that packages the older build's dist/build directory. The newer build uses that as a starting point, so it only has to build what changed since the older build.

The idea is that the older build is updated infrequently, but often enough that the "diff" between the old and new builds doesn't grow too large.

However, there is a huge gaping hole in this user experience: there isn't a great way to automatically specify what the older build should be. For example, suppose that you just pin the older build to a specific revision: the "diff" between the old and new build will eventually grow so large that you no longer benefit from the old build products, and the incremental build ends up costing about as much as a full build.

You could add some out-of-band automation to automatically update the reference to the old build, which is what this blog post attempted to do. The idea is that you can add the old build as a Nix flake input and then use Nix's support for updating/re-locking flake inputs to periodically bump the older build.

However, this was not satisfactory for me because I'm not a fan of out-of-band automation (especially when it comes to CI); I like to push as much logic into Nix as possible.

The user experience I actually wanted was something like this:

pkgs.haskell.lib.incremental
  { duration = 7 * 24 * 60 * 60; }
  pkgs.haskellPackages.foo

… which would do a full build of the foo package once a week and then incremental rebuilds after that point relative to the last full build.

So the idea I had for implementing that was to do something like this:

  • Assume the existing src input for the package is a git repository (and fail otherwise)
  • Replace it with a snapshot of the same repository at an earlier point in time, rounded down to a fixed time interval (e.g. a weekly or daily boundary)
  • Use the latest snapshot for the full build
  • Use that as the input to the incremental build
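
As a rough sketch of the "truncated to a certain time interval" step, the arithmetic could look like the following. The `truncateTo` helper and the example timestamp are hypothetical and not part of Nixpkgs or either branch; this only illustrates the rounding, not how the snapshot is actually fetched.

```nix
let
  # One week in seconds, matching the `duration` in the example above.
  duration = 7 * 24 * 60 * 60;

  # Hypothetical helper (not part of Nixpkgs): round a Unix timestamp down to
  # the nearest multiple of `interval`. Nix integer division truncates, so
  # dividing and then multiplying discards the remainder.
  truncateTo = interval: timestamp: (timestamp / interval) * interval;
in
{
  # 1700000000 is an arbitrary example timestamp.
  # Evaluates to 1699488000, the start of the enclosing one-week window.
  snapshotTime = truncateTo duration 1700000000;
}
```

Every timestamp inside the same window maps to the same value, so the source for the full build would only change once per interval. Note that these windows are counted from the Unix epoch, so a weekly boundary computed this way falls on Thursday at 00:00 UTC.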

However, this is difficult to do using Nix/Nixpkgs in their present state. Specifically, the second step (replace a git repository with an earlier snapshot) is technically possible but requires doing a whole bunch of undesirable stuff (like disabling the sandbox and using import-from-derivation) and even when it "works" it is still brittle. Basically, it would be extremely unlikely that Nixpkgs would accept a PR for the evil things that this would entail.

However, there is a simpler and more principled solution to this, which generalizes better: extend builtins.fetchGit to support an optional date argument that accepts anything that git accepts (e.g. 1 week ago, 2000-01-01, or a unix timestamp). If you have that then it becomes much easier to replace a git source with an earlier snapshot, plus you make use of Nix's native support for locking and caching git fetches, so it's more efficient.
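
A call to such an extended function might look roughly like this. The `date` attribute name and the accepted strings come from the description above; the URL is a placeholder, and the exact semantics are an assumption since stock Nix does not accept this attribute.

```nix
# Assumes a Nix built with a `date` extension to builtins.fetchGit;
# stock Nix does not accept this attribute.
builtins.fetchGit {
  # Placeholder URL for illustration only.
  url = "https://example.com/foo.git";

  # Anything git accepts as a date, per the description above
  # (e.g. "1 week ago", "2000-01-01", or a Unix timestamp).
  date = "1 week ago";
}
```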

That's what I did for my second branch, which extends the builtins.fetchGit utility:

https://github.com/NixOS/nix/compare/master...Gabriella439:nix:gabriella/fetchGit

… and when you combine those two branches then everything just works[1] and any Haskell package that uses a git source automatically does a full rebuild every interval with incremental builds in between.

Not only that, but the new builtins.fetchGit functionality could conceivably be used to power the same feature for other languages, too, so this might pave the way for incremental builds for package managers that are Nix agnostic.


  1. You also have to use GHC 9.4 or newer for reasons covered in the original blog post.



lexi
@lexi

so discord partnered with a new payment company which allows people in germany to pay with a bunch of local payment providers instead of the usual "credit card or you're fucked". and this is not an ad, this might even lose discord a bunch of money. stick with me for a bit. to advertise that, this little guy shows up in your friends menu and the buy nitro screen:

nothing special, right? there's just a slight issue. try to spot it if you want to!



As much as I shit on the Twitter takeover, I actually haven't experienced any service degradation (nor was I really expecting to; they're excellent engineers despite what Elon Musk might say).

The only service degradations I've ever experienced were long-running issues that existed before the takeover. Specifically, the two most common issues I'd notice were:

  • "psych!" notifications

    The notifications tab would briefly display a number to indicate new notifications, but then the number would disappear. I'd click over to the tab to see if there were new notifications or not and there were none. Then after some delay (~10-20 seconds) the number would reappear along with the matching notifications.

    I assume that this was due to some sort of eventually consistent architecture on their end, but it would have been a better UX if they would not tease me with the initial brief notification that disappeared and just wait until the notifications were actually ready.

  • Completely missing replies

    If one of my tweets or threads blew up there would sometimes be replies that would just not show up at all in my notifications tab (neither the "All" nor the "Mentions" tab). I even made sure to disable notification filtering and they were still definitely missing. The way I would find out is by manually browsing the replies to the thread and seeing people who were definitely trying to reply to me (i.e. they had not unsubscribed me from the reply).

    I assume that this was some sort of load shedding on Twitter's part on high traffic discussions.