I was working on getting Facebook's Sapling source control management system (Mercurial, but half rewritten in Rust and with the repo data structures switched out) to build for macOS[1] and ran into link errors of the kind that are resolved by using the correct linker options for building Python extensions.
[1]: I don't like Facebook, but they released something which might actually work on large repos, which is good because Git loves its O(reposize) operations.
One small absurdity with this: the developers of this software support macOS; it should already work, so I am undoubtedly holding it wrong. Indeed, grepping for the relevant flags, -Wl,-undefined,dynamic_lookup, shows they are present in the repository, so something is broken in the Nix build.
Time to get our hands dirty.
Nix allows you to build things manually inside a nix-shell but it's varying levels of broken for any given derivation because it is not perfectly the same environment as an automated build.
I've written a blog post with more details of how to do this.
So, nix-shell -A sapling, whatever.
I got some error about doing something in /tmp when it should have used /tmp/sapling so I looked in the environment. Looks like something set NIX_BUILD_TOP and TMPDIR to /tmp.
Fixed it:
export NIX_BUILD_TOP=/tmp/sapling
export TMPDIR=/tmp/sapling
Then I can do genericBuild and get it to fail.
My link issue reproduces with python setup.py build, and indeed even cargo build, thankfully, so now I'm only two build systems deep in hell.
I built the problem crate with -v and observed that the rustc invocation was wrong. Cool. So why are there no flags?
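That check can be scripted rather than eyeballed. The following is a minimal sketch, not what was actually run here: it reads `cargo build -v` output on stdin and prints the rustc invocations that do not mention a given string. The function name `rustc_lines_missing` is made up for illustration, and it assumes Cargo's usual verbose format, where each compiler call appears on a line containing Running `...rustc ...`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every "Running `... rustc ...`" line from
# `cargo build -v` output (read on stdin) that does NOT contain the given
# substring. Lines for build scripts and other tools are ignored.
rustc_lines_missing() {
  awk -v flag="$1" '
    /Running `.*rustc / && index($0, flag) == 0 { print }
  '
}
```

Usage would look like `cargo build -v 2>&1 | rustc_lines_missing dynamic_lookup`. Crates that Cargo considers fresh are not recompiled and so produce no Running line, which means a clean build gives the most complete picture.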
I grepped the sources for the right linker flags and observed that there's a thing that creates a .cargo/config with the correct ones, but why is rustc not getting them?
I confirmed that .cargo/config is actually there, but it seems that the rustflags set in it are not being respected. Why?
I Googled "cargo print config" then read the docs:
Aah, there is a command, cargo config, but they left it unstable, like many other debugging commands in the Rust toolchain. Unfortunate. Of course, as these things go, stuff is broken in the middle of three separate build systems in such a way that it would be wildly impractical to swap the toolchain out for an unstable version.
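Part of the question can be answered on a stable toolchain. Cargo reads config files from the current directory and from every parent directory, then merges them, so it helps to know which files are in play at all. Here is a rough sketch that lists them, nearest first; `list_cargo_configs` is a hypothetical name, and the sketch ignores `$CARGO_HOME` and environment variables, which Cargo also consults.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list every Cargo config file visible from a starting
# directory (default: the current one), walking up to the filesystem root.
# Both the legacy name (.cargo/config) and .cargo/config.toml are checked.
# Not covered: $CARGO_HOME and CARGO_* environment overrides.
list_cargo_configs() {
  local dir name
  dir=$(cd "${1:-.}" && pwd -P) || return 1
  while :; do
    for name in config config.toml; do
      if [ -f "$dir/.cargo/$name" ]; then
        printf '%s\n' "$dir/.cargo/$name"
      fi
    done
    [ "$dir" = "/" ] && break
    dir=$(dirname "$dir")
  done
}
```

Run as `list_cargo_configs` from the crate directory. This only shows which files exist, not how their values combine, which is the part that needs the real command.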
Since the stability policy is inconvenient, let's annoy every rustc developer I know by turning that stuff off with the not-so-secret secret environment variable:
[nix-shell:/tmp/sapling/source/eden/scm]$ RUSTC_BOOTSTRAP=1 cargo config -Z unstable-options get --show-origin
host.linker = "/nix/store/v069wkyb6y6qw2bhpzag5kka19j7clgg-clang-wrapper-11.1.0/bin/cc" # /private/tmp/sapling/.cargo/config
host.rustflags = [
"-C", # /private/tmp/sapling/.cargo/config
"target-feature=-crt-static", # /private/tmp/sapling/.cargo/config
]
# <SNIP>
target.aarch64-apple-darwin.linker = "/nix/store/v069wkyb6y6qw2bhpzag5kka19j7clgg-clang-wrapper-11.1.0/bin/cc" # /private/tmp/sapling/.cargo/config
target.aarch64-apple-darwin.rustflags = [
"-C", # /private/tmp/sapling/source/eden/scm/.cargo/config
"link-args=-Wl,-undefined,dynamic_lookup", # /private/tmp/sapling/source/eden/scm/.cargo/config
"-C", # /private/tmp/sapling/.cargo/config
"target-feature=-crt-static", # /private/tmp/sapling/.cargo/config
]
# <SNIP>
unstable.host-config = true # /private/tmp/sapling/.cargo/config
unstable.target-applies-to-host = true # /private/tmp/sapling/.cargo/config
# The following environment variables may affect the loaded values.
# CARGO_HTTP_CAINFO=/nix/store/kqivfw6rxa47264yiidxlg8sa347ysf1-nss-cacert-3.83/etc/ssl/certs/ca-bundle.crt
Looks like the Nix Rust builder is setting some configs that look suspiciously relevant.
Move that file out of the way and delete those weird unstable lines, maybe?
$ cp /tmp/sapling/.cargo/config{,.bak}
$ $VISUAL /tmp/sapling/.cargo/config
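For a repeatable version of that hand edit, here is a minimal sketch that drops one top-level table from a TOML file. It assumes the unstable settings sit under their own `[unstable]` header, which the dotted keys in the output suggest but which is a guess about the file layout. `drop_toml_table` is a made-up helper, not a real TOML parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a TOML file with one top-level table removed,
# from its "[name]" header line up to (not including) the next header.
# Limitations: the header must match exactly (no trailing comment or spaces),
# and any line starting with "[" is treated as a table header.
drop_toml_table() {
  local table=$1 file=$2
  awk -v want="[$table]" '
    /^\[/ { skip = ($0 == want) }   # entering a new table: skip it if it matches
    !skip                           # print lines outside the skipped table
  ' "$file"
}
```

With the backup from the cp above, that would be `drop_toml_table unstable /tmp/sapling/.cargo/config.bak > /tmp/sapling/.cargo/config`.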
... and it builds. As they say, "well there's your problem"! Time for grep.
Looks like it's set from pkgs/build-support/rust/hooks/default.nix and pkgs/build-support/rust/hooks/cargo-setup-hook.sh in particular.
How can we fix it? Let's look at it in a repl a bit:
$ nix repl
Welcome to Nix 2.11.0. Type :? for help.
nix-repl> n = import ./. {}
nix-repl> n.rustPlatform.cargoSetupHook.cargoConfig
"[host]\n\"linker\" = \"/nix/store/dq0xwmsk1g0i2ayg6pb7y87na2knzylh-gcc-wrapper-11.3.0
<snip>
nix-repl> n.rustPlatform.cargoSetupHook.override<TAB>
n.rustPlatform.cargoSetupHook.override
n.rustPlatform.cargoSetupHook.overrideAttrs
n.rustPlatform.cargoSetupHook.overrideDerivation
Hm. overrideAttrs generally can just replace anything in the attributes of the derivation, so probably that could also include the cargo config stuff. Let's see:
nix-repl> (n.rustPlatform.cargoSetupHook.overrideAttrs (_: { cargoConfig = "nya kitten brain nya nya"; })).cargoConfig
"nya kitten brain nya nya"
Seems about right. So we can just make our own Cargo setup hook that deletes the unhelpful parts of that config template (or even the whole thing; it's not load bearing).
Postscript
This change was submitted to nixpkgs and merged. It turns out there was more fun debugging, because Sapling was starting a background daemon, which was invisibly segfaulting. I found out, thanks to a coworker suggesting it, that it works fine on Python 3.8 rather than 3.10, and 3.8 is what the official binary distribution uses, so I switched it out and it all started working properly.