artemis
@artemis

so imagine a compiler that first

  • traverses the whole tree to collect exported symbols
  • traverses the tree from the root to identify what functions actually get used
  • then gets into the real compiling/type checking/etc. but only spends time compiling the code that will get used in the final executable

does this exist? has anyone done this? is anyone doing this now and I just don't know?
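
to make that concrete, the shape I'm imagining (a hand-wavy Rust-ish sketch with made-up names, not any real compiler):

    // pass 1 indexes every declaration (exports would seed the roots for a
    // library; for a binary the only root is `main`), pass 2 walks the call
    // graph outward from the roots, and only the reachable set ever gets
    // handed to the expensive type-check/codegen step
    use std::collections::{HashMap, HashSet, VecDeque};

    struct FnDecl {
        name: &'static str,
        exported: bool,
        calls: Vec<&'static str>,
    }

    fn main() {
        let tree = vec![
            FnDecl { name: "main", exported: true, calls: vec!["helper"] },
            FnDecl { name: "helper", exported: false, calls: vec![] },
            FnDecl { name: "unused", exported: false, calls: vec!["helper"] },
        ];

        // pass 1: collect everything we parsed
        let by_name: HashMap<_, _> = tree.iter().map(|f| (f.name, f)).collect();

        // pass 2: find what is actually reachable from the roots
        let mut used: HashSet<&str> = HashSet::new();
        let mut queue: VecDeque<&str> =
            tree.iter().filter(|f| f.exported).map(|f| f.name).collect();
        while let Some(name) = queue.pop_front() {
            if used.insert(name) {
                if let Some(decl) = by_name.get(name) {
                    queue.extend(decl.calls.iter().copied());
                }
            }
        }

        // pass 3 (the expensive part) only ever looks at `used`,
        // so `unused` never gets type-checked or compiled
        for f in tree.iter().filter(|f| used.contains(f.name)) {
            println!("would type-check and codegen: {}", f.name);
        }
    }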


in reply to @artemis's post:

i can think of 2 systems that kind of do this.

the android platform build system traverses the whole tree and collects all the modules, then only builds the modules required by the platform spec (and their dependencies, recursively).

the rust compiler is entirely "demand driven," where each stage past parsing (typeck, borrowck, codegen, etc.) lazily requests the results of the analysis it depends on. it turns out people like to have code that isn't directly used go through typeck/borrowck/etc. though so in practice this doesn't save much time in the rust compiler itself. this does let the compiler skip doing codegen for stuff it doesn't need when building the final binary though which helps.
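
the rough shape of the demand-driven bit, as a toy sketch (hand-rolled memoization, not rustc's actual query machinery):

    use std::collections::HashMap;

    #[derive(Default)]
    struct Compiler {
        // each "query" caches its result the first time something asks for it
        type_check_cache: HashMap<String, Result<(), String>>,
    }

    impl Compiler {
        // on-demand, memoized type check of a single item
        fn type_check(&mut self, item: &str) -> Result<(), String> {
            if let Some(cached) = self.type_check_cache.get(item) {
                return cached.clone();
            }
            println!("type-checking {item}"); // happens at most once per item
            let result: Result<(), String> = Ok(());
            self.type_check_cache.insert(item.to_string(), result.clone());
            result
        }

        // codegen asks for type checking only for the items it actually emits
        fn codegen(&mut self, item: &str) {
            if self.type_check(item).is_ok() {
                println!("emitting code for {item}");
            }
        }
    }

    fn main() {
        let mut c = Compiler::default();
        c.codegen("main");
        c.codegen("main"); // second call hits the cache, nothing recomputed
    }

(like i said, real rustc still ends up type-checking everything because people want errors in code they aren't calling yet; it's mainly codegen that gets to stay lazy.)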

i don't know for sure but i imagine most other modern compilers work this way too (i.e. they don't codegen functions that aren't called somewhere in the IR)? though the older the compiler, the more likely it is that it doesn't, since this sort of design can be a memory hog; think haskell space leaks from insufficiently strict code, if you're familiar with that.

deep integration of the compiler and build system would probably let them be more efficient than this, but then you kinda get into weirdness where your "translation unit" (i.e. smallest separable bit of compilation work) is basically your entire dependency tree, which is hard if you 1) don't have the source to all the modules or 2) want to parallelize across multiple machines. both solvable problems but the requisite work starts to look like reinventing kubernetes or whatever.

Nix is sort of like this. It only interprets code that it actually needs, due to laziness

In other words, if you do something like:

if true then 1 else import ./foo.nix

… then ./foo.nix never gets read or evaluated

(sorry if any of this comes across as condescending, I'm not sure what your knowledge level is on the topic) Link-time optimization does something like this, but only after all the code has been compiled. You can then have a look at the complete program, and drop everything from it that isn't either re-exported or used.

The other thing is that the compiler doesn't normally look at dependencies at all, since they've already been compiled. Almost all of the time when you're using dependencies or writing a program, you're creating a dynamically linked executable. Dependencies are imported as shared libraries, which means they've already been compiled in a format that you can load. During compilation, a linker looks at the dependencies you're loading, and fixes up addresses for symbols. You're not paying a compilation cost for unused symbols in the shared libraries, and the dynamic linker takes care of loading the relevant dependency when you execute your program.
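
For a concrete picture of the dynamic case, here's roughly what it looks like from the program's side (assuming a typical Linux setup where libm is already present as a shared library):

    // sqrt lives in libm, which someone else compiled long ago. the compiler
    // only sees this declaration; the linker records the dependency, and the
    // dynamic linker resolves the symbol's address when the program is loaded.
    #[link(name = "m")]
    extern "C" {
        fn sqrt(x: f64) -> f64;
    }

    fn main() {
        // unsafe because the compiler is trusting our declared signature
        let root = unsafe { sqrt(2.0) };
        println!("{root}");
    }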

When you're writing a statically linked executable, library code is copied into your executable during linking (so, "static" linking). If you've enabled LTO, you only get the code that you're actually using. However, the code you're copying from the dependency has already been compiled, so you're not paying a compilation cost for dependencies in this case either.
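
(In Rust terms, for example, that's just a build setting; with gcc or clang it's the -flto flag. A sketch of the relevant Cargo.toml bit, assuming an otherwise standard cargo project:)

    # Cargo.toml: enable link-time optimization for release builds, so
    # dependency code the final binary never reaches gets dropped at link time
    [profile.release]
    lto = true        # "thin" is a cheaper variant of "fat"/true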

Long story short, in most cases the compiler only spends time compiling your code. Dependencies are either included in binary form by the linker in the case of a static executable (where only the stuff you're actually using is included), or all required symbols are fixed up by the linker so that the dynamic linker can make them available from a common shared library when the program is executed.

This is only for boring old compiled programs; I don't know exactly how languages like Java or Python handle this.

(P.S.: And it doesn't make sense to optimize shared libraries by dropping unused symbols, because any new program might use the symbols you've dropped. Some system like Nix could probably look at all the programs you've installed, and collect a set of symbols needed for the shared library, and only include those, but that seems like a lot of effort. Also, it's very hard to cache, as there'd suddenly be 2^(number of symbols in library) possible versions of each library depending on what symbols you need.)

yeah I know all of that, but that's a very C/C++ view of the world (though I will say, good summary for anyone who doesn't know all of that!)

For something like rust, you're always compiling your entire dependency tree when you compile a project for the first time, and those compilations aren't shared between projects. similar story for haskell, though haskell can share a bit if all your projects are using stack as the build tool, and Stackage (and the same version of it).

I know that for Haskell, some distros (arch, gentoo, maybe NixOS? idk I don't use NixOS) split the dependencies up into shared libraries and separate packages which is pretty neat and helps with compilation times, but it's not normally the world you're living in when you're doing development.

arkflash above says that rust actually does use a demand-driven system for codegen, but idk what level that operates at, like if it still generates all the IR for everything, does optimizations, or what. because rust definitely does start compiling things at the leaves first and takes a while before it gets to the final compilation units, and most of the time I'm pretty sure like 90% of that code is things my end program never actually uses.

which, solving this is a hard problem. but also, compilation times suck

though I do wish that cargo would make it easy to share dependencies between packages because then you could actually do distributed rust builds at all, by distributing individual crates. right now you basically can't distribute builds without some extremely sus hacks

arkflash above says that rust actually does use a demand-driven system for codegen, but idk what level that operates at, like if it still generates all the IR for everything, does optimizations, or what.

it is fully demand driven, but it operates at the level of rust's translation unit - crates. so your deps get fully compiled to machine code because each crate's rustc invocation doesn't know what parts of the public interface of the crate are going to be used in the final binary.

the way the web-sys crate uses features is maybe closer to your initial idea? each class in the entire web API surface is a feature flag, and since cargo unions the feature flags across the whole build tree it ends up recompiling web-sys with only the types required by your code and your dependencies.
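
the mechanism is just cargo features plus cfg; a made-up mini version (not web-sys's real source):

    // lib.rs of a hypothetical crate. its Cargo.toml would declare
    //   [features]
    //   window = []
    //   document = []
    // and a dependent opts in with `features = ["window"]`. anything gated
    // behind a feature nobody in the build graph asked for never gets compiled.
    #[cfg(feature = "window")]
    pub struct Window;

    #[cfg(feature = "window")]
    impl Window {
        pub fn alert(&self, msg: &str) {
            println!("alert: {msg}");
        }
    }

    #[cfg(feature = "document")]
    pub struct Document;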

another thing to think about is maybe the lisp/smalltalk idea of a live set of objects that get manipulated over time as you continue to enter code into a REPL. you could imagine a similar system that keeps the entire compiler state hot in memory as you edit code and does minimal codegen to make incremental builds fast... at the cost of pretty enormous complexity in the compiler, trying to get all the invalidation correct without introducing too many difficult-to-diagnose bugs.

Completely different technology, but Clerk, a Clojure notebook interface, does something like this: it basically builds a dependency tree of variables, and only re-evaluates what's necessary whenever the notebook is edited and re-evaluated.
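
The general shape of that (a generic sketch, not Clerk's actual implementation): track which definitions read which others, and when one is edited, re-evaluate only it and everything downstream of it.

    use std::collections::{HashMap, HashSet, VecDeque};

    // `dependents` maps each name to the names that read it
    fn dirty_after_edit(edited: &str, dependents: &HashMap<&str, Vec<&str>>) -> HashSet<String> {
        let mut dirty = HashSet::new();
        let mut queue = VecDeque::from([edited]);
        while let Some(name) = queue.pop_front() {
            if dirty.insert(name.to_string()) {
                if let Some(ds) = dependents.get(name) {
                    queue.extend(ds.iter().copied());
                }
            }
        }
        dirty
    }

    fn main() {
        // `plot` and `summary` both read `data`; `title` depends on nothing
        let dependents = HashMap::from([
            ("data", vec!["plot", "summary"]),
            ("title", vec![]),
        ]);
        // editing `data` re-runs data, plot and summary; `title` is untouched
        println!("{:?}", dirty_after_edit("data", &dependents));
    }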

I'm pretty sure the general description given above also applies to Rust. Rust just does static linking by default, and you don't generally install your dependencies via the system's package manager, but via cargo. I think the feature flag approach below is probably the closest thing. Alternatively, just put everything into a single crate 😉

I've never gotten into the details of it, but I'm pretty sure some javascript build systems do this, such as esbuild. I think maybe Purescript has some kind of dead-code elimination plugin, too, but I'm not sure at what point in the pipeline it gets run (presumably before code emission!)

i think a lot of things run dead-code elimination towards the end of the build because it's just the easiest way to do it right, as basically an optimization step on the output. dead-code elimination on the input to future steps I think is a lot trickier, but that's what I'm talking about

Oh, interesting. I had assumed that doing DCE in the middle would be easiest since that's when we have the concept of a module. (I suppose if the language has an FFI then it gets sticky)

I've never used it, but I think PureScript Zephyr does what you're looking for. Quote,

zephyr reads corefn json representation[, ...] removes non transitive dependencies[, ...] and generates [...] corefn representation [...] to dce-output directory.

My (cursory!) understanding is that PureScript CoreFn is roughly a typed AST, so this may satisfy your requirements!