this may be a preamble to future posts. we'll see. the above is a software renderer i've been working on which targets very old systems, just for fun

tl;dr: if your language/compiler uses llvm, you are most likely not supporting 32 bit processors properly (if at all), and i don't blame you for that.

my boyfriend is really into using older hardware, and he's often frustrated by the decisions made by developers. at first i was skeptical; i'd understand the frustration if the hardware was like 8 years old and already out of support (as is the case with windows and macos), but he's going on about hardware that's 20+ years old. his daily driver is 16 years old, and even that is still firmly supported by linux and (mostly) supported by most distros. in the past i'd point that out to him: it's really impressive that his 16 year old hardware is so widely supported. but he kept coming back to his pentium 3 from 1999 and the problems that plague the linux ecosystem there. i didn't get it.

then i started using that same old hardware. i'm starting to get it a bit more.

i wanted to talk a bit about some of the things i discovered over this rather long journey. i'm sure a lot of it is well known, but who knows.


sse2

so the biggest (and sometimes only) issue seems to be the feature-set of processors, specifically sse2 support. sse2 is a set of extensions to x86 which gives you hardware SIMD (single instruction, multiple data): one instruction operating on several values at once. this means you get faster floats and faster math in general. it can also mean faster loops, which helps critical sections of code.

there's a lot of history around sse, simd, and the handling of floating point calculations over the years, but i don't want to go over that here. maybe someday.

sse2 is a common baseline feature-set for code, because all 64 bit x86 processors have sse2. as such, if you're compiling a 64 bit program, your compiler will automatically enable optimizations that take advantage of sse2. these can make your program run significantly faster, depending on what it's doing. even programs that aren't doing any floating point math still get performance benefits from sse2.
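you can actually watch the compiler make this assumption. dumping the predefined macros for a 64 bit build shows sse2 enabled with no flags at all (-dM and -E are real gcc flags, and clang accepts the same ones; the exact macro list varies by version):

echo | gcc -dM -E - | grep -i sse2
# on a 64 bit x86 system this prints #define __SSE2__ 1 (plus a related math macro or two)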

the problem arises from the muddy waters of 32 bit processors. don't get me wrong, there's an insane number of extensions to x86 for 64 bit processors too. but since 64 bit processors are what everyone is still using, people are more careful about which of those extensions they require, depending on the application. you often have to compile software with special flags to take advantage of the newer features, so that general builds stay compatible.

that level of care doesn't seem to extend to 32 bit processors, and it's tough to place all the blame on any one thing. the situation is genuinely complex, and multiple issues have been filed against compilers for not getting it right and refusing to fix it. so let's talk about that

compilers (llvm)

just a quick preface that i don't hate llvm or rust, and i understand they're faced with tough decisions about decades-old hardware and what's best for everyone now that the mistakes have already been made.

anyway, there are two big players: llvm and gcc. if you're compiling c++, rust, swift, zig, etc, you're almost certainly using llvm. gcc is still the go-to compiler for pure c, though clang (an llvm-based compiler) can create better binaries depending on the situation. other languages write their own compilers; we'll get to that later.

llvm is great; it gives you tons of optimizations and tons of supported architectures, as long as you write a frontend for your language (which is a task orders of magnitude simpler than writing the full compiler). the downside is that you also get all the baggage from llvm.

that baggage is a complex topic i won't pretend to fully understand, so hopefully my summary is at least somewhat correct. basically, people can't seem to agree on what a "target architecture" actually means once you get into 32 bit processors. some of the blame can be placed on intel and amd, who of course pulled whatever branding shenanigans they could to make money. but they also have a relatively clear delineation between processor "generations" (mostly), and llvm doesn't seem to have cared since... forever?

the point

now we get to the point, the thing that is just actually broken. there are many 32 bit processors, and they are broadly categorized by their architecture "generation": i386, i486, i586, and i686. each one adds new instructions that the previous generation doesn't have, making this all rather complicated to support. obviously, older compilers which lived through it all, such as gcc, handle the nuances "just fine". what does llvm do? it says they're all pentium 4.

what is a pentium 4? it's a processor released in late 2000 as the successor to the i686 generation. you might see how this is a problem: the target for essentially all 32 bit processors is one that was released late into the game. and it's not just that they set their target high; selecting i386, i486, i586, or i686 will all get you pentium 4, even though a pentium 4 is none of those.
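you don't have to take my word for it. clang's -### flag prints the internal compiler invocation without running anything, including the cpu it defaulted to (the flag is real; the exact output formatting may differ between clang versions):

clang --target=i686-linux-gnu -### -c -x c /dev/null 2>&1 | grep -o '"-target-cpu" "[^"]*"'
# prints: "-target-cpu" "pentium4"
# swap i686 for i386, i486, or i586 in the triple and you get the same answer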

so, how many 32 bit processors were released after the pentium 4? oh, about 7.

this is bad, and it's been this way for so long that even though it has been brought to people's attention, it's not something they can fix without breaking old builds that rely on it (though i'd argue that's a moot point, since there are so few processors where the stars align with this messed up system).

the "good news" is that you can send a bunch of flags to llvm to remedy this situation by selectively disabling features that weren't available in those architectures, but (a) that's not the default behavior and (b) it's up to you to get it all right for each one. llvm doesn't do it for you.

rust tried to do the right thing and set all the flags appropriately, so i386, i486, and i586 are "mostly correct". the problem is that their i686 target still enables sse2, and if you've been following along, sse2 is only supported by a very small percentage of all i686 processors.

how it manifests

if your language uses a compiler based on llvm and it supports 32 bit builds at all, chances are they're actually broken. zig, for instance, had an i386 mode which just relied on llvm to get it right. we now know llvm does not get it right; zig eventually figured that out and renamed the target to x86. thus, their so-called 32 bit build option only supports that small handful of 32 bit processors which have sse2, and most people who actually want a proper 32 bit build will get an "illegal instruction" error upon running the resulting zig binary. just an fyi

the worse problem is rust, as it's being adopted in more and more core utilities in the linux ecosystem. this is good, because in theory it should make things safer and easier to maintain. the problem is that, unless distros know about this whole debacle (and they don't; i wanted to link debian's 32 bit builds being broken, but i can't find the reports right now), they'll be building so-called 'i686' packages that SHOULD target a wide range of 32 bit platforms but which actually don't work on most i686 processors. confusingly, they'd need to pick i586 when compiling rust apps for their i686 builds in order to sidestep the problem (see the "refusing to fix it" link above the compilers section).

worse still, the philosophy of "just build it yourself" is nearly impossible on these older platforms now that rust is required. it's tough enough waiting on c code to compile, but rust often just crashes, if it's even supported at all. then, after waiting days and days, you try to run the result and just get "illegal instruction". oops, you were supposed to know that rustc is incorrectly configured for your target platform and cross-compile for i586 instead.
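for reference, the sidestep from a modern machine looks like this (i586-unknown-linux-gnu is a real tier 2 rust target; there's also a musl variant if that's what your old box runs):

rustup target add i586-unknown-linux-gnu
cargo build --release --target i586-unknown-linux-gnu

you still need a 32 bit linker and libc on the build machine, but at least the resulting binary won't silently assume sse2.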

does it matter?

i don't know. i see videos of people pulling old core i-series desktops out of the trash (in the US at least), and friends constantly tell me about perfectly usable laptops from the late 2000s being thrown away, free for the taking. this lines up with my own experience: i was able to walk in and pick up a perfectly good core 2 duo system for free, which i now use all the time. since it's a 64 bit system, none of the problems i explained earlier plague it, and i've had no issues installing any manner of pre-compiled binaries. i even ran minecraft on it!

so it seems like, if you can't afford a pc, even the free trashed stuff will be far newer than all this old junk that modern compilers (llvm) clearly don't support, and you won't have a problem (if you're able to set up linux, that is).

but not everyone lives in the united states. not every country is so rich and wasteful that they can just throw away perfectly working technology. i live in the US, so i can't speak on behalf of anyone else, but i know that it's often the case that the people who are most affected have the smallest voice, if they have any say at all. so as well-meaning as the rust team is, or any dev team for that matter, they won't even know that their decisions to drop support or leave things broken will negatively impact tons of people who rely on core utilities working for hardware they still actually use. but again, i can't speak on their behalf because i just don't know.

how it impacts me, specifically

ok, now i want to talk about some fun/interesting things. i've been trying out various languages over the past couple years, trying to get a feel for broader programming quirks (like these) and how different languages might solve them. lately i've been getting back into c, which of course just works on everything (after enabling every warning known to mankind through esoteric flags, and STILL getting build errors on other systems, because c is as much of a mess as the hardware is). but c really sucks, so what are my alternatives?

for various reasons, i can't use rust, even though i'm pretty familiar with it. my boyfriend has rubbed off on me and now i ALSO use 16 year old hardware to program on (it's really cozy) and rust compilation can take 15 minutes for anything more complicated than a small 1000 line program (unless you write everything yourself and bring in as few dependencies as possible, in which case it'll still take 7 minutes). plus, i somehow don't trust that the compiler will emit the right instructions now that i know i686 doesn't actually mean i686, and that it's built on a compiler backend (llvm) which gets it 10 times more wrong than rust does. rust has to do a lot of work to get around the jank, and who knows when that'll break as time goes on.

i wanted to try zig; lots of people have been suggesting it. but the nearly nonexistent support for 32 bit builds makes it a non-starter, unfortunately. i want my builds to run on a pentium 3 at the very least, and zig technically only supports those 8 or so sse2-capable 32 bit processors, which don't include the pentium 3.

i even wanted to get back into c++. i've been using it a bunch for arduboy games, and as long as you don't do anything crazy with it, it's nice (compared to c). but the c++ compiler is basically just clang, and clang is llvm, and blah blah blah you get it.

sooo i guess c with gcc it is...
wait...

go??

i recently picked up go and have been using it for all my projects. i actually really like it, despite the flaws. it doesn't FEEL like a low level systems language: it has a garbage collector, runtime assertions, a giant standard library, crazy and cool concurrency concepts built in, etc. and yet, out of all the languages i've mentioned (other than c), it's the only one with a custom-written compiler. it's not based on llvm, so it doesn't bring in llvm's problems.

even so, i was surprised to find that go can compile binaries with software floating point operations, meaning (in theory) it actually can create binaries which run on basically all 32 bit processors, including the ancient i386. my boyfriend was able to run my go app compiled with software floats on his pentium 3 with no issues whatsoever.

for those interested, the things to set are GOARCH and GO386, like so:

GOARCH=386 GO386=softfloat go build <etc>
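(for what it's worth, GO386=sse2 is the default and softfloat is the only other value; the old x87 mode, GO386=387, was dropped back in go 1.16, which is why softfloat is the path to pre-sse2 chips.)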

floating point, like i hinted at before, is notoriously difficult for compilers to handle, mostly because of all the changes hardware went through in order to support the complicated hack that is IEEE 754. emulating it in software is of course not an option if you have high performance needs, but most "regular" tools like grep don't really need float calculations, or if they do, they don't need them to be fast. software floats aren't even that slow, relatively speaking: we're talking hundreds of cycles per operation, not thousands or millions. that adds up big time for science and graphics, but doesn't matter at all for most tools.
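if you want to see it for yourself, here's a minimal sketch (the file name and the loop are made up for illustration; the point is that both builds print the same thing):

// softfloat_demo.go: with GO386=softfloat, the compiler lowers these
// float64 operations to calls into go's software float routines
// instead of emitting sse2 instructions.
package main

import "fmt"

func main() {
	sum := 0.0
	for i := 0; i < 1000000; i++ {
		sum += float64(i) * 0.5
	}
	fmt.Println(sum)
}

build it both ways; the binaries print the same number, but only the softfloat one will run on a pre-sse2 chip:

GOARCH=386 go build softfloat_demo.go
GOARCH=386 GO386=softfloat go build softfloat_demo.go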

final thoughts

i'm aware that there are other modern languages that would probably work for my needs (such as nim), and i'm probably going to try them out in the future. i just wanted to bring some of these issues up and thought it was interesting that a language i associate more with python than "systems languages" (go) ended up having the widest range of support. it's also extremely fast, nearly as fast as c. i've recently been learning software rendering and wrote my first trials in go, and was able to achieve 60fps with simple scenes. rewriting it in the fastest c code i could muster, i was only able to double it to 120fps (see gif up top). turns out go is a pretty dang fast language. software rendering is WAY more complicated and taxing than any normal app i'd write, so i'd say go produces very fast binaries.

it's unfortunate that all these things that underpin modern technology have so little care for older hardware, but i can't say i blame them entirely. the hardware landscape is an absolute mess; the drive for profits meant we got some new cpu extension every year, outmoding old hardware at a rapid pace and creating a landscape that is REALLY hard to account for. developers don't have infinite time, and these are all open source projects, either built for free or funded by the very same companies who aren't going to pay developers to spend tons of time making everything work on 20 year old hardware, even if they wanted to. which i guess makes it kind of surprising that the google-backed language was the one that made it work.

i don't know if a single person reading this is actually impacted at all by any of these issues. even my boyfriend knows that this is a hobby for him; he has newer hardware lying around which he could use, he just doesn't want to. so from our privileged positions, it's truly hard to say what's right or wrong.

if you read this far, thank you so much for taking the time to read my thoughts. i hope you learned something, or were at least entertained. i hope to write up some more about my adventures with software rendering; it's a really fascinating topic that i'd really like to talk about.



in reply to @haloopdy's post:

nice writeup! yeah, i've been dabbling in homebrew for the 386-based FM TOWNS and hit the same wall with llvm-based compilers as you did, so i've stuck with gcc. i noticed, though, that even gcc has some pretty weird regressions in its tuning for older intel CPUs which, while still producing correct code, lead to far-from-optimal codegen choices:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115690

another reason that modern compiler backends prefer to assume sse2 as a baseline is that x87 floating-point is…extremely weird to generate code for. the x87 register file acts halfway like a stack, halfway like a normal register file, in a way that may've made sense for assembly hackers and single-pass compilers in 1981 but which is tricky for modern SSA-based register allocation algorithms to generate good code for

EDIT: oh jeez i forgot to say: thank you!!

oh yeah, i'm aware of the madness of x87. i mentioned it in passing in the post (through a link only) but yeah that's a whole topic in and of itself lol. it's most likely the reason why go's only support for non-sse2 32 bit processors is through the use of software floats, which imo is a good alternative for most apps if you don't want to deal with x87 (i certainly wouldn't)...

and wow; i'm both surprised and not by the gcc thing. i'm sure all that code was last touched absolutely ages ago and nobody knows (a) what it does now and (b) whether it even works. so i guess it's surprising it still does? lol

This is honestly the first thing I've read about Golang that I actually respected. That's pretty rad.

I feel the pain. I hate C/C++ but I love old hardware and would love to code more for it, but finding the bridge between old hardware support and a language I can actually stand to code in, as an FP-poisoned weirdo, is just ... kinda not doable. Even when I do find stuff that kinda works ... it really doesn't, and I get so exhausted from the effort to even get the damn compiler/editor/etc. working that I give up before actually trying to build anything.

I have gone down some truly weird blind alleys in the process too, from Forth to Visual Basic to combing through old archives to find early versions of Scheme that ran on Amigas ... the hunt ever goes on. Part of me thinks it'd be great to see a "modern-ish" lang specifically targeted at older hardware, and wishes I had the compiler dev chops to make it happen ... but first I'd have to find something that already exists for that hardware that I'd like coding in ... and now we're back where we started.

yeah... i know a lot of the other design decisions are questionable and people don't like the attitude of the creators (which also put me off the language for a while), but i've been enjoying everything i've written in go.

aw i feel that exhaustion, and i've definitely been there many times where i just give up after spending all my effort on "setup".

oh gosh, visual basic. idk, i guess it's usable, but the com stuffffff... i feel like i'm lost in some horrible bureaucratic nightmare, like i have to go through several layers of management before i can go "please microsoft, run my binary".

as for the modernish language targeting old hardware, i feel like that's what rust was supposed to be, considering all the work they put into supporting old hardware. and to be fair, you CAN compile rust for all the old stuff i mentioned, you just have to sidestep the potholes. and, you can even build binaries with softfloats just like you can in go. the problem with rust ofc is that you can't program it ON the old hardware itself, you have to compile it on some fancy modern machine with a million cores and then transfer it to the old hardware. idk, i just didn't want to do that, so i get not having rust as an option.

I found, actually, that there was something a little relaxing about making little toy apps like dice rollers in modern BASICs like VB or Gambas. I even discovered that there is a fricking GUI library for QB64 now.

Something about hearkening back to the "not giving a fuck" era of my coding life. I have no doubt that coding in anger could quickly get miserable and unmaintainable, but for little utilities and such it was fun to just not worry too much about "correctness" and make a little toy.

Plus I have always wanted to learn more non-web GUI programming, but the experience usually involved interacting with horrifically complicated C++ APIs from hell, and ... no thanks.

i have indeed myself been affected; my poor pentium 3 also suffers. and i have spent quite a bit of personal effort pushing against the current state of things, but at some point it became too much effort compared to just using different (older) software. i wish things were not this way

there are dozens of us!

i'm sorry you've also been affected. i totally understand not wanting to keep pushing against this; i had no idea any of this was an issue until a year ago, and i think that's where most people are. so not only is it tough to get anyone to care, it's tough to find anyone who understands and will back you up.

This is a very interesting writeup and a fun topic: once again a bit of insight into FOSS devs' 'priorities' and the ever-present, heavy albatross that is x86 as a whole. but we love it, don't we folks

There's definitely room for improvement, but I appreciate you being nuanced. I've seen some truly dire computer labs in various countries, but I can't imagine many production or education environments running into these issues with how cheap and available x64 hardware is.

yeah, that's why it's so tough. we could complain about the state of things but linux support is astounding relative to the offerings from big tech. netbsd is one of the only things with arguably broader support.

it's really frustrating when open source software does such a bad job supporting older hardware. for some reason i expected older hardware to just be super well supported, because you can just keep the old software that worked on it?? maybe? unfortunately that's not how real life works lol.

makes me want to make a programming language targeting super broad support on older hardware, with compile times and ram usage low enough that it actually works (like someone mentioned in another comment). i was trying to compile haskell programs on a system with 1 gb of ram and ofc it crashed just trying to install it.