SLIMEPATTERN
@SLIMEPATTERN

This is the story of one of the most headache-inducing bugs i've encountered since i was very young. Early on in the 90s i remember encountering mysterious bugs in my C and Perl and Python programs that i could never comprehend. I've looked back on those times and wondered how many times it was just me, and how many times it was an issue with the tools of the era. I usually come to the conclusion: it was me. But having this type of experience really makes me wonder.


So maybe 5 or 6 years ago i was working on a randomizer mod for Deus Ex (unrelated to the one someone else made that actually got released). One of the first things you would want for a randomizer is a pseudo-random number generator (PRNG) that you can seed. This is important for testing the randomizer, as well as for letting 2 different players play using the same randomizer seed and comparing their experience. Importantly, this seed should also be different than the seed the game uses for other random events like gunshot spread, and so on, because you want different players' randomizer effects to be the same when they enter Level 2 or whatever.

Deus Ex uses Unreal Engine 1, forked off a beta version of Unreal Tournament 99. Like all the Unreal Engine games pre-UE4 (with a handful of exceptions like Invisible War), DX is mostly written in a mix of C++ and Epic's in-house language, UnrealScript. So, to be clear, UnrealScript was in use in serious AAA projects between approximately 1995 and 2015.

UnrealScript is a lot like Java. It's very object oriented in philosophy, and the syntax and naming schemes are extremely similar. It's also its own thing in a lot of ways, like being case insensitive and so on, which is really "cool" when you use UpperCamelCase to identify types/methods and lowercase to identify local variables.

So, i needed a random number generator, so i peeked into object.uc. This object.uc file contains the base Object class from which all other classes are derived, as well as a bunch of global functions. It even contains operator definitions:

native(146) static final operator(20) int  +  ( int A, int B );

Like this one, which defines integer addition. native just means it is implemented in C++, so there's no function body here. This line of code happens to be the exact same in Deus Ex, UT99, and UT3, among games I checked.

So anyway, I found that object.uc just provides a generic global Rand() function and no way for me to set or get the seed. So obviously, I went with the next best thing: implementing my own PRNG. This is one of those things that sounds scary, but honestly it isn't so bad, normally. You don't need to actually understand the internals of these things. If you are doing something involving security, you probably oughta know a bit more about it, but for a video game, just go for it. There's plenty of nice off-the-shelf algorithms. Look what Doom does. I tend to go with one of the variants of XorShift, because it's very simple, easy to implement, pretty good, and real fast.

In this case I took a few minutes and implemented XorShift-32 in UnrealScript:

// xorshift32
function final int xorshift_next()
{
	seedX = seedX ^ (seedX << 13);
	seedX = seedX ^ (seedX >> 17);
	seedX = seedX ^ (seedX << 5);
	return seedX;
}
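
For comparison, here is a minimal sketch of the same three xorshift32 steps in Python rather than UnrealScript. It assumes the reference behavior of treating the state as an unsigned 32-bit value, which Python has to emulate with a mask. The names `MASK32` and `xorshift32` are made up for this example, and since the UnrealScript version works on a signed int, its outputs are not guaranteed to match these.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value inside 32 bits


def xorshift32(state):
    """Advance a xorshift32 state by one step (shift constants 13, 17, 5).

    The state should be nonzero: a zero state stays zero forever.
    """
    state ^= (state << 13) & MASK32
    state ^= state >> 17
    state ^= (state << 5) & MASK32
    return state


if __name__ == "__main__":
    s = 1
    for _ in range(2):
        s = xorshift32(s)
        print(s)  # prints 270369, then 67634689
```

The state is passed in and returned instead of living in a `seedX` field, which makes it easy to run two independent generators side by side.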

And I tested it out with Player.ClientMessage(xorshift_next());, printing me a bunch of nice randomish looking numbers up top where Deus Ex gives you those little messages about skill points or whatever. Sure, I didn't really truly determine their randomness, but we're not cryptographers, this is a video game, it should be fine, right?

So xorshift32 gives you a 32-bit integer, which can range from -2147483648 to 2147483647. Kind of excessive. So you need some way to take a random number that can be in a big range and turn it into a small range. So I did this:

// supposed to return a pseudorandom result in the range of [0, max)
function final int xrint(int max)
{
    return xorshift_next() % max;
}

The % here is the modulo operator, which just gives you the remainder of integer division. For a non-negative input, a remainder is always going to be within your nice friendly range of 0 to your desired maximum, so it's a good way to turn a really huge number into a small one.

So, fast forward some, everything seemed mostly fine, I was using these random numbers and things seemed random for a while. But then I started to notice that things felt a little off, like I kept getting similar results? What the heck?

So I grabbed one of my recent lines of code and I set it to print out the random numbers. Player.ClientMessage(xrint(12)); And I got a sequence that looked something like this:

0
0
4
0
4
8
0
8
0

Still, "random", just only randomly 0s and 4s and 8s. What the fuck? I was staring at this in deep confusion. Every so often, very rarely, I would get a 2 or something.

Okay, so um, I tried poking around at my implementation of xorshift32 but everything seemed fine there. I tried switching to the XorWow algorithm. I think Eniko helped me with the constants I used to initialize it, I forget.

// thanks to Eniko
// this is the xorwow algorithm by George Marsaglia
function final int xorshift_next()
{
    local int s;
    local int t;

    t = seedW;
	t = t ^ (t >> 2);
	t = t ^ (t << 1);
	seedW = seedZ; seedZ = seedY; s = seedX; seedY = seedX;
	t = t ^ s;
	t = t ^ (s << 4);
	seedX = t;

    seedI += 362437;

	return t + (seedI);
}

But no, we were still stuck in 084084-world. What the fuck? I was stuck wondering if the bitwise operators (<< and ^) are broken in UnrealScript? They're not as commonly used in that language. But I tested them out and everything seemed fine. What was wrong??

I tried printing out Player.ClientMessages at every step of the way and the numbers seemed right until the very end, after the % modulo. There was something going wrong either with the way the function was returning the result, or with the modulo. Both seemed incredibly unlikely. People shipped so many huge AAA games for more than a decade using that % operator, used everywhere in loops and all sorts of other things.

I tried to debug things, maybe something like this:

// supposed to return a pseudorandom result in the range of [0, max)
function final int xrint(int max)
{
    // UnrealScript locals can't be initialized in the declaration
    local int result;

    result = xorshift_next() % max;
    Player.ClientMessage(result);
    return result;
}

And I still got good old 0, 4, 8, blah blah

I tried out the % operator with a bunch of hand-chosen numbers and got right answers with all of them. I tried out running a loop for a few hundred numbers starting at 0 and got good looking results of the %. What was wrong??

At some point I had changed my function to this

// supposed to return a pseudorandom result in the range of [0, max)
function final int xrint(int max)
{
    Player.ClientMessage(xorshift_next() % max);
    return xorshift_next() % max;
}

I mean, maybe there's something wrong with the compiler and how it's assigning the value, or something? What could it be? But no, it's back to

8.000000
0.000000
0.000000
4.000000

this shit again. I tried switching PRNG algorithms again because I had been at it for an hour at this point but no luck. Then. Wait. Wait a fucking minute. It doesn't say 0, 4, 8 anymore, it says 0.000000, 4.000000, 8.000000. It's in floating point somehow? But how? i'm taking the result of xorshift_next() (integer) % by max (integer)? What?

My initial assumption was that ClientMessage is doing something weird. I check the code for ClientMessage (defined in pawn.uc) and there doesn't seem to be anything too obvious but it has the following signature:

event ClientMessage( coerce string S, optional Name Type, optional bool bBeep )

So coerce means "automatically convert the type for me from whatever type I was given". So it implies to me that yes, somehow ClientMessage is getting a float.

So I look back in object.uc at the operators. I need to double check that I haven't lost touch with reality here.

native(143) static final preoperator  int  -  ( int A );
native(144) static final operator(16) int  *  ( int A, int B );
native(145) static final operator(16) int  /  ( int A, int B );
native(146) static final operator(20) int  +  ( int A, int B );
native(147) static final operator(20) int  -  ( int A, int B );
native(148) static final operator(22) int  << ( int A, int B );

What? it's not here?

I ctrl-F for %.

native(173) static final operator(18) float %  ( float A, float B );

and all I find is this. There is no integer modulo operator in UnrealScript. There is no integer modulo operator in UnrealScript. There is no integer modulo operator in UnrealScript.

And so, if you take two integers and modulo them, it silently casts them both to floats, does the modulo in floating point, and then (if you assign or return the result as an int) silently casts the result back to an integer.

Now, why does this matter? Lol, lmao? I at least knew this. Have you heard of floating point imprecision? Plenty of people write code and they unfortunately aren't aware of this. There are much better explanations elsewhere, but the long story short is that the farther away you get from 0.0 in either direction, the less precise floating point numbers get. A 32-bit float only has 24 bits of precision, so past about 16.7 million (2^24) it can't represent every integer anymore, and the gaps keep doubling from there. Meaning, if you took a random integer from -2147483648 to 2147483647 and converted it to a floating point number, it would almost always get rounded off (in base-2) to a multiple of 4 or some bigger power of two, and a multiple of 4 modulo 12 can only be 0, 4, or 8. Hence: 0, 4, 8, and so on.
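
To make that concrete, here is a small Python sketch of what happens to an int on its way through a float-only `%`. It assumes the engine's float is an IEEE 754 single-precision value and that the float `%` behaves like C's `fmod`. Neither detail is confirmed by the engine source quoted here, and `to_float32` and `float_mod` are names invented for this example.

```python
import math
import struct


def to_float32(x):
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]


def float_mod(a, b):
    """Mimic int % int routed through a float-only operator (assumed fmod-like)."""
    return int(math.fmod(to_float32(a), to_float32(b)))


if __name__ == "__main__":
    # 2**24 + 1 is the first positive integer a 32-bit float cannot hold exactly.
    print(to_float32(2**24 + 1))                        # 16777216.0
    # Large values get rounded before the remainder is taken.
    print(1000000001 % 12, float_mod(1000000001, 12))   # 5 4
    print(2147483647 % 12, float_mod(2147483647, 12))   # 7 8
    # Small values survive the round trip unchanged.
    print(float_mod(1234, 12) == 1234 % 12)             # True
```

The last line is why testing `%` with hand-chosen numbers or a loop over the first few hundred integers looks completely fine: nothing goes wrong until the inputs get big.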

This couldn't be real, could it? People spent millions of dollars and years developing code with UnrealScript. Thousands of community modders used UnrealScript.

I checked: Unreal Engine 2: no integer modulo. Unreal Engine 3: no integer modulo. Integer modulo is used all over UnrealScript code for all sorts of things, especially loop conditions.
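
One possible workaround (a sketch, not necessarily what any shipped game or mod did) is to build the remainder out of the integer operators that do exist, something like `A - (A / B) * B`, and never touch `%` at all. The version below is in Python rather than UnrealScript. `trunc_div` and `int_mod` are hypothetical helper names, and it assumes the engine's integer `/` truncates toward zero the way C-style division normally does.

```python
def trunc_div(a, b):
    """Integer division that truncates toward zero (assumed C-style behavior)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def int_mod(a, b):
    """Remainder built only from integer divide, multiply and subtract."""
    return a - trunc_div(a, b) * b


if __name__ == "__main__":
    print(int_mod(2147483647, 12))  # 7
    print(int_mod(1000000001, 12))  # 5
    print(int_mod(-7, 3))           # -1, the sign follows the dividend
```

Because no float is ever involved, large values keep all of their low bits. Negative inputs still give negative remainders, so a PRNG helper would need to deal with the sign separately.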

Did nobody else notice?

Did it never come up? How could it never come up?

How many horrifying heisenbugs did this cause for people?

I guess if you access out of bounds on an UnrealScript array the game just keeps running and fills your logfile up with "Accessed none"? So, it's maybe hard to notice sometimes when or why things go wrong?

I can just easily imagine some poor gameplay programmer in the year 1999 on crunch time tearing out their hair over this. It's just. Strange.

It's the kind of bug that makes me think back to those times, as a kid, where I wrote code that mysteriously didn't work for no explainable reason. It just would seem like the computer was alive and refusing to work with you.

Now Playing: Dan "Basehead" Gardopée - Naval Base (from Deus Ex)



in reply to @SLIMEPATTERN's post:

oh hell, that's really bad. i wonder if any of the UE1 or UE2 games i worked on encountered this (i wasn't really touching the code on those so i dunno). or maybe they did, and just patched it internally and didn't tell anyone (seemingly quite common licensee behavior during those eras)