NOTE: this post has been reposted into the MathML era of posting here
One of the more infamous bits of code associated with John Carmack (albeit not invented by him) is the "fast" inverse square root function from Quake III, which computes $1/\sqrt{x}$ in an unconventional way that was faster than floating-point division and square root on old CPUs:
float Q_rsqrt( float number )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y  = number;
    i  = * ( long * ) &y;                       // evil floating point bit level hacking
    i  = 0x5f3759df - ( i >> 1 );               // what the fuck?
    y  = * ( float * ) &i;
    y  = y * ( threehalfs - ( x2 * y * y ) );   // 1st iteration
//  y  = y * ( threehalfs - ( x2 * y * y ) );   // 2nd iteration, this can be removed

    return y;
}
This bit of code is often held up as an artifact of inscrutable genius, incomprehensible to the feeble minds of us sub-Carmack intelligences, but if you can handle high school math, you can understand this function, and you can even use the same technique to write your own "fast" functions in a similar style. Let's walk through how this function works, refactor it to be less mystifying, and then reuse its techniques to implement another "fast" math function in the same style.
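If you want to run the function yourself while following along, note that the listing as written leans on two things a modern compiler may not give you: `long` being 32 bits wide, and pointer casts between `float *` and `long *` being allowed to alias. Here is a minimal sketch of the same computation that sidesteps both, assuming a 32-bit IEEE 754 `float`. The name `rsqrt_portable` is made up for this example and is not from the Quake III source; the magic constant and the single Newton step are copied unchanged from the listing above.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stdint.h>  /* uint32_t */
#include <string.h>  /* memcpy */

/* Assumption of this sketch: float is a 32-bit IEEE 754 value. */
static_assert(sizeof(float) == sizeof(uint32_t),
              "this sketch assumes a 32-bit float");

/* Hypothetical name; same arithmetic as Q_rsqrt, valid for positive inputs. */
float rsqrt_portable(float number)
{
    const float threehalfs = 1.5f;
    float x2 = number * 0.5f;
    float y = number;
    uint32_t i;

    memcpy(&i, &y, sizeof i);      /* read the float's bits without aliasing casts */
    i = 0x5f3759df - (i >> 1);     /* same magic constant and shift as the original */
    memcpy(&y, &i, sizeof y);      /* reinterpret the adjusted bits as a float */

    y = y * (threehalfs - (x2 * y * y));  /* one Newton iteration, as in the original */
    return y;
}
```

Calling `rsqrt_portable(4.0f)` should land within roughly 0.2% of the exact answer 0.5, which is about the accuracy you can expect from a single iteration. The `memcpy` calls look heavier than the pointer casts, but compilers typically turn them into a plain register move.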
@mbr64 has some more fun evaluating inverse square root implementations for contemporary CPUs here: