Context: this is part of a series of reposts of some of my more popular blog posts. It was originally posted in 2015 here: http://tomforsyth1000.github.io/blog.wiki.html#%5B%5BNaNs%20cause%20the%20craziest%20bugs%5D%5D. This version has been edited slightly for clarity.
This is the story of a fun bug. The challenge for the reader here is twofold. First, spot the bug before I give the secret away (and the title should have given a huge clue already, and filled those in the know with both dread and giggles). But second, and more pragmatically interesting to working programmers - figure out why this one took me several hours to narrow down and fix, rather than being immediately obvious after a bit of standard single stepping and sprinkling of asserts and all the standard bug triage that usually works pretty quickly.
So the context is some simple image processing. The high-level flow is this:
Read RGB image
Convert to YUV
Do some simple processing
Convert to gammaRGB
Quantise to lower-bit-per-pixel format according to dynamic image range
Dequantise and display to verify it all worked
The result was that the image was kinda right, but the quantisation was very jumpy, and the contrast of the image tended to go up and down unpredictably. I expected it to slowly change as the dynamic range in the visible scene changed - as very bright or dark things came in and out of view - that's what it's meant to do. But what actually happened was even tiny changes in the input caused large changes in the quantisation quality. The first thing I tried was disabling the "simple processing" part so it did nothing (which was the normal state anyway - it was for special effects). Same result.
I then narrowed it down to the part of the quantisation where I scanned the image and found the min and max of each channel. If I didn't do that and just set the min and max values to 0.0 and 1.0 for each channel, everything was stable - though obviously the quantisation isn't giving particularly good results if we do that.
Now as it happens, the code did not do each stage above for the whole image, one step at a time. Instead it did the whole first half of the pipeline for each pixel. So it would step through each pixel, do the YUV conversion and processing, and find the min/max values along the way, i.e.
vec3 max = -FLT_MAX;
vec3 min = FLT_MAX;
for ( int i = 0; i < imagesize; i++ )
{
    vec3 pix = image[i];
    pix = to_yuv(pix);
    pix = processing(pix);
    pix = to_rgb(pix);
    vec3 gamma = powf ( pix, 1.0f/2.2f );
    max = Max ( gamma, max );
    min = Min ( gamma, min );
    ... more stuff ...
}
... use min+max to quantise the image...
(note the use of -FLT_MAX, and not FLT_MIN. That has bitten me before as well. Go look it up. It's not the value you think it is! Use of FLT_MIN is almost always a bug. Maybe go grep your codebase for it just to check. Anyway, that's not the bug here)
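(If you don't feel like grepping, a two-line check against the standard <float.h> constants settles it - a quick sketch:)

#include <float.h>
#include <stdio.h>

int main ( void )
{
    // Print both and see which one is actually the most negative representable float.
    printf ( "FLT_MIN  = %g\n", FLT_MIN );
    printf ( "-FLT_MAX = %g\n", -FLT_MAX );
    return 0;
}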
Well, debugging a bunch of input images showed pretty sensible values for min and max coming out, so it wasn't horribly broken. But it was still unstable, and further investigation showed the quantisation was having to clamp values outside the range of min...max (that code was in there as a safety net, but shouldn't have been needed in this case). I single-stepped through this loop for a bunch of representative pixels and it all seemed to be working as expected, and yet min and max were clearly not producing the right answers in the end - they were not the minimum and maximum values! How could this possibly be? The code's just not that complex! I started to worry about a memory scribbler playing with the values of min and max on the stack, and other terrifying thoughts.
Then I happened to pick an image where the very first pixel was a fairly primary colour - let's say it was red: (1.0,0.0,0.0). It gets converted into YUV, which is some unintuitive triplet of numbers, and then back to RGB. The conversion to and from YUV was not quite perfect, because hey welcome to floating point, and what we got back was more like (0.99997, 0.00001, -0.00001). But that's easily close enough for image work, especially if we're going to quantise the result. But then we convert to gammaRGB by the standard method of raising each number to the power of 1.0/2.2 (similar to sRGB, but not quite the same).
Well there's your problem
Fellow floating-point nerds - spidey-sense tingling yet? Good. So that's the first part - the what - but why did it take so long for me to track this down? What would you expect the end results to have been? Surely I would have spotted that almost immediately, right? So there's clearly more fun to come.
For those not as familiar with floating-point arcana, let me explain. -0.00001 isn't very negative. Only a tiny bit. But that's enough. If you raise a negative number to a fractional power like 1.0/2.2, the result is undefined. powf() gives you back a result called a "NaN", which stands for "not a number", and it's a floating point value that... isn't a number. That is, it's a value that says "you screwed up - we have no idea what this is". Another example of where they come up is dividing zero by zero (1/0 is infinity, but 0/0 just has no sensible answer). In practice in "normal" code I also regard infinity ("inf") as being in the same category as NaN. In theory inf is slightly better-behaved, but in practice it rapidly turns into a NaN, since 0.0*inf=NaN, inf-inf=NaN, and so on. They're still usually a sign you have buggy unstable code.
And once you have a NaN, usually they spread through your data like wildfire, because unless you're deliberate about filtering them out, you can't easily get rid of them. 1+NaN=NaN, 0*NaN=NaN, and so on. So once you get one NaN, almost any operation involving it produces another NaN, and anything those results feed into produces more NaNs, and so on - they "infect" your data.
In one respect this is maddening, but in other ways you do start to rely on it. If you see a NaN in the output of a function, you know that somewhere in that function you screwed up, and there's only a few places you can generate a NaN, so you check them first. Square roots, divides, power functions - things like that. Normal add/sub/mul can't usually make a NaN/inf without first being fed a NaN/inf. And if you don't see a NaN in the result, you're probably fine - because if one ever got generated, they're usually pretty hard to get rid of and they hang around.
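Here's a tiny standalone demo of both halves of that - generating NaNs and watching them propagate. This is just an illustration, assuming the usual IEEE754 float behaviour; the 0/0 goes through a variable so the compiler can't get clever about it.

#include <math.h>
#include <stdio.h>

int main ( void )
{
    // Two classic ways to generate a NaN...
    float zero = 0.0f;
    float a = zero / zero;                     // 0/0 has no sensible answer
    float b = powf ( -0.00001f, 1.0f/2.2f );   // negative base, fractional exponent

    // ...and once you have one, ordinary arithmetic just passes it along.
    float c = 1.0f + a;
    float d = 0.0f * b;

    printf ( "%f %f %f %f\n", a, b, c, d );    // typically prints nan four times
    return 0;
}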
And that's what should intuitively have happened here. I raised a negative number to a fractional power, got a NaN, oops my bad, and of course the min and max of a range of numbers, one of which is junk, are clearly also junk, and then all the stages after that should have produced utter gibberish, and I would have tracked it down really quickly, right? Of course! Except that didn't happen. The min and max results weren't junk - they were totally plausible values for an image - they were just not the correct values.
Wait no THERE is your problem
You hard-core floating-point nerds are all laughing your asses off by now. But if you think you have the whole story - I promise - there's still more subtlety to come! For the rest of my audience - lemme 'splain some more.
I said NaNs don't usually keep quiet and stay where they are in your data - they infect and spread. But there is one very notable exception. Well, I guess two. The functions min() and max(). In fact there are lots and lots of versions of min() and max(), and all do subtly different things in the presence of NaNs. I could write a long and deeply nerdy post on all the varieties because we spent a long time figuring them all out when devising the Larrabee/Knights/AVX512 instruction sets (and at least three of those variants are IEEE754. That's the good thing about standards - there's so many of them!)
The details don't really matter to most coders, just be aware that all standard lessthan/equals/greaterthan comparisons with NaNs return false, except for "not equals" which is true. That is, when you ask the question "is x greater than a NaN" the answer is "no" because we have literally no idea what a NaN is - we don't even know its sign. Additionally, two NaNs of exactly the same bit pattern don't equal each other either, because they're not a "number" - not even a strange one like infinity - they're "don't know".
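If you want to see those comparison rules with your own eyes, a few lines of C will do it (nanf() and isnan() are standard C99, nothing exotic):

#include <math.h>
#include <stdio.h>

int main ( void )
{
    float x    = 0.351f;
    float qnan = nanf ( "" );            // a quiet NaN

    // Every ordering comparison involving a NaN is false...
    printf ( "%d\n", x > qnan );         // 0
    printf ( "%d\n", x < qnan );         // 0
    printf ( "%d\n", qnan == qnan );     // 0 - a NaN doesn't even equal itself
    // ...except "not equals", which is true.
    printf ( "%d\n", qnan != qnan );     // 1

    // Which is why isnan() (or the x != x trick) is how you detect one.
    printf ( "%d\n", isnan ( qnan ) != 0 );  // 1
    return 0;
}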
OK, so the burning question is - how were Min() and Max() implemented? It turns out, in a fairly obvious way, by doing result=(x<y)?x:y. Some people don't like the ?: operator, and I'm not a great fan, but it turns out not to be the problem here, you'd get the same result with traditional if() statements. For ease of discussion, let me expand those macros out, and also change the loop so it's scalar rather than vector, and give it a very small input "image".
#include <float.h>
#include <math.h>
#include <stdio.h>

int main ( void )
{
    float image[5] = { 0.0f, 0.1f, -0.0001f, 1.0f, 0.9f };
    float max = -FLT_MAX;
    float min = FLT_MAX;
    for ( int i = 0; i < 5; i++ )
    {
        float pix = image[i];
        float gamma = powf ( pix, 1.0f/2.2f );
        max = ( max > gamma ) ? max : gamma;
        min = ( min < gamma ) ? min : gamma;
    }
    printf ( "min=%f max=%f\n", min, max );
    return 0;
}
OK, well that's pretty simple - any decent coder should be able to have a go at figuring out the various possible values of min and max it might produce. This example also compiles, and you could run it and print out min and max and try to figure out why you get the results you do. I'll save you the bother - the answer is min=0.953 and max=1.0. How IEEE-nerdy you are should determine how many layers of WTF this should cause.
A first quick scan would suggest something like min=-0.0001, max=1.0.
But I've already pointed out that raising -0.0001f to a power will produce a NaN, so next guess is you'd expect max and min to end up with NaNs, right? Perfectly reasonable guess. But wrong. That would have been far too easy and quick to track down, and I wouldn't have written this post about it.
OK, but I also told you how special Min() and Max() are, and indeed in some definitions they explicitly discard NaNs, giving you the other answer - the one that is a number. So as I said, the floating-point nerds probably thought they had this solved long ago. They'd expect the NaNs to be ignored, and so the expected result there is min=0.0, max=1.0. And that would have been a very subtle bug as well, but probably so subtle we'd have never even seen it, because that is really really close to the actual range of those numbers, and this being image-processing (especially with quantisation), close enough is just fine.
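For reference, one of those NaN-discarding definitions looks roughly like this - it's the behaviour C99's fminf()/fmaxf() give you, where a single NaN argument is treated as missing data and the other argument is returned. This is just a sketch of the idea, not the library code I was actually using:

#include <math.h>

// IEEE754-2008 minNum-style behaviour: a NaN in one operand is treated as
// "missing data" and the other operand is returned; only NaN vs NaN gives NaN.
static float min_num ( float a, float b )
{
    if ( isnan ( a ) ) return b;
    if ( isnan ( b ) ) return a;
    return ( a < b ) ? a : b;
}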
Ah, but this isn't IEEE754-2008 Min() and Max(), this is just ?:, and that does awesome things in the face of NaNs. Let's single-step.
After the first iteration, min=0.0, max=0.0. Easy, no problems.
After the second iteration, min=0.0, max=0.351 (which is the surprisingly large result of gamma-correcting 0.1). So far so good.
Third iteration, gamma=NaN, and here's the obvious problem.
max = ( max > gamma ) ? max : gamma;
min = ( min < gamma ) ? min : gamma;
Is (0.351 > NaN)? No, it is not. Comparisons with NaN return false. So the new value of max is gamma, i.e. max=NaN. Well, that's what we were expecting alright. Our data is now screwed. The same thing happens with min - so they're both NaN, the world is a disaster, game over, the output is junk.
Er... isn't that your problem?
But wait, we know the final answer wasn't junk NaNs - it was plausible. Eh? Somewhere, the NaNs must have vanished again. Let's do the next iteration. So now gamma=1.0, min and max are both NaN, and again:
max = ( max > gamma ) ? max : gamma;
min = ( min < gamma ) ? min : gamma;
Is (NaN > 1.0)? Again, it is not - comparisons with NaN return false. So the new value of max is... oh, it's actually 1.0. The NaN vanished! And again, same with min. So now they're both 1.0.
And the last iteration is as you'd expect: max=1.0, min=0.953 (the result of gamma-correcting 0.9).
So if you think through the overall behaviour, the effect of the specific way this was coded was that NaNs were discarded (as long as the very last pixel wasn't a NaN-producer) and we did get the min/max of a range of values - but only the range of values after the last NaN. How subtle is that? And the NaNs were due to slight math wobbles on certain colours during RGB->YUV->RGB conversion, so they'd pop up in random places and come and go unpredictably, and every time they did, an entire chunk of the image suddenly got ignored or included for the purposes of min/max, and the quantisation and contrast would pop. This finally explains everything!
It's worth pointing out that this only happened because of the specific way min() and max() happened to be implemented in the library I was using. It is left as an exercise for the reader to consider what the results would have been for the equally valid, and you'd think completely identical:
max = ( gamma > max ) ? gamma : max;
min = ( gamma < min ) ? gamma : min;
or a number of other possible alternatives such as using the SSE min/max assembly intrinsics.
Oh, and the actual bugfix - the solution to all this? Very simple - just clamp the result of the YUV->RGB conversion to the range 0.0 to 1.0, and add a nice chunky comment block explaining WHY.
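In terms of the earlier loop, that fix is a one-liner. Here's a sketch using the same made-up vec3 helpers as before, plus a hypothetical Clamp() that pins each channel to a range:

vec3 pix = image[i];
pix = to_yuv(pix);
pix = processing(pix);
pix = to_rgb(pix);
// The YUV->RGB round trip can drift a tiny bit outside 0..1 (e.g. -0.00001),
// so clamp here - then powf() never sees a negative input and can never make a NaN.
pix = Clamp ( pix, 0.0f, 1.0f );
vec3 gamma = powf ( pix, 1.0f/2.2f );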