
Yesterday I uncovered a bug I'd made a while back. I must have changed some timing, so it suddenly started happening for me.

We have some code that reads from a HID device using ReadFile with overlapped IO. I declare the overlapped object on the stack, issue the ReadFile, and call GetOverlappedResult with a timeout for how long I can tolerate waiting.

If the IO didn't complete within that timeout, I was just returning and moving on with my life.

ERROR ERROR YOU CAN'T DO THAT

the IO will complete at some later time, and the kernel will happily write into the OVERLAPPED structure (and the read buffer) you'd previously given it, even though the function that owned them has long since returned. in my case, that meant randomly corrupting the stack

the most confusing part was i caught the repro under time travel tracing and was absolutely baffled by:

  1. i set a memory write breakpoint in the debugger and it didn't trip when going through the trace, and
  2. i could step over a single instruction and see the memory change from good to bad, but the instruction was like writing to some register or taking a branch or something. definitely not writing to memory

a coworker with clearly more experience with this immediately guessed that it was an errant IO completion, then took an entire 2.5 seconds looking at the code, pointed, and said "you can't do that"

