mcc
@mcc

Did you know this? Apparently if you bought a copy of Super Mario Bros. 3 in the US, depending on when you bought it the "BROS." in the logo is either aligned left on the label or right.

If the logo on your cart is aligned right, you can't get a world record under current speedrun strategies.

And until last year, nobody realized this was a problem. It was believed the versions were functionally identical. The ROMs were known to be different but as far as anyone knew all that changed was typo fixes in text boxes.

In fact, the "Right Bros" carts are slower— by about 26 frames, or a little under half a second, in a full-game run¹.

Now who, you ask, would possibly care about that?

This is MitchFlowerPower. He is— there is little question of this— the best Super Mario Bros. 3 player in the world. Poor Mitch here is trying to make a living in our post-capitalist hellscape entirely by playing Super Mario Bros. 3 and posting about it on the Internet, so in this video he stalls for about ten minutes (right until the point where YouTube normally shows an ad), then does a sponsored pitch for a T-shirt company², before he actually gets to the point. This is why my link above jumps directly to 10:19. Do not blame Mitch. The market makes monsters of us all.

Possibly nothing he says will make sense to you anyway.

I will explain. In warpless SMB3 runs, there's an early (second world) instance of randomness:

Map of Super Mario 3 world 2

At the end of each level, the hammer brothers do a random walk around the map. It's unlikely, but possible, that the HAMMER brother can just book it from the start and get to the left of the MUSIC BOX brother. If HAMMER makes it to the near side of level 3 by the time you beat the fortress, you can use the hammer item to break the rock and thus skip levels 3, Desert, 4, and 5 and save an entire minute. Mitch's Warpless world record margin is just 36 seconds, so clearly getting this randomness is make-or-break.

Grinding to get the random early hammer was brutal even by the standards of the RNG-heavy Mario 3 run, so Mitch started to use a "manipulation". As you might know, true randomness is hard in a computer and so computers use "Pseudo-Random Number Generators". These are algorithms that have non-random but at least very unpredictable outputs, and you "seed" them with some source of real-world entropy— on an old computer like an NES³ this might be something like what buttons the player pressed when, or how many seconds passed between bootup and the player pressing START to begin the game. What's interesting is that although PRNGs produce apparently random numbers, they produce the same random numbers every time, as long as you feed in the same seed. This is the principle that makes TASes possible.

Some speedruns exploit this with "RNG manipulation", where you play the same way every time to get a known random sequence. Some games make this easier than others. In some RPGs you just have to make sure you press buttons in the same order every time, timing irrelevant; in FF7 encounter randomness is governed by only the number of steps Cloud has taken. In Super Mario 3 timing does matter, which means in order to manipulate the hammer brothers Mitch would have to play like a TAS. So Mitch learned to play like a TAS. He would literally run a video of an early-hammer TAS on a monitor and mime along the movements while he played until he matched the TAS exactly.

And it wasn't working.

And this is where Mitch's video above, starting from 10:19, starts to make sense. After weeks of poor results even when he played perfectly Mitch started talking to a reverse engineer and discovered Mitch's copy of the game was the problem. Mitch was playing on a "Right Bros" copy; the TAS was made against a "Left Bros" copy. If you run the TAS on a "Right Bros" ROM, it doesn't work— it actually desyncs and fails to finish the game.

What's the difference?

This is where it gets bizarre. At the end of each level or pipe, SMB3 does some calculations on the player's score, to decide whether to spawn the treasure ship (sum of score digits is a multiple of 11) or the match-two game (score passed a multiple of 80,000). For reasons unknown, this calculation is slightly slower on the newer, "Right Bros" copies. The calculation is so slow it can cause the game to miss a frame deadline. On "Left Bros", if the score digits total to 29 or higher, the game lags by a frame. On "Right Bros", if the score digits total to 23 or higher, the game lags by a frame. Which means if the score digits ever total between 22 and 29 at the end of a level, then "Left Bros" and "Right Bros" desync, the TAS is no longer accurate and the manipulation will not work. Mitch switched to a "Left Bros" copy and two months later got the world record.

Infosec-minded people will have a slow-dawning realization around this point in the post/video that what these speedrunners have haltingly worked their way into uncovering is a DJB-style timing attack performed on the video game. The lag frame ruining Mitch's runs is a sidechannel state leak, of the same sort that leaks encryption key data if your encryption code uses operations, like multiplications, whose runtime varies depending on the input. If Nintendo had (for some reason) used constant-time algorithms in their end-of-level score check arithmetic, the version difference would not have been a problem.

Further elevating the proceedings of this attempt to play a thirty-year-old video game quickly, at the end of his explainer video Mitch says something fascinating:

"I can't take all the credit for figuring this out. I can only take credit for asking the right questions." – Mitchflowerpower

Put that in your philosophy-of-science quote database.


¹ I perform the following mildly-iffy calculation:
# Repeat 10,000 times and average
perl -e '$hittotal = 0; $hitcount = 0; for(1..10000) {
# 82 exits in a warpless run. Source: Mitchflowerpower
$a=0; $hit=0; for (1..82) {
# For each exit pick a random score less than 100% WR final score
$score = int(rand(1209360)); $d = 0;
# Total digits in score
while ($score > 0) { $d += ($score%10); $score = int($score/10); }
# Test for left- and right- bros lag thresholds
$ca = $d < 29; $cb = $d < 23; $cc = $ca != $cb; ("$d: $ca; $cb; $cc;\n");
# Over time count up "Extra" lag frames (lags in right- but not left-)
if ($cc) { $hit++} } ("$hit\n"); $hittotal += $hit; $hitcount++; }
$final = $hittotal/$hitcount; print("$final\n");
# Average extra right-bros lag per game, in frames:'

26.3288
Over a full-game run, per this ballpark estimate, you get a lag frame on about 32% of all level exits. (Of course, they won't be evenly distributed over the run; even Mitch only cares about lag frames before level 2-3, and I assume because your score is lower then it will be easier to get a total under 23 to start with.)

² Actually the t-shirts seem pretty decent.

³ Again I'm covering things here that programmers already know, but slightly more modern computers and video game consoles have internal clocks, which are an adequate source of randomness for games (though not cryptography), and even more modern computers, say the Pentium III, will have actual hardware randomness sources built in. My favorite trick is just measuring the CPU temperature and sampling the twentieth digit. Of course, these entropy sources are typically used only to seed the PRNG, not to actually generate each random number, which means speedrunners playing games for these more modern systems will still do absolutely wild trickery to do black-hat-infosec-style seed derivation and pull off an RNG manip anyway.

⁴ This isn't a real footnote, but incidentally, if anyone complains that above I said you "can't get a world record" in Right Bros when what I literally meant was you "can't get a world record in the Warpless any% category", I will make a face. I will make this face :/

⁵ Cartridge photo at the top is by Reddit user stang189


You must log in to comment.

in reply to @mcc's post:

To be honest, I thought the PilotWings thing was found by chance several years after the fact when emulator devs we're attempting to dump the contents of the DSP1 ROM.

Knowing there was a second revision, one of said devs swapped DSP1 with DSP1B on a lark, let the demo play, and noticed the plane crashed in the DSP1B version.

I didn't know PilotWings actually shipped w/ DSP1B in addition to DSP1.

oh my god that Wind Waker RNG manip. oh my god. that's so... people are so beautiful and so smart and what comes out of that is "this speedrunner is going to do data-entry so quickly and accurately WHILE ALSO PLAYING THE GAME that this program will use the data and timing to get a very, very good estimate of the state of the PRNG of the N64." I am dumbfounded. My reality is clobbered.

Disclosure: The originally posted version of this post had some wonked up math in one point, although everyone seemed to understand what I meant. The math error has now been fixed. In my defense this is all extremely cursed

"For reasons unknown"

One of the video comments from Youtube user Dwedit suggests a shift in page boundaries:

"Now why would a simple Typo fix cause timing differences? Because the "Branch" instructions on the 6502 take one additional cycle if they cross a 256 byte boundary ("page"), such as branching from $80xx to $81xx. If you add or remove bytes in places, or otherwise rearrange the code, then the location of branch instructions changes. Branch instructions that did not use to cross a page will now suddenly cross a page, or vice versa."