Moo

lesbrarian goat gal

Online, I do a little bit of art and a little bit of web design. Offline, I'm a children's librarian!
Art credit: pfp
No kids, no racists, etc.


Feed so it's in the data export
mooeena.bearblog.dev/

pancelor
@pancelor

this post now lives here: https://pancelor.bearblog.dev/adventures-in-pico-8-compression-mine1k/

minesweeper

I've been having a lot of fun recently with pico8 sizecoding, and the pico1k jam in particular. the jam rules are "make a game/thing in 1024 bytes or less". last year this specifically meant 1024 characters of code, and I made a free cell clone. it was so much fun! the jam feels like an open-ended zachtronics game. you set your own difficulty level (how complicated the game you're trying to make is), and then you slowly invent more and more lua crimes to make the number go down until you win! typing out return is too many characters -- set some global variables instead! horrifying -- delicious! etc.

this year, the jam rules were "make a game/thing; the (compressed) cartridge should be 1024 bytes or less". sounds similar... except hey, compression, yeah? in general, this seems to mean you can fit about 1.5~2x as much code, because text compressors are nifty like that. this is probably the reasonable way to approach this jam -- write some code, stop pretty soon after the number passes 1024. it's much harder to make consistent progress down towards 1024 bytes after you've passed it, because it's not as simple as removing characters -- it's much more squirrelly than that. if you have two functions, that's 16 characters of text, but you only pay the character cost once when compressed. the second time you type it, the compressor says "oh yeah, I've seen this guy before, I can store this small and easy"

I studied pico8's compression algorithm, curious to see what other lua crimes I could commit to make the number go down. I got sucked in and emerged from my cave a month later, smelly and bleary-eyed, fiercely gripping a 1024-byte minesweeper clone. 10/10, highly recommended. anyway, here's what I learned:

PXA file format

I could send you a 1000 byte file full of the letter E, or I could tell you "a file with 1000 Es in it", which only took like 30 bytes of english text to say. lossless compression! that's how it works, just, you know, smarter and more complicated than that.

pico8 stores its cartridges in a custom compressed format called PXA. basically, it stores data in chunks. each chunk is a specific type, CHR, REF, or RAW:

  • CHR(char): a single character, encoded in a move-to-front sort of way. this means that recently-used characters can be stored in less space
  • REF(offset,length): an offset and a length, telling the decompressor to copy that much text from earlier in the stream. e.g. if the file was function _update() end function _draw() end, that second function _ would probably be encoded as a REF with offset -23 chars, and length 10 chars. so, repeated strings in the input compress really well! this is the biggest thing to exploit.
  • RAW(length,char,char,char...): a length, and some raw bytes. this is a sort of fallback to make sure compression doesn't ever accidentally make the compressed file a lot bigger than the input file. (this can happen! in fact it has to happen -- any individual lossless compression algorithm can't make all files smaller. otherwise you could repeatedly apply it to any input file until you get a 0 byte file, which cannot possibly decompress back out your original input). these chunks are very rare, because the other two chunk types work well to compress most normal programs.

there's a fantastic visualizer tool that shows you exactly how your code is compressed:

PXAVIS screenshot

the font's a bit hard to read, but the top is my input code, the bottom is the way the compressor chunks it up. blue is good, red is bad. The red single-char chunks at the start are the move-to-front algorithm learning what sorts of characters I'm using, then they get yellower. eventually you start to get REF blocks that span multiple characters, referencing earlier parts of the text. see that huge dark blue blob at the end? that text is an exact copy of the first part of the input, so it compresses very well -- those 71 chars (bytes) only take up 5.375 bytes when compressed!

what exactly are those 5.375 bytes? if I can understand how it compresses in detail, I can hopefully make my program smaller by writing code that's easy to compress. I spent some time studying the compression algorithm (see "appendix: PXA details" below). the details are important! but its a big dump of technical info and I'd rather skip straight to the crimes.

crimes! yes !

k so now's the fun part: coming up with new ways to make my code horrific but slightly smaller when compressed. after I got down to 1200 bytes or so, every byte was a struggle. but after learning exactly how CHR works I started thinking things like "ok so this variable appears near the keyword elseif the few times it's used, so if I rename it from w to l, then that will compress nicely with the move-to-front algorithm pico8 uses to store unexpected individual characters, because l is in elseif but w is not". it was really satisfying to see the number dropping mostly when I expected it to drop based on these sorts of variable renamings. it felt like picking up a sick new powerup.

some other harebrained things I tried, with some success:

  • changing all my loops (e.g. for j=1,8 do) to use the same index variable (j) and start number, so that the loop header compressed neatly into a larger REF chunk. doing this saved 13 bytes without changing the length of the program, which was totally bonkers
  • function is kinda free (REF chunks), so I used it everywhere. but in the end you still have to pay the cost of storing function and return the very first time, so inlining everything was smaller in the end
  • 15~=k%16-j%16and -15~=k%16-j%16and is the same size as abs(k%16-j%16)<2and because of those repeated strings, and also because b wasn't a char I used nearby. (this code is part of how I checked adjacent cells in my flattened 2D array, without wrapping around to the other side of the minefield)
  • don't store variables if you can help it. writing x=stat(32) to get the mouse's xpos and then using it ~6 times was 1 byte longer than just calling stat(32) 6 times. surprising! I think this is because I used x in similar contexts e.g. rect(x,... rect(x,..., so the x was already part of a REF chunk. so writing rect(stat(32),... instead just made those REF chunks slightly longer, which was free or nearly free
  • I tried renaming all my variables a,aa,aaa, etc, to try to keep the number of unique characters in my program down. it didn't help much -- REF chunks are 11 bits minimum, while CHR chunks are 6-10 bit almost always. what did help in the end was making a list of all keywords I was using (for, do,then,end,if,etc) and naming my vars letters from those words. saved about 20 bytes doing that. I considered naming something thend b/c of the shared substrings with then and end, but it didn't work out. some vars did end up being 2 chars long, even though many 1-char names were still available. I dream of an automated system to check all possible renamings, but luckily I stopped myself from falling down that rabbithole and never coming out
  • I spent a lot of time figuring out how to best encode my sprites. storing them in a simple char-per-pixel format works surprisingly well, since the ingame decoder is so simple to write and the encoded string compresses well (b/c the sprites are pretty regular). but in the end I know more about my data than a general text compressor does, so I was able to do much better using an RLE scheme based on code by JadeLombax
  • changed rect(0,0,128,14,3) rect(0,14,127,15,1) to rect(0,0,223,15,3) rect(0,14,223,15,1) because that 223,15 compresses better (I use the string 223 earlier in the code) and looks the same when drawn, even though the rectangles are being wayyy overdrawn offscreen

kthx for reading hope you enjoyed

I tried for a long time to get a separate 8-segment-display number font working, but I couldn't squeeze it in. oh well! that's really the only thing I had to cut. always hitting a zero on your first click would be very nice too, but I couldn't think of a small way to code that.

my goal for these jams has been to make a thing that feels good to play on its own, but then later when you hear that it's made in 1024 bytes you like, spit out your drink or whatever. I think I succeeded! but you can go play it and tell me what you think :)

the final game in action

appendix: PXA details

the pico8 wiki breaks down exactly how the bitstream works, and the source code for the compressor is available. the following quotes are from the wiki, and the tables are notes I made for myself after studying it:

CHR

If that header bit is 1, an index is read via the following:

-- read a unary value
unary = 0
while read_bit() == 1 do unary += 1 end

-- unary_mask ensures that each value of 'unary'
--   allows the encoding of different indices
unary_mask = ((1 << unary) - 1)
index = read_bits(4 + unary) + (unary_mask << 4)

This index is used as a 0-based index to the move-to-front mapping. The byte mapped by the index is written to the output stream. This byte is then moved to the front of the move-to-front mapping. (E.g. if the mapping is 0,1,2,3,4,5,... and the index is 3, the mapping is updated to be 3,0,1,2,4,5,...)

bit costn most recent chars
60-15
816-47
1048-111
12112-239
14240-255

REF

Otherwise, if the header bit is 0, an offset and a length are read via the following:

-- read the offset
offset_bits = read_bit() ? (read_bit() ? 5 : 10) : 15
offset = read_bits(offset_bits) + 1

-- read the length
length = 3
repeat
  part = read_bits(3)
  length += part
until part != 7

Then we go back "offset" characters in the output stream, and copy "length" characters to the end of the output stream. "length" may be larger than "offset", in which case we effectively repeat a pattern of "offset" characters.

bit costoffset
81-32
1333-1204
171025-32768
bit costlength
33-9
610-16
917-23
......

(total cost = offset + length cost; e.g. 8+3=11 to copy 3-9 chars from somewhere in the last 32 chars of text)

RAW

As a special exception, if offset_bits == 10 and offset == 1 (aka, the minimal offset is encoded with more bits than necessary), this is interpreted as a start of an uncompressed block. When this happens, right after the offset is read, 8-bit characters are continually read as groups of 8 bits each (without any byte alignment), until a null is found (0x00).

bit costlength
221
302
383
......

bonus! BWT sorcery

have you heard of the burrows-wheeler transform? it's completely messed up. essentially you "sort" your input file and then compress that instead. so, function _draw() cls() print("hi") end becomes something more like ""((()))_accddefhiiilnnnnoprrsttuw (something like that, but not exactly). BWT isn't actually a simple sort, but it does something similar that clusters characters together in an easily-compressible way. as the wiki page says:

The remarkable thing about the BWT is not that it generates a more easily encoded output—an ordinary sort would do that—but that it does this reversibly, allowing the original document to be re-generated from the last column data.

that's not relevant to pico8 compression at all, it's just a wild thing I came across while learning about compression for this.


You must log in to comment.

in reply to @pancelor's post:

the first thought i had was "what if i wrote a poem that could be compressed as small as possible?" sometimes i find myself trying to avoid repeating words in my writing for fun, i put a short poem i wrote with no repeating words into the source compression visualizer and it happened to be all red. no compression possible. i thought to myself, this can't be a very good compressor then! but then i changed a few characters and it suddenly compressed a few characters. somehow this poem is just an all-red poem. it seems like a totally different fun sort of challenge:

write something that pico-8 can't compress at all, but still does or communicates something!

I love this! never would have thought of it myself.

without having tried it yet, here's some strategies I would expect to be useful:

  • use lowercase instead of uppercase if you can; lowercase letters start out more expensive in CHR chunks until the compressor gets used to the characters you're using
  • shorter text (overall length) is more likely to not compress, because the more the compressor "warms up" (e.g. my first point here) the better it'll do
  • don't repeat words / substrings
  • try to rapidly switch between lots of different characters. e.g. acababcacbbbwxywzywwwxywzyzwxxxy has two different "sections", one that uses 3 chars and one that uses 4 chars. your "sections" won't likely be separated this neatly, but try to use >16 chars per section, and ideally >48 chars per section
Pinned Tags