maybe i will simply use this account to subject you to rabbit holes that are relevant to exactly zero people. i warned u about following me & now you will suffer the consequences!!!
The Toon Boom Harmony TVG file format
as you know, people often use wicked sorcery to make objects move as though they are alive, in a process called “anmiaiton.” however, sometimes they will also use “computer technology” for this purpose, such as “Toon Boom Harmony,” a relatively popular 2D animation software. I find Toon Boom Harmony is notable for its very nice vector drawing tools, and especially its vector pencil tool.
The Harmony pencil tool creates strokes in a format not possible in other common vector graphics formats such as SVG: a Bézier spline with variable width. The thickness data is another Bézier spline, making this a Bézier-Bézier offset curve.
This way, you can just adjust your lines freely without worrying about messing up your line thickness, and vice versa. I think the people at Toon Boom are aware that this is pretty neat, because they make you pay extra for this feature.
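To make the "Bézier-Bézier offset curve" idea a bit more concrete, here is a minimal sketch of how such a stroke could be evaluated. It assumes cubic segments and assumes the thickness is a one-dimensional cubic Bézier over the same parameter `t`; neither is confirmed for Harmony, and every function name here is made up for illustration.

```python
from math import hypot


def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a one-dimensional cubic Bezier at parameter t in [0, 1]."""
    u = 1.0 - t
    return u * u * u * p0 + 3 * u * u * t * p1 + 3 * u * t * t * p2 + t * t * t * p3


def cubic_bezier_derivative(p0, p1, p2, p3, t):
    """First derivative of the cubic Bezier with respect to t."""
    u = 1.0 - t
    return 3 * u * u * (p1 - p0) + 6 * u * t * (p2 - p1) + 3 * t * t * (p3 - p2)


def stroke_edges(xs, ys, widths, t):
    """Return the two outline points of a variable-width stroke at t.

    xs, ys: the four control-point coordinates of the centre line.
    widths: four control values of the (assumed) thickness curve.
    """
    x = cubic_bezier(*xs, t)
    y = cubic_bezier(*ys, t)
    dx = cubic_bezier_derivative(*xs, t)
    dy = cubic_bezier_derivative(*ys, t)
    length = hypot(dx, dy)
    if length == 0:
        raise ValueError("tangent is undefined at this parameter")
    # Unit normal: the tangent rotated by 90 degrees.
    nx, ny = -dy / length, dx / length
    half = cubic_bezier(*widths, t) / 2
    return (x + nx * half, y + ny * half), (x - nx * half, y - ny * half)


# A straight horizontal line with a constant thickness of 2.
left, right = stroke_edges((0, 1, 2, 3), (0, 0, 0, 0), (2, 2, 2, 2), 0.5)
```

The centre line and the thickness are separate sets of control values, so you can move one without touching the other, which is the whole appeal.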
Now, sometimes, you might wanna take your Toon Boom Harmony project and export it just a little bit. Just get that data out so you can use it somewhere else.
You can render it to a raster image, but that’s no fun. You lose all the benefits granted to you by the vector format. Toon Boom Harmony also lets you export to PDF, but PDF, frankly, sucks, and it also does not preserve the strokes created by the pencil tool, because such data cannot exist in PDF (I think they get converted to outlines).
So maybe what you really want is to be able to read the files the drawings are stored in themselves. And by you I mean me because most people probably really don’t care.
While the Toon Boom project files are in semi-human-readable-ish XML, the drawing files are not. They’re in a proprietary binary format called “TVG” which I assume stands for “Toon Boom Vector Graphics” or something. I had a look around, and this format is not documented. It seemed nobody had tried to reverse engineer it either. So I decided I might as well have a go!!
The Toon Boom Harmony license agreement forbids that you “6.1.5. modify, reverse engineer, decompile, disassemble, or create derivative works from the SOFTWARE or its proprietary source code,” so I decided not to do that because i would probably go to jail forever. Also, frankly, reversing stripped & optimized code sucks. Probably. I imagine. not that., i would have ever, done anything ofthe sort ,
So, I’m not a lawyer, but I imagine it would be legal to treat the software as a black box and simply examine the file format itself using files I created, since those files are not part of the software.
Let’s open it up!!
that sure is lots of binary data! There are several interesting things of note here already:
- The file magic is probably
- For some reason, TVG files contain a “certificate.”
I, writing this right now, have future knowledge: this certificate is tied to your software license and is the same in every file you create. It’s identifying information, so I blurred it out.
I don’t know… why… you would do this? Putting, like, a certificate of authenticity™ in every single file a user creates? It’s certainly a very strange choice.
TVG seems to use a common pattern found in binary file formats: 4-byte tags, followed by a length, and then followed by the data. I think it’s so nice of them to use string tags like this instead of, like, numeric enums.
I don’t know what
ENDT are supposed to mean,
tOAA (off-screen) are very likely
the data for the four layers in a toon boom drawing
(underlay, color, line, and overlay).
Every tag is also accompanied by another 4-byte tag that either says
Given that ZLib is a compression library,
this is probably the data encoding.
Since we can read the
UNCO data just fine right here,
this tag probably means “unencoded.”
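As a rough sketch of what walking a tag/length/data layout like this looks like in code: the field order (tag, then encoding tag, then length), the 4-byte little-endian length, and the names `Chunk` and `iter_chunks` are all assumptions made for illustration, not something confirmed about TVG. It also ignores any nesting.

```python
import struct
from typing import Iterator, NamedTuple


class Chunk(NamedTuple):
    tag: bytes       # 4-byte ASCII tag, e.g. b"TVCI"
    encoding: bytes  # 4-byte encoding tag, e.g. b"UNCO"
    payload: bytes   # raw bytes, still encoded


# ASSUMED header layout: tag, encoding tag, little-endian uint32 length.
HEADER = struct.Struct("<4s4sI")


def iter_chunks(buf: bytes, offset: int = 0) -> Iterator[Chunk]:
    """Yield consecutive chunks from buf, starting at offset."""
    while offset + HEADER.size <= len(buf):
        tag, encoding, length = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        if offset + length > len(buf):
            raise ValueError(f"chunk {tag!r} runs past the end of the buffer")
        yield Chunk(tag, encoding, buf[offset:offset + length])
        offset += length


# Synthetic data for the demo, not a real TVG file.
demo = HEADER.pack(b"TVCI", b"UNCO", 5) + b"hello" + HEADER.pack(b"ENDT", b"UNCO", 0)
chunks = list(iter_chunks(demo))
```

If the real header turns out to be laid out differently, only the `HEADER` format string should need to change.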
There’s also another piece of identifying information here in the TVCI tag (toon boom vector… creator… information…? maybe?): the hostname of my computer and the software name. The hostname is again a very weird thing to include. Beware of sharing toon boom projects, I suppose…
Scrolling past the huge zlib blob, you can find a few more things at the end:
There’s another zlib blob in the
which is probably the color palette.
This is followed by
which seems to contain the byte offset of every listed tag,
(toon boom… table… of contents…?)
and finally some kind of cryptographic signature.
It seems all the interesting stuff is in the ZLib-compressed data… guess it’s time to unzlib some stuff! I started with the palette data since that seemed easier.
Basic Palette Data: Reversing Is So Easy
I would like to note it took me an unreasonably long time to decode the ZLib data, because I thought I could just shove those bytes into ZLib and it would work. It did not work. It took me several hours to find out (including reading the ZLib specification… because I wasn’t sure this was ZLib data at all!) that the first 4 bytes of the data are not, in fact, part of the ZLib data. They are just another length value. Specifically, the length of the uncompressed data. Only the bytes after that can be shoved into ZLib and decompressed properly. So that was a thing.
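For anyone who wants to skip those several hours, the fix can be sketched like this. The function name is made up, and the byte order of the length prefix is an assumption (little-endian here); the only part taken from the findings above is that the first 4 bytes hold the uncompressed length and the rest is ordinary ZLib data.

```python
import zlib


def decompress_blob(payload: bytes, byteorder: str = "little") -> bytes:
    """Decompress a blob made of a 4-byte length prefix plus a zlib stream."""
    if len(payload) < 4:
        raise ValueError("payload is too short to hold a length prefix")
    # The first 4 bytes are NOT part of the zlib stream.
    expected_length = int.from_bytes(payload[:4], byteorder)
    data = zlib.decompress(payload[4:])
    if len(data) != expected_length:
        raise ValueError(
            f"expected {expected_length} bytes, got {len(data)}; "
            "the byte order assumption may be wrong"
        )
    return data


# Round trip on synthetic data, built the same way.
raw = b"Black" * 10
blob = len(raw).to_bytes(4, "little") + zlib.compress(raw)
restored = decompress_blob(blob)
```

Checking the prefix against the decompressed size is a cheap way to confirm the length interpretation on real files.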
Anyway, un-zlibbing the palette data already reveals several immediately readable things:
You can see some text like “Black” (the name of the color) and “2022-05-18” (the name of the project I stole this TVG from). Since every second byte in the text is zero, I am going to assume that this is UTF-16 LE.
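The "every second byte is zero" pattern is what ASCII-range text looks like once it is encoded as UTF-16 LE, which is easy to check in Python:

```python
name = "Black"

# Each ASCII character becomes its byte value followed by a zero byte.
encoded = name.encode("utf-16-le")
hex_dump = encoded.hex(" ")

# Decoding with the same codec gets the text back.
decoded = encoded.decode("utf-16-le")

print(hex_dump)  # 42 00 6c 00 61 00 63 00 6b 00
print(decoded)   # Black
```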
Again there are 4-byte tags followed by length and data, so after annotating a little bit…
you can see that the color is actually just made from two tags:
00 00 00 FF, which is just the color’s
actual value in RGBA format (in this case, black).
TCID seems to contain identifying information
like the name of the color.
Toon Boom projects actually very conveniently have the palette
data also available in text form inside
The one for this project reads:
ToonBoomAnimationInc PaletteFile 2
Solid Black 0x0a46da1a56b5abe6 0 0 0 255
Solid White 0x0a46da1a56b5abe9 255 255 255 255
Solid Red 0x0a46da1a56b5abec 255 0 0 255
Solid Green 0x0a46da1a56b5abef 0 255 0 255
Solid Blue 0x0a46da1a56b5abf2 0 0 255 255
Solid "Vectorized Line" 0x0000000000000003 0 0 0 255
That very suspicious hex number there can also be found in the data (marked green). This is probably some sort of internal color ID.
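Since the text version is so readable, it makes a handy cross-check for whatever comes out of the binary data. Here is a minimal sketch of a parser for it, assuming one entry per line and only handling the `Solid` entries seen above; `PaletteColor` and `parse_palette` are names I made up.

```python
import shlex
from typing import List, NamedTuple, Tuple


class PaletteColor(NamedTuple):
    name: str
    color_id: int
    rgba: Tuple[int, int, int, int]


def parse_palette(text: str) -> List[PaletteColor]:
    """Parse the text palette format, assuming one entry per line."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith("ToonBoomAnimationInc PaletteFile"):
        raise ValueError("this does not look like a palette file")
    colors = []
    for line in lines[1:]:
        # shlex keeps quoted names like "Vectorized Line" as one field.
        fields = shlex.split(line)
        if len(fields) != 7 or fields[0] != "Solid":
            continue  # skip entry types this sketch does not know about
        _, name, color_id, r, g, b, a = fields
        colors.append(
            PaletteColor(name, int(color_id, 16), (int(r), int(g), int(b), int(a)))
        )
    return colors


SAMPLE = """ToonBoomAnimationInc PaletteFile 2
Solid Black 0x0a46da1a56b5abe6 0 0 0 255
Solid "Vectorized Line" 0x0000000000000003 0 0 0 255
"""

colors = parse_palette(SAMPLE)
```

The parsed `color_id` values are the ones to look for inside the decompressed palette data.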
The only thing left to figure out is the 10 bytes at the beginning. If I open the TPAL data for a drawing that uses two colors, the first byte at the beginning changes to a 2, so that’s probably the number of colors.
And finally, for the
79 00 00 00 00 00… well, I have no idea.
This seems to just be some sort of header before every color entry.
It doesn’t look important, though.
Well, that was easy! Surely the layer data in
tLAA will be no different!
Continued In: What The Fuck Is This Number Format