"disassembly" is a procedure by which you can turn an unambiguous sequence of bytes that have one interpretation into a mushy structure that takes extra work to figure out how to execute. i do not like the wrinkle of being able to disassemble 3300 into (xor, eax, [rax]) because that representation also suggests your disassembler could one day say (xor, [rax], [rax]). but that will never happen! so you develop checks and edge cases for things that were literally impossible without this mushy intermediate representation.
i don't know what to do with this and i'm not sure if it's actually better for anyone if the computer could inline the exact logic to handle some decoded instruction and not a word more. maybe the explosion in generated code is worse overall. buh.
i've been marinating over @dougall's comment here and continuing to stew over "using a disassembler makes my gbc emulator extremely slow". the real problem i'm dancing around in OP is "uses of a disassembler are usually specialized". "print this instruction as text" counts as specialized, for example. SO FAR AS I CAN IMAGINE, really generic "i'd love to have an IR describing this instruction" stuff only becomes useful when you're doing code analysis, and even then at the level of a disassembler it's still "specialized" in that it's just "for this instruction, produce that IR. then sometimes over there print an instruction's text as well".
generally, "a disassembler" would need to be split into "decode these bytes" and "do something with the decoded bytes" phases. but because many details about an instruction are found piecemeal as you decode bytes (e.g. you often figure out an opcode far before operands), it looks a lot like callbacks as part of a parser - because it is a parser, i guess...
so then you would have a function to decode bytes, parameterized on something providing a bunch of handlers, and "by default" that might be "put everything into a generic Instruction struct". then for specialized cases ("i'm scanning only for jumps, calls, rets"), you could leave every handler other than on_opcode_determined as a no-op.
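a rough sketch of that shape, to make it concrete. everything named here (`Handler`, `on_imm8`, `on_imm16`, `ControlFlowScan`, the `Opcode` enum) is made up for illustration and is not the yaxpeax API; the only name taken from above is `on_opcode_determined`. the opcode bytes are a tiny subset of sm83 (`nop`, `ld a, d8`, `jp a16`, `call a16`, `ret`).

```rust
// hypothetical names throughout; this is a sketch, not the yaxpeax API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Opcode { Nop, LdAImm8, JpImm16, CallImm16, Ret }

/// every handler defaults to a no-op, so an implementor only
/// writes the callbacks it actually cares about.
pub trait Handler {
    fn on_opcode_determined(&mut self, _op: Opcode) {}
    fn on_imm8(&mut self, _imm: u8) {}
    fn on_imm16(&mut self, _imm: u16) {}
}

/// decodes one instruction from the front of `bytes` and returns its length.
/// returns None for an unknown opcode or a truncated instruction; in the
/// truncated case the opcode handler has already been called.
pub fn decode<H: Handler>(bytes: &[u8], h: &mut H) -> Option<usize> {
    let (&first, rest) = bytes.split_first()?;
    match first {
        0x00 => { h.on_opcode_determined(Opcode::Nop); Some(1) }
        0x3e => {
            // the opcode is known before the operand bytes are even looked at
            h.on_opcode_determined(Opcode::LdAImm8);
            h.on_imm8(*rest.first()?);
            Some(2)
        }
        0xc3 | 0xcd => {
            let op = if first == 0xc3 { Opcode::JpImm16 } else { Opcode::CallImm16 };
            h.on_opcode_determined(op);
            let lo = *rest.first()?;
            let hi = *rest.get(1)?;
            h.on_imm16(u16::from_le_bytes([lo, hi]));
            Some(3)
        }
        0xc9 => { h.on_opcode_determined(Opcode::Ret); Some(1) }
        _ => None,
    }
}

/// the "by default" handler: put everything into a generic struct.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Instruction { pub opcode: Option<Opcode>, pub imm: Option<u16> }

impl Handler for Instruction {
    fn on_opcode_determined(&mut self, op: Opcode) { self.opcode = Some(op); }
    fn on_imm8(&mut self, imm: u8) { self.imm = Some(u16::from(imm)); }
    fn on_imm16(&mut self, imm: u16) { self.imm = Some(imm); }
}

/// the specialized handler: only jumps, calls, rets. operands are ignored.
#[derive(Debug, Default)]
pub struct ControlFlowScan { pub found: Vec<Opcode> }

impl Handler for ControlFlowScan {
    fn on_opcode_determined(&mut self, op: Opcode) {
        if matches!(op, Opcode::JpImm16 | Opcode::CallImm16 | Opcode::Ret) {
            self.found.push(op);
        }
    }
}
```

to use it you'd pass `&mut Instruction::default()` to `decode` when you want the whole instruction, or walk a buffer with one `ControlFlowScan` and advance by the returned length each time. since `decode` is generic over `H`, each handler gets its own monomorphized copy of the decoder, which is the property the whole idea leans on.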
if this all works out, it does address a wrinkle i've always been disappointed by w/ yaxpeax-x86: it's a horrible length decoder. rustc isn't smart enough (for good reason) to know that if a user only inspects the length of an instruction, it doesn't need to codegen most of the "save results" part of decoding. but in theory this would leave lots of branch arms entirely empty and ripe for the dead code eliminating.
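what the length-decoder case might look like, as a sketch. the trait and function names are hypothetical (not yaxpeax-x86 or yaxpeax-sm83 API), and the opcodes are a toy subset of sm83. the hope is that with a handler whose callbacks are all empty, the monomorphized `decode` has nothing left to do in each arm except return a length. i haven't checked the generated code, so treat the "it all gets eliminated" part as an expectation, not a measurement.

```rust
// hypothetical names; a toy subset of sm83 opcodes, not a real decoder.
pub trait Handler {
    fn on_opcode_determined(&mut self, _opcode_byte: u8) {}
    fn on_imm8(&mut self, _imm: u8) {}
    fn on_imm16(&mut self, _imm: u16) {}
}

/// decodes one instruction, reporting details to `h`, and returns its length.
/// None means unknown opcode or not enough bytes.
#[inline]
pub fn decode<H: Handler>(bytes: &[u8], h: &mut H) -> Option<usize> {
    let (&op, rest) = bytes.split_first()?;
    match op {
        // nop, ret
        0x00 | 0xc9 => {
            h.on_opcode_determined(op);
            Some(1)
        }
        // ld b, d8 / jr r8 / ld a, d8
        0x06 | 0x18 | 0x3e => {
            h.on_opcode_determined(op);
            h.on_imm8(*rest.first()?);
            Some(2)
        }
        // jp a16 / call a16
        0xc3 | 0xcd => {
            h.on_opcode_determined(op);
            let imm = u16::from_le_bytes([*rest.first()?, *rest.get(1)?]);
            h.on_imm16(imm);
            Some(3)
        }
        _ => None,
    }
}

/// a handler that saves nothing: every callback is the default no-op.
pub struct LengthOnly;
impl Handler for LengthOnly {}

/// length decoding is then just "decode, and ignore everything it tells you".
pub fn instruction_length(bytes: &[u8]) -> Option<usize> {
    decode(bytes, &mut LengthOnly)
}
```

the bounds checks on the operand bytes still have to stay, since a truncated instruction has no length to report. it's only the "save results" half that has a chance of disappearing.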
how horrifying. i guess i'll try it out on yaxpeax-sm83.