ok things were getting difficult because i was making bad design decisions
really just waiting to hit a "you have designed this in such a way that feature X will never work" moment -- a wiser, previous me
sooooo
- compute branches in X but only assert the jump request in M
- move all CSR reads and writes into M
- flush pipeline + frontend (with jump to PC+2) on SR write
- this is ok because SR writes count as "instructions that change PC" and are thus illegal in a delay slot
- asserting branches in M gives the one extra cycle needed to have SR.MD updated in time for the frontend to check if an access is legal
- PC+2 should be nice and available in M since it's the required value of SPC for a completion-type trap
- treat banked register reads like CSR reads
- the encoding is literally like this, and it means no passing around bank flags to everything
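rough sketch of the M-stage redirect rule from the list above, in python just to pin the logic down (not RTL). every name here is made up for illustration (MStageInput, Redirect, m_stage_redirect), the 32-bit PC mask is an assumption, and the ValueError is only a stand-in for whatever the real illegal-in-delay-slot handling ends up being:

```python
from dataclasses import dataclass
from typing import Optional

PC_MASK = 0xFFFFFFFF  # assumption: 32-bit PC


@dataclass
class MStageInput:
    """What the (hypothetical) M stage sees for the instruction it holds."""
    pc: int                      # address of the instruction now in M
    branch_taken: bool = False   # decided one stage earlier, in X
    branch_target: int = 0       # also computed in X
    writes_sr: bool = False      # instruction writes SR
    in_delay_slot: bool = False  # instruction sits in a branch delay slot


@dataclass
class Redirect:
    """Jump request asserted towards the frontend from M."""
    target: int
    reason: str


def m_stage_redirect(m: MStageInput) -> Optional[Redirect]:
    """Return the jump request M asserts this cycle, or None."""
    if m.writes_sr:
        # SR writes count as "instructions that change PC",
        # so they are not allowed in a delay slot.
        if m.in_delay_slot:
            raise ValueError("SR write in a delay slot is illegal")
        # Flush and refetch from the next instruction (PC+2), so the
        # frontend checks accesses against the updated SR.MD.
        return Redirect(target=(m.pc + 2) & PC_MASK, reason="sr_write")
    if m.branch_taken:
        # Target was computed in X; the request is only asserted here.
        return Redirect(target=m.branch_target, reason="branch")
    return None
```

so an SR write at 0x100 asks the frontend to refetch from 0x102, and a taken branch just forwards the target X already worked out. how the flush interacts with the delay slot instruction itself is left out of this sketch.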
only downside is we'll have a two-cycle branch penalty. i don't reaaaally want to do any branch prediction, and this is in line with the original design anyway (even though the original design could execute 4 instructions in those two cycles, where we only get 2)
