Yes, I still use the same hard disk platter as a drink coaster. But I need more ISA cards in my collection.

posts from @cr1901 tagged #itanium

also:

Catfish-Man
@Catfish-Man

I greatly enjoyed this article about how to design CPU instruction sets. Lots of interesting perspective that I hadn't properly considered before.

Probably my favorite part was a piece of straightforward arithmetic that I could have done but hadn't thought to: branch mispredict rate * (reorder buffer size / typical basic block length)

With a seven-stage dual-issue pipeline, you might have 14 instructions in flight at a time. If you incorrectly predict a branch, half of these will be the wrong ones and will need to be rolled back, making your real throughput only half of your theoretical throughput. Modern high-end cores typically have around 200 in-flight instructions—that is over 28 basic blocks, so a 95% branch predictor accuracy rate gives less than a 24% probability of correctly predicting every branch being executed. Big cores really like anything that can reduce the cost of misprediction penalties.
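For anyone who wants to poke at the numbers in that quote, here is a rough sketch of the arithmetic. It assumes one branch per basic block, whole basic blocks only, and independent predictions, none of which the article states outright; the function names are made up for illustration.

```python
def branches_in_flight(in_flight_instructions: int, basic_block_length: int) -> int:
    """Whole basic blocks that fit in flight (assumed: one branch per block)."""
    return in_flight_instructions // basic_block_length


def p_all_predicted(accuracy: float, branches: int) -> float:
    """Chance that every in-flight branch was predicted correctly,
    assuming each prediction is independent of the others."""
    return accuracy ** branches


def expected_mispredicts(accuracy: float, branches: int) -> float:
    """Mispredict rate times the number of branches in flight."""
    return (1.0 - accuracy) * branches


# Figures taken from the quote: 7-instruction basic blocks, 95% accuracy.
big_core = branches_in_flight(200, 7)    # ~200 in-flight instructions
small_core = branches_in_flight(14, 7)   # seven-stage dual-issue pipeline

for name, branches in (("small", small_core), ("big", big_core)):
    print(
        f"{name}: {branches} branches in flight, "
        f"P(all correct) = {p_all_predicted(0.95, branches):.3f}, "
        f"expected mispredicts = {expected_mispredicts(0.95, branches):.2f}"
    )
```

With these inputs the big core holds 28 branches in flight and the all-correct probability lands just under 24%, matching the quote, while the small core with only 2 branches in flight stays around 90%.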

I knew branch prediction was critical (Dan Luu's article on that is my favorite), but hadn't internalized just how fast the numbers get bad for big out-of-order cores.


cr1901
@cr1901

Quoting somebody else: "Itanium failed because compilers that generate good code for it cannot exist".

(What is meant here is that "the average basic block length of real programs, 7 instructions, is too short for VLIW to take advantage of, and smart compilers/assemblers/programmers cannot salvage that." The one exception is DSPs, which are well-suited to doing math without branching, and thus DSPs are (were?) VLIW.)