wbmw: Your condescension is unnecessary. As I have developed compiler back ends which included code optimizations similar to what is done in TMTA’s CMS, I think I am qualified to speak with a modicum of authority on these matters.
No kidding? I wouldn't have guessed it. How recent was it, and what was the target architecture? If it was recent work and the target was a VLIW or a wide-issue in-order superscalar, then perhaps you could explain how the work of Mahlke and Hwu influenced your optimization strategy? If not, then what relevance does it have to the discussion of the potential architectural speed-up of the 8000 over the 5x00 series?