Is segmentation really such a big deal? Isn't it just adding some registers together to get an effective address?
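For reference, here is a rough sketch of what "adding some registers together" amounts to in 8086 real mode: the 16-bit segment is shifted left by 4 and added to the 16-bit offset. The function name `real_mode_linear` is made up for illustration, and the 20-bit wraparound models the original 8086 only. Protected mode is a different matter, since it adds a descriptor lookup and a limit check on top of the add.

```python
def real_mode_linear(segment: int, offset: int) -> int:
    """Linear address for segment:offset in 8086 real mode (illustrative model)."""
    segment &= 0xFFFF  # segment registers are 16 bits wide
    offset &= 0xFFFF   # so are real-mode offsets
    # Shift the segment left by 4 bits (multiply by 16), add the offset,
    # and wrap at 1 MiB because the 8086 has only 20 address lines.
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    print(hex(real_mode_linear(0x1234, 0x5678)))  # 0x12340 + 0x5678
    # Many segment:offset pairs alias the same byte:
    print(real_mode_linear(0x1234, 0x0010) == real_mode_linear(0x1235, 0x0000))
```

So the add itself is cheap. The aliasing and the wraparound are the sort of thing that has to keep working forever once software depends on it.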
The cruft list:
* number of registers
* x87 brain damage
* variable-length instructions
* bizarre old instructions like INTO and BOUND
* increment and decrement (which don't affect the carry flag!)
* those top-half-of-16-bit register things (AH, BH, CH, DH)
* unaligned loads and stores (which can be atomic even across cache line boundaries!)
* predicated loads and stores (VIA didn't do this in some versions of the C3)
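On the increment/decrement item: the quirk is that INC and DEC update the other arithmetic flags but leave the carry flag alone, so every flag-writing instruction after them has a partial dependency on the old flags. A toy 8-bit model follows, assuming hypothetical helpers `add8` and `inc8` that track only the carry and zero flags. It is a sketch of the flag semantics, not of any real decoder.

```python
def add8(a: int, b: int):
    """8-bit ADD: returns (result, carry_flag, zero_flag)."""
    total = (a & 0xFF) + (b & 0xFF)
    result = total & 0xFF
    return result, total > 0xFF, result == 0


def inc8(a: int, carry_in: bool):
    """8-bit INC: like ADD a, 1 except the carry flag passes through unchanged."""
    result = ((a & 0xFF) + 1) & 0xFF
    # CF is preserved from whatever instruction set it last; ZF is recomputed.
    return result, carry_in, result == 0


if __name__ == "__main__":
    print(add8(0xFF, 1))      # wraps to 0 and sets carry
    print(inc8(0xFF, False))  # wraps to 0 but carry stays as it was
```

The two calls produce the same result byte and the same zero flag, but only ADD reports the overflow in the carry flag.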