I've written special code paths for MMX, SSE and SSE2 for Mozilla, and it's a pain in the neck to maintain all three code sets. It can also get out of hand because processors within the same family behave differently: earlier Pentium 4s, for example, had different latency characteristics than later Pentium 4s.
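To make the maintenance cost concrete, here is a rough sketch of what runtime dispatch across the three instruction sets tends to look like. This is not the Mozilla code: the `CPU_*` flags, `add_bytes_*` functions and `select_add_bytes` are hypothetical names, and the per-ISA variants are plain scalar stand-ins where real code would use intrinsics or assembly. The feature mask would normally come from CPUID.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits; real code would fill these in from CPUID. */
enum { CPU_MMX = 1 << 0, CPU_SSE = 1 << 1, CPU_SSE2 = 1 << 2 };

typedef void (*add_bytes_fn)(uint8_t *dst, const uint8_t *src, size_t n);

/* Portable fallback: wrapping byte-wise add. */
static void add_bytes_scalar(uint8_t *dst, const uint8_t *src, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Stand-ins for the hand-tuned variants. Each one is a separate body
   that has to be written, tested and kept in sync with the others. */
static void add_bytes_mmx(uint8_t *dst, const uint8_t *src, size_t n) {
    add_bytes_scalar(dst, src, n);
}
static void add_bytes_sse(uint8_t *dst, const uint8_t *src, size_t n) {
    add_bytes_scalar(dst, src, n);
}
static void add_bytes_sse2(uint8_t *dst, const uint8_t *src, size_t n) {
    add_bytes_scalar(dst, src, n);
}

/* Pick the most capable variant the CPU reports support for. */
static add_bytes_fn select_add_bytes(unsigned cpu_features) {
    if (cpu_features & CPU_SSE2) return add_bytes_sse2;
    if (cpu_features & CPU_SSE)  return add_bytes_sse;
    if (cpu_features & CPU_MMX)  return add_bytes_mmx;
    return add_bytes_scalar;
}
```

Every routine you optimize needs its own row of variants like this. Feature bits alone also can't tell an early Pentium 4 from a later one, so tuning for latency differences would mean yet another level of selection on top.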