Yourbankruptcy, I was looking for actual benchmarks, not theoretical discussions of architecture. Dan3 tried, but as usual, he purposely pointed to programs that weren't supposed to run fast on Itanium (i.e. 32-bit programs running in emulation mode).
<4-way Opteron works practically as a single chip with 4 cores on die.>
Uh, no. On Opteron, every memory access must be snooped (or "probed," in AMD terminology) by every other processor in the system. Thanks to the ring structure of the HT links, that broadcast means every memory access in a 4-way system has a latency of at least four hops (two to reach the furthest processor, two to return).
Face it, the latency increases with every processor added to an Opteron system. Of course, Opteron makes up for this deficiency with added bandwidth per processor, but that's a long way from being equivalent to a quad-core chip.
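To put rough numbers on the hop count, here is a toy model. It assumes the processors sit on a bidirectional ring and that a memory access can't complete until the probe response from the furthest processor comes back. The function name is made up for illustration, and this counts hops only; it is not a latency measurement of any real system.

```python
def worst_case_probe_hops(num_cpus: int) -> int:
    """Round-trip hops to the furthest CPU on a bidirectional ring (toy model)."""
    if num_cpus < 1:
        raise ValueError("need at least one CPU")
    # Assumption: probes take the shortest way around the ring, so the
    # furthest processor is floor(N / 2) hops away.
    furthest = num_cpus // 2
    # The probe travels out and the response travels back.
    return 2 * furthest


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(f"{n} CPUs: {worst_case_probe_hops(n)} hops")
```

Under those assumptions a 4-way ring gives the four hops mentioned above, and a single chip gives zero.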
In other words, I see nothing but a "logical" argument based on wacky assumptions. Nothing new under the sun.
Tenchu