jhalada

11/19/04 3:19 PM

#47747 RE: wbmw #47745

wbmw,

The difference between today's DP Xeon and IPF SPECfp scores is about 80%. I'm also using the 3M cache version, so that the idea of fitting SPEC within cache is less of an issue.

It is 56% on a single-CPU system with 3MB of L3 on Itanium versus 1MB of L2 on Xeon, a gap that would be well within reach with a second FP unit and equal cache.

You can clearly see three floating point units: FPMUL, FPALU, and FPSTORE. Granted, these have specific purposes, but they are designed for the common case, which needs to load and store data as math is performed.

Itanium has 2x of those. Each floating point unit is capable of doing simultaneous FMUL and FADD.

I can't imagine getting much more performance by adding a fourth unit.

The addition would be a second set of these. Or, if the whole FP subsystem is redesigned, it may take a somewhat different shape. The bottom line is that Itanium has the resources to do 4 FP operations simultaneously, while Opteron can do only 2.
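
To put rough numbers on the 4-versus-2 point, here is a quick back-of-the-envelope sketch. The FP operations per cycle are the figures above; the clock speeds (1500 MHz and 2200 MHz) and the helper name `peak_gflops` are assumptions picked purely for illustration, and theoretical peak says nothing about what SPECfp actually sustains.

```python
# Back-of-the-envelope peak FP throughput.
# ops-per-cycle figures come from the post; clock speeds are ASSUMED.

def peak_gflops(clock_mhz: float, fp_ops_per_cycle: int) -> float:
    """Theoretical peak in GFLOPS = clock (MHz) * FP ops per cycle / 1000."""
    return clock_mhz * fp_ops_per_cycle / 1000.0

# Itanium: 4 FP ops per cycle, assumed 1500 MHz clock.
itanium = peak_gflops(1500, 4)
# Opteron: 2 FP ops per cycle, assumed 2200 MHz clock.
opteron = peak_gflops(2200, 2)

# How much of the per-cycle advantage survives the clock deficit.
ratio = itanium / opteron

print(f"Itanium peak: {itanium:.1f} GFLOPS")
print(f"Opteron peak: {opteron:.1f} GFLOPS")
print(f"Itanium/Opteron peak ratio: {ratio:.2f}x")
```

With different clocks the ratio moves accordingly; the point is only that doubling FP operations per cycle can outweigh a sizeable clock deficit on paper.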

Joe



dacaw

11/19/04 3:44 PM

#47749 RE: wbmw #47745

SPEC, cache & memory

From the same site as the Opteron FP unit discussion, another article about the brouhaha over SPEC scores:

Quote: "The SPEC 2000 benchmarks are subject to much debate in the scientific community. Are they broken? Do they just depend on memory bandwidth? Do they fit entirely in the cache?"

http://www.chip-architect.com/news/2003_08_29_Cache_efficiency_for_SPEC2000.html

Note the comment in the final para:

"The memory footprint of the SPEC2000 benchmarks is less than 200 MByte to be able to run on systems with 256 MByte DRAM. Heavier applications using multiple Gigabyte structures are likely to see much greater degradations. AMD's distributed memory solution based on HyperTransport links is likely to pay off in these cases. A four processor 2200 MHz Opteron may reach a similar SPEC2000_rate performance as a four way 1500 MHz Itanium 2 even though the latter has a much higher single processor score. Again, larger floating point memory footprints may skew the results even further."