News Focus

chipdesigner

05/10/06 12:43 PM

#4819 RE: mmoy #4818

Yep, that and the "memory disambiguation" (allowing better load reordering), particularly for int performance.

Determining whether a load and a store share the same address is called memory disambiguation. Allowing loads to move ahead of stores gives a big performance boost. In some snippets of benchmarking code, Intel saw up to a 40% performance boost, solely the result of the more flexible way loads get reordered. It is pretty clear that we won't see this in most real applications, but it is nevertheless impressive and it should show tangible (10-20%) performance boosts together with the fast L2 and L1 cache.

http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2748&p=5
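To make the quoted definition concrete, here is a toy sketch of the address check. It is not how Conroe implements it: real hardware has to make this call before the store addresses are even known, typically by predicting and recovering when wrong. Here every address is assumed known up front, and the name `hoistable_loads` and the `(kind, address)` tuple format are made up for illustration.

```python
from typing import List, Tuple

# Hypothetical op format: (kind, address), where kind is "store" or "load".
Op = Tuple[str, int]

def hoistable_loads(ops: List[Op]) -> List[int]:
    """Return the indexes of loads that touch no address written by an earlier store.

    Such loads could be moved ahead of every preceding store in this window
    without changing the value they read.
    """
    earlier_store_addrs = set()
    hoistable = []
    for i, (kind, addr) in enumerate(ops):
        if kind == "store":
            earlier_store_addrs.add(addr)
        elif kind == "load" and addr not in earlier_store_addrs:
            # No earlier store shares this address, so reordering is safe.
            hoistable.append(i)
    return hoistable

ops = [
    ("store", 0x100),
    ("load", 0x200),   # independent of the store above
    ("store", 0x300),
    ("load", 0x100),   # same address as the first store, must wait
    ("load", 0x300),   # same address as the second store, must wait
    ("load", 0x400),   # independent of both stores
]
print(hoistable_loads(ops))  # [1, 5]
```

Loads 1 and 5 are the ones a reordering core could pull forward; the other two have to wait for their matching stores.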

mas

05/10/06 12:54 PM

#4823 RE: mmoy #4818

Indeed. FWIW, the uarch improvements to K8 were designed to make the K8 more robust in certain workload mixes, cure pathological situations and allow it to clock to 3+ GHz. OTOH the improvements done to Conroe encompassed these kinds of fixes but also made the core much fatter and beefier at the same time. In its case I would say the uarch improvements are worth 2/3 of the gain, with the cache/prefetching accounting for the other 1/3. You really have to take each case on its merits.