dacaw

11/20/04 11:06 AM

#47783 RE: dacaw #47782

. . . and a corollary

Thinking about it: given that non-OoO FP processors are highly sensitive to micro-optimizations, it would seem reasonable to me that compiler writers would put lots of effort into the codegen for oft-used SPEC sequences.

But of course Intel would never tweak benchmarks to favor their own processors, would they????

chipguy

11/20/04 12:05 PM

#47786 RE: dacaw #47782

In contrast Athlon optimizes the fp code at run time by moving the ops around as resources come available. Thus micro-optimization by the coder is pointless - you really don't see any difference by tweaking the odd line here or there - or sprinkling nops around.

Wow, so much misinformation in just two sentences. First
of all, OoO execution does not relieve the compiler of
the need to perform sophisticated code generation and
scheduling to get good performance on superscalar
processors. Compare the performance of gcc vs vendor
compilers for EV6, PPC970, and P4. Gcc sucks for Alpha
nearly as much as it does for IPF. Second of all, many
OoO processors use NOPs to improve code scheduling
and nearly all superscalar processors, OoO or otherwise,
benefit from using NOPs to align the code at the top of
loops with the cache line. Finally, OoO is of very little
benefit for most FP intensive code. It tends to have very
predictable control flow and memory access patterns.
OoO execution is also of little benefit to many important
commercial workloads like OLTP because the memory
component of CPI dominates. OoO is primarily of benefit
to SPECint and general purpose PC type applications.
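
To put rough numbers on the "memory component of CPI dominates" point, here is a minimal sketch. The figures are hypothetical and chosen only for illustration, and it makes the simplifying assumption that OoO halves the core component of CPI while leaving memory stalls untouched (real OoO cores can overlap some miss latency, so treat this as a back-of-envelope model, not a measurement).

```python
def total_cpi(core_cpi, misses_per_instr, miss_penalty_cycles):
    """CPI = core (compute) component + memory stall component."""
    return core_cpi + misses_per_instr * miss_penalty_cycles

# Hypothetical workload parameters, not taken from any real benchmark.
MISSES_PER_INSTR = 0.02   # cache misses that go to memory, per instruction
MISS_PENALTY = 200        # stall cycles per such miss

# Assumption: OoO halves the core CPI but does not hide memory stalls.
in_order_cpi = total_cpi(1.0, MISSES_PER_INSTR, MISS_PENALTY)
ooo_cpi = total_cpi(0.5, MISSES_PER_INSTR, MISS_PENALTY)
speedup = in_order_cpi / ooo_cpi

print(f"in-order CPI {in_order_cpi:.2f}, OoO CPI {ooo_cpi:.2f}, "
      f"speedup {speedup:.2f}x")
```

Under these assumed numbers the memory term is four of the five cycles per instruction, so doubling core throughput buys only about an 11% speedup overall.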