
upc

Followers 0
Posts 316
Boards Moderated 0
Alias Born 05/21/2004


Re: subzero post# 41713

Monday, 08/09/2004 3:55:41 AM


Post# of 97586
Here's what TSCP is trying to measure:

gcc is considered a branch-intensive program. As you can see from this graph, TSCP has even more branches and they're harder to predict, so it's a good test of a processor's BPU and its ability to recover from mispredicted branches. TSCP also has relatively high ILP, so it tests the processor's instruction scheduler. It fits comfortably in L1 cache, so it doesn't test a computer's L2 cache or main memory performance. Basically, TSCP measures a processor core's worst-case integer performance. It may be a good predictor for compilers, other AI programs, and other branch-intensive code.

I would think Prescott, with its deep pipeline, would not perform well when it encounters lots of branch mispredictions.

ubench:

Please make sure you compile ubench using only -O2 or -O optimization flags. More aggressive optimizations tend to alter the semantics of the code and skew the results.

This also appears to be a very old program (the benchmark table does not contain recent hardware), and there was this description and caveat about a previous version:

Other factors affecting ubench results include quality of the C-compiler, C-library, kernel version, OS scalability, amount of RAM, presence of other applications running at the same time, etc.
Ubench is executing rather senseless mathematical integer and floating-point calculations for 3 mins concurrently using several processes, and the result is Ubench CPU benchmark. The ratio of floating-point calculations to integer is about 1:3.
Ubench will spawn about 2 concurrent processes for each CPU available on the system. This ensures all available raw CPU horsepower is used.
Ubench is executing rather senseless memory allocation and memory to memory copying operations for another 3 mins concurrently using several processes, and the result is Ubench MEM benchmark.
The following are the samples of ubench output for some systems. Attention: The MEM benchmarks for all Linux systems had to be adjusted by a factor of 8 due to the bug in ubench version < 0.32. The MEM benchmark for all AIX system had to be adjusted by a factor of 4. The bug has been corrected in version 0.32. The benchmarks submitted before 07/31/2000 have been recalculated.
As with any benchmark, the ubench numbers by themselves have no meaning and can be used only when compared to ubench marks from other systems.


And this is what Anand did: "We compiled the program using ./configure and make with no optimizations."

Now, looking at the configuration file, I wonder if they built this correctly for either processor?

I wish they provided makefile output for both like they did with some other benches they built.


John the ripper:

This one is crazy. They couldn't build it the first time, and the generic build appears to try out different code versions and attempts to self-optimize! There would seem to be a lot that could go wrong here.

It's going to take a while to figure out what's going on here. Look at the output makefiles. In particular, the "bitslice", "intermediate values", and "blowfish" tests are the subtests that radically favor Nocona.

Hard to know, without a response from AMD, if these are accurate results, or a compiler optimization issue.

