News Focus

wbmw

09/11/13 1:59 PM

#122496 RE: mas #122486

GPU Performance Study

I wanted to list out the subtests of GLBench, and how Bay Trail compares with a couple of others. The interesting take-away is that subtest performance is often dramatically different from overall TRex/Egypt offscreen performance, which is not intuitive.

Note that Shield and the MSM8974 (Snapdragon 800) MDP are both overclocked platforms designed to do well in the benchmarks; real-world tablet designs built on the same chips do not tend to hit these scores. I'm including them anyway for the sake of comparing the subtests to the overall score.

All are offscreen versions of the tests, and all scores are relative to Bay Trail (Bay Trail = 1.0).

                        BTrail   NV-Shld   iPad4   MSM8974
GLBench 2.7 Fill         1.0      1.11     1.77     0.99
GLBench 2.7 Triangle     1.0      1.21     1.82     1.16
GLBench 2.7 Fragment     1.0      0.88     1.86     1.26
GLBench 2.7 Vertex       1.0      1.23     1.67     1.13
Subtest Geomean          1.0      1.10     1.78     1.13

GLBench 2.7 TRex         1.0      1.50     1.00     1.63
GLBench 2.5 Egypt        1.0      1.54     1.22     1.66
Test Geomean             1.0      1.52     1.10     1.64
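
For anyone who wants to check the geomean rows, here is a minimal Python sketch that recomputes them from the relative scores in the table. The variable and function names (PLATFORMS, SUBTESTS, TESTS, geomean, column_geomeans) are my own for illustration and are not part of GLBench; the numbers are just the ones posted above.

```python
from math import prod

# Relative offscreen scores from the table above (Bay Trail = 1.0).
# Column order: BTrail, NV-Shld, iPad4, MSM8974.
PLATFORMS = ["BTrail", "NV-Shld", "iPad4", "MSM8974"]

SUBTESTS = {
    "Fill":     [1.0, 1.11, 1.77, 0.99],
    "Triangle": [1.0, 1.21, 1.82, 1.16],
    "Fragment": [1.0, 0.88, 1.86, 1.26],
    "Vertex":   [1.0, 1.23, 1.67, 1.13],
}

TESTS = {
    "TRex":  [1.0, 1.50, 1.00, 1.63],
    "Egypt": [1.0, 1.54, 1.22, 1.66],
}


def geomean(values):
    """Geometric mean: the n-th root of the product of n values."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


def column_geomeans(rows):
    """Geomean per platform (per column), rounded to two decimals."""
    columns = zip(*rows.values())
    return [round(geomean(col), 2) for col in columns]


subtest_gm = column_geomeans(SUBTESTS)
test_gm = column_geomeans(TESTS)

# The last column shows how far the full tests diverge from the subtests.
for name, s, t in zip(PLATFORMS, subtest_gm, test_gm):
    print(f"{name:8s} subtests {s:.2f}  tests {t:.2f}  tests/subtests {t / s:.2f}")
```

The tests/subtests ratio is only a rough way to put a number on the gap between low-level and full-scene results; it is not a metric that GLBench itself reports.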


These are interesting results. Based on the subtests, Bay Trail should score within 10-15% of the nVidia and Qualcomm platforms (and beat the real-world tablet equivalents), and at the same time get trounced by the iPad. But in the actual tests that people pay attention to, it's the other way around.

I don't have an answer here, but there are several potential hypotheses. For one thing, the subtests rely more on the fixed-function portions of the GPU, which are required to sustain higher FPS while rendering, but do not so much expose the relative performance of the execution units. On the other hand, TRex and Egypt are more realistic 3D scenarios that need shader throughput to do well. Therefore, hypothesis #1 is that Intel skimped on shader count in Bay Trail (only 4 EUs), as opposed to Qualcomm and nVidia, who both built out their shader units in spades. And if this is true, then at least it says that Intel's underlying architecture is good, but they need to spend the die area on more EUs.

Of course, this doesn't explain why the SGX 554-MP4 in the iPad4 goes in the other direction. For that, the answer may lie in the "glue" or "fabric" around the 554 architecture, which may scale well on more fixed-function operations, but scale really poorly when it comes to shader throughput. So hypothesis #2 is that Apple may in fact have internal bandwidth bottlenecks that prevent their graphics core from delivering a TRex or Egypt score that is in line with what they build into the chip. And if that's true, then Apple/Imagination only needs to fix these issues in the A7X part to get a rather large improvement in performance. Based on claims that the A7 and iPhone 5S get 2x GPU performance, this seems in line with what I would expect. The A7 also likely implements the Imagination 6th generation core, which likely not only fixes the internal bandwidth issues, but also brings a multitude of other improvements.

In terms of Intel's architecture, in the next generation they should implement no fewer than 8 EUs, and if they really want to leap ahead and lead over their competition, then 12 or 16 is more appropriate. Combined with the 8th generation core improvements, I think a 12-16 EU Cherry Trail would be a killer graphics part. If you consider that Ivy Bridge is 16 EUs, and scale it down to 14nm with the improvements, I think Intel can have a very effective cost structure as well as a very high performance graphics engine.

chipguy

09/11/13 2:08 PM

#122498 RE: mas #122486

Let's see what Airmont brings.

If mainstream x86 progression is a guide then Intel will direct
most of the gains from 14 nm to the GPU.

These reviews suggest that would be most appropriate.