Wbmw:
Going from one SuperPI task, which runs completely in cache, to two reduces NGA's performance by 15%. I consider that strong evidence that the shared cache is adding about 15% to performance going from 2MB to 4MB, since two tasks each get only half of the cache. OTOH, a DC K8 going from one SuperPI task to two shows little reduction in performance.
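To make the arithmetic concrete, here is a quick sketch of how a per-task slowdown like that would be worked out from run times. The function name and the timings are made up for illustration; they are not measured SuperPI results.

```python
def per_task_slowdown(time_alone_s, time_shared_s):
    """Fractional drop in per-task performance, taking performance as 1 / run time."""
    if time_alone_s <= 0 or time_shared_s <= 0:
        raise ValueError("run times must be positive")
    return 1.0 - time_alone_s / time_shared_s


# Hypothetical timings, NOT measured results:
# 17.0 s running alone, 20.0 s with a second task on the other core.
slowdown = per_task_slowdown(17.0, 20.0)
print(f"per-task slowdown: {slowdown:.0%}")
```

The same calculation applied to the K8 timings would show whether "little reduction" means 1% or 5%.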
Another thing is that Intel doesn't want to release any DC scores for tests which heavily hit main memory. Perhaps the FSB can't supply both cores when the working set doesn't fit into the cache. If so, the performance is mostly cache related, and NGA becomes a dog when the going gets hard (large working sets and/or multiple running tasks). Also, no 64-bit scores have been released. Could NGA be a dog in 64-bit too?
As I have said before, these selective releases don't tell us much wrt NGA's performance in the real world. You'll be disappointed when NGA hits the public meat grinder. But as always, you'll ignore all the tests that prove you wrong.
Pete