calbiker, I am glad we are in substantial agreement! There is something I don't understand about the issue you raised:
But now, here's the rub. HT needs to be converted into PCI Express. There's a translation latency as well as the standard PCI Express latency.
1. Isn't there a similar translation latency in Intel's North Bridge to convert from PCI-Express to the processor bus format?
2. So far we have been talking about the idealized case where the only thing happening is a PCI-Express access. Now, what happens when you mix in an overlapping memory access from the processor? It seems to me that the P4 will suffer degraded latency and throughput, because both requests and both data transfers share the same processor bus (whereas Opteron has a separate memory bus).
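To make question 2 concrete, here is a toy sketch of the contention argument. It is only an idealized model, not a measurement: the function name `completion_time` and the transfer times (arbitrary units) are made up for illustration, and it assumes a shared bus fully serializes the two transfers while separate buses let them overlap completely.

```python
def completion_time(pcie_transfer, mem_transfer, shared_bus):
    """Time until both transfers finish, in arbitrary (hypothetical) units."""
    if shared_bus:
        # One bus carries both transfers, so they run back to back.
        return pcie_transfer + mem_transfer
    # Separate buses: the transfers overlap, so the longer one dominates.
    return max(pcie_transfer, mem_transfer)


# Hypothetical transfer times, chosen only to show the shape of the effect.
shared = completion_time(4, 3, shared_bus=True)      # shared processor bus
separate = completion_time(4, 3, shared_bus=False)   # separate memory bus
print(shared, separate)
```

Real hardware sits somewhere between these two extremes (arbitration, buffering, and translation latency all matter), so treat this as an illustration of the question rather than an answer to it.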
By the way, I don't know if you noticed my other post replying to the suggestion made today that a future Intel North Bridge may incorporate the PCI-Express fabric. In that case, P4 video would greatly benefit (the route to DRAM would bypass the processor), although Opteron would still have a clear edge in memory-to-processor access. In that theoretical matchup, the question of who's faster would be highly application dependent.