
HailMary

02/13/04 10:57 PM

#26211 RE: calbiker #26199

Yes, they are complementary and do not conflict. But that's not the issue here. The problem is that Opteron has an integrated memory controller. This is great for low latency memory/processor operations, but not for I/O systems. Opteron is pin restricted. It doesn't have a PCI Express bus. But it has an HT bus, which then must be converted into a PCI Express bus. This adds latency and is not efficient.

Opteron currently has 3 HT links: 2 for interprocessor communication, and 1 for communicating with the chipset. Suppose they drop the 1 HT link that is currently used to connect to the chipset and replace it with a PCI Express interconnect. Perhaps this is a possible plan for them.

Even without this, is translation going to be that big of an issue? Are we going to go back to the place where video cards start relying heavily on system memory again? The bottleneck today is not the pipe between the video card and the CPU. It once was, but with so much function moved to the graphics card, it is less important. Latency is not really an issue for transferring textures from main memory to video memory - bandwidth is. So even with a translation hub, I don't think AMD is going to be hurt by having an on-die memory controller and a PCI Express translation hub with the next generation of video cards. I'm not sure why you think they wouldn't work at all.
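To put rough numbers on the latency vs. bandwidth point, here is a back-of-the-envelope sketch. Every figure in it (texture size, link bandwidth, both latencies) is an assumption picked for illustration, not a measurement of any real Opteron or PCI Express system. It just models one bulk transfer as a fixed latency plus size divided by bandwidth:

```python
def transfer_time_s(size_bytes, bandwidth_bytes_per_s, latency_s):
    """Time for one bulk transfer: fixed startup latency plus streaming time."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


# All values below are hypothetical, chosen only to show the orders of magnitude.
TEXTURE_BYTES = 4 * 1024 * 1024   # one 4 MiB texture (assumed)
BANDWIDTH = 2e9                   # 2 GB/s link (assumed)
DIRECT_LATENCY = 200e-9           # 200 ns with no translation hub (assumed)
BRIDGED_LATENCY = 400e-9          # 400 ns with an HT-to-PCI Express translation step (assumed)

direct = transfer_time_s(TEXTURE_BYTES, BANDWIDTH, DIRECT_LATENCY)
bridged = transfer_time_s(TEXTURE_BYTES, BANDWIDTH, BRIDGED_LATENCY)

# Relative slowdown caused by the extra translation latency alone.
penalty = (bridged - direct) / direct

print(f"direct:  {direct * 1e3:.4f} ms")
print(f"bridged: {bridged * 1e3:.4f} ms")
print(f"penalty: {penalty:.4%}")
```

With these made-up numbers the streaming time is in milliseconds and the added latency is in nanoseconds, so even doubling the latency barely moves the total. For small, frequent transfers the balance would look different.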

sgolds

02/14/04 1:31 PM

#26239 RE: calbiker #26199

calbiker (updated), I'm glad we agree that the core CPU performance of Opteron (execution of instructions from memory) is unaffected by any HT to PCI-X latencies. On PCI-X & HT latency, you wrote:

Yes, they are complementary and do not conflict. But that's not the issue here. The problem is that Opteron has an integrated memory controller. This is great for low latency memory/processor operations, but not for I/O systems. Opteron is pin restricted. It doesn't have a PCI Express bus. But it has an HT bus, which then must be converted into a PCI Express bus. This adds latency and is not efficient.

There is a presentation on the AMD site, you can find it here:

http://www.amd.com/us-en/assets/content_type/DownloadableAssets/dwamd_RichBrunnerClusterWorldpresFIN...

Take note starting at p. 17. Notice that in the traditional design the PCI-X bridge has to translate to the North Bridge and then go over a second hop to the processor. On p. 18 you can see that Opteron eliminates one hop: the PCI-X bridge connects directly to the processor. It would seem to me that the AMD64 architecture would have less latency than a traditional North Bridge approach. Also consider that the North Bridge has the additional overhead of contention between memory access and PCI-X access, and the disparity gets greater.
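The hop counting above can be written down as a tiny model. This is only a sketch of the topology as I read it from those slides; the per-hop cost is a placeholder I made up, and real hops will not all cost the same:

```python
# Device-to-CPU paths as described above. Each adjacent pair is one hop.
NORTH_BRIDGE_PATH = ["PCI-X bridge", "North Bridge", "CPU"]
OPTERON_PATH = ["PCI-X bridge", "CPU"]


def hops(path):
    """Number of link traversals between the first and last node."""
    return len(path) - 1


def path_latency_ns(path, per_hop_ns):
    """Total latency if every hop costs the same (a simplifying assumption)."""
    return hops(path) * per_hop_ns


PER_HOP_NS = 100  # hypothetical per-hop cost, not a measured figure

for name, path in [("North Bridge", NORTH_BRIDGE_PATH), ("Opteron", OPTERON_PATH)]:
    print(f"{name}: {hops(path)} hop(s), {path_latency_ns(path, PER_HOP_NS)} ns")
```

Whatever the real per-hop cost is, the direct path has one fewer hop, and this model leaves out the North Bridge contention between memory and I/O traffic, which would only widen the gap.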

Now, PCI-X is not the same as PCI Express. However, the architectural picture here still applies. Simply substitute the PCI Express bridge for the PCI-X bridge and the processor side of the picture should be the same. PCI Express is a fabric connection to the peripheral devices on the other side. Video controllers will hang directly off PCI Express as one connection; legacy PCI-X, PCI, AGP, USB, etc. will connect in as additional channels. As the processor-side connection looks the same as with PCI-X, the additional North Bridge latencies apply here also.

If I can find a similar picture actually showing PCI Express tying into HT, I will post.

By the way, I had a look at the very nice presentation from Xilinx on PCI Express, but I do not see where they distinguish between a North Bridge vs. HT approach. If there is a particular slide I should look at, please advise.

UPDATE:

I see there is additional information in that AMD presentation about PCI Express. P. 21 and 22 show an HT tunnel that is used to go to PCI Express, directly translating between the two technologies. This shows the architecture I projected above: without a North Bridge, only one hop is needed rather than two.