Advanced Micro Devices Inc (AMD): wbmw, I am calling Intel's MP strateg...

jhalada

Re: wbmw post# 72227

Sunday, 05/21/2006 12:53:26 AM

wbmw,

I am calling Intel's MP strategy weak, which is to take a mobile/client optimized core and try to scale it up to servers. This is a great strategy for reusing an architecture in the maximum possible sense, but it does not adequately address the AMD competitive issue where they create an MP scalable architecture first, and then find ways to scale it down to the client and mobile segments.

I think where you and I disagree is in what you call "core". In my opinion, core is what performs calculations, and it is not I/O. You seem to be mixing it with I/O.

The role of the core is to provide maximum performance at given power consumption levels. There is an evolution, as new steppings and new shrinks become available.

AMD did create an MP scalable architecture (it scales very well to 2S and 4S; how well it scales to 8S is open to discussion), but that was independent of the core. The core inside the chip could very well have been Northwood or Yonah, with the IMC, crossbar and HT links around it. It happened to be the K8 core.

As far as the K8 core is concerned, it did OK vs. Northwood in its time, excellent vs. the Prescott of that time, not so well against Banias in the same time frame, OK vs. Dothan, excellent vs. Smithfield / Paxville, well against the current 65 nm Netburst shrink, so-so vs. Yonah, and it looks like Rev F of K8 will not do that well vs. Merom/Conroe/Woodcrest.

But what I was trying to point out is that the core develops independently, and MP I/O/connectivity develops independently. The cores are not optimized for MP. They are optimized for computational performance and power consumption.

At the same time, Intel is very disadvantaged in the MP segment, because their cores do not have the features to scale upward.

Again, it is not the cores, it is the I/O, namely the shared bus, that does not scale upward. The cores are innocent. The FSB is the guilty party.
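A rough way to picture the shared bus argument, as a sketch only: on a shared bus every socket draws from one fixed pool of bandwidth, while with a memory controller per socket the aggregate grows with the socket count. The function names and the bandwidth figures below are hypothetical, picked purely for illustration; they are not measurements of any real FSB or Opteron system, and the sketch ignores cross-socket traffic over the HT links.

```python
# Hypothetical illustration: all bandwidth figures are assumptions, not measured values.

def shared_bus_per_socket(bus_gbps: float, sockets: int) -> float:
    """Bandwidth left for each socket when all sockets share one bus."""
    return bus_gbps / sockets


def imc_aggregate(per_socket_gbps: float, sockets: int) -> float:
    """Aggregate local memory bandwidth when every socket has its own controller."""
    return per_socket_gbps * sockets


if __name__ == "__main__":
    # Assumed figures: an 8 GB/s shared bus vs. 4 GB/s of local bandwidth per socket.
    for n in (1, 2, 4):
        print(f"{n}S: shared bus per socket = {shared_bus_per_socket(8.0, n)} GB/s, "
              f"IMC aggregate = {imc_aggregate(4.0, n)} GB/s")
```

Under these assumed numbers the per-socket share of the bus shrinks as sockets are added, while the aggregate with per-socket controllers grows, regardless of which core sits behind the I/O.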

Going forward, we have a short term period where both companies will be exposed in their weak points, IMO. AMD will be exposed in mobile, and to some extent, the client space as well. Intel will be exposed in MP servers, and to a lesser extent the DP space. The DP space will be interesting, since I believe Intel has most of the core architecture necessary to get very good performance here, and AMD will be hard pressed to beat them.

Intel will do well in the DP space. Woodcrest looks great based on the latest rumors. I think Opteron will perform relatively better in 64-bit apps than in 32-bit apps, because I think Woodcrest's memory disambiguation, which improves IPC greatly, will help a little less in 64-bit code, since there are fewer loads and stores in 64-bit code (thanks to the 16 registers).

In the longer term, I believe Intel will need to adequately address the server segment with CSI and their version of the integrated memory controller (most likely FBD based). This could happen as soon as Nehalem, but we will see how strong of a server offering Intel will have in this time period. Likewise, it sounds like AMD is busy developing a mobile optimized core, which will increase their strength in this segment. So it will be interesting in the longer term how the competitive landscape shifts. But I believe in the short term that each company will have definite strengths and weaknesses.

Performance-wise that's probably generous of you to say, since it looks like Merom/Conroe/Woodcrest cores will have an overall lead over K8 core, including Rev F. We will need to wait a month or 2 to see where these competitors stand. AMD will have to wait for K8L to regain the lead. It appears to equal most of Conroe's strengths, and retains IMC, which should push it over the top, assuming respectable clock speed scaling and power consumption management. But there is a lot of potential for hiccups in clock speed, power consumption and time to market, so AMD regaining the lead is not guaranteed. At the same time, Intel may still have some low hanging fruit on Conroe, so it will be a moving target.

I am quite certain IBM, Dell, Fujitsu, and HP will all support Tulsa with at least 1 sku. Intel is ponying up most of the work, so it would be stupid of them not to at least offer the choice. Charlie may be right that Tulsa is uninteresting to some of them, but that does not mean they will ignore Intel's still staggering amount of share and influence in this market. It might be, for example, that they won't implement Tulsa in their 4-way blade systems, as they have done with Xeon MP processors in the past, but I do believe it will be in the volume offerings for sure.

I am not sure how staggering Intel's share will be in this particular segment. By the time Tulsa ships, I think Intel's share will be less than 50%, maybe as low as 30% (strictly speaking of 4-way).

There are many micro-segments of the market, some of which demand RAS as their #1 concern. Others will demand performance as their #1 concern, while still others will demand power or price. I don't think there is any single "rule of thumb" in terms of the ordering. You have mission critical servers, large scale business and financial servers, broad based webservers a la Google, small application and print servers, etc. If you ask Google, they will tell you that they'll take whatever can offer the maximum compute density for the floor space and power delivery they can provide. If you ask Wall Street, they will demand whatever has the maximum uptime for a given year. I could go on, but you get the point.

But you do now realize that it is far less consequential than you believed, and led others to believe, especially when it comes to exotic things such as hot plug of CPUs etc. And there are software approaches being developed in parallel to improve uptime.

I believe AMD has already announced that they will have both a server optimized and mobile optimized core at some point. I believe there to be both pros and cons to this approach, just as there are pros and cons to having a single shared core. On one hand, a single core micro-architecture can enable the best reuse, the best technology focus, and the best time to market. On the other hand, having diverse cores means you get to optimize for the different strengths of each segment without having to live with the tradeoffs of a jack-of-all-trades. Go ahead and give the server chip a large die size, or power hungry features. Go ahead and give the mobile chip more P-states and a very aggressive voltage level. Of course, then you'll have to deal with repeating a lot of work. You'll have a technology that integrates very well into the server core, but takes extra design work to shoe-horn into the mobile core. You'll want to be consistent in your feature set, so you'll have to live with some features being discordant with your micro-architecture. Maybe your mobile core should be in-order to get <5W power envelopes, but that will really hurt your single threaded performance. Lots of things to consider, when you have two separate designs.

I think that the thinking in the server market has changed greatly. Power consumption is now as much of a concern as in mobile. Most servers spend most of the time being idle.
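To put the idle point in numbers, here is a small sketch of time-weighted average power. The function name and the wattages are hypothetical, chosen only to show the arithmetic, not taken from any real server.

```python
# Hypothetical illustration: the wattages below are assumptions, not measured values.

def average_power(idle_w: float, load_w: float, idle_fraction: float) -> float:
    """Time-weighted average power for a machine idle `idle_fraction` of the time."""
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be between 0 and 1")
    return idle_fraction * idle_w + (1.0 - idle_fraction) * load_w


if __name__ == "__main__":
    # Assumed server: 100 W idle, 300 W under load, idle 75% of the time.
    print(average_power(100.0, 300.0, 0.75))
```

With a machine that is idle most of the time, the idle term carries as much weight as the load term or more, which is why idle power and fast power state transitions matter in servers as well as in mobile.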

I did hear about AMD developing a mobile optimized core as well as a server optimized core. But that may be 2008 and beyond. Who knows if they will make it to market. The near term plan is K8, K8L across the board. Something could possibly be chopped off for a mobile version of K8L, but I would not bet on it. BTW, Intel seems to have the same approach near term. NGA across the board, except Itanium, which may or may not be put out of its misery.

I do not believe K8 makes a good mobile chip, for one thing.

K8 was better than Netburst, worse than Banias, about the same as Dothan, worse than Yonah. That does not mean it is completely incompetent. It just means it fell a little short of Intel, in the segment where Intel was most competitive.

I think K8L will be much better. The separately scalable voltage planes will help, but I think Intel will have the ultimate power to scale their voltages, right from the beginning.

I think the issue here is that the memory controller may have had trouble co-existing with rapidly changing voltages. The dual voltage planes will separate the two. Intel did not face this issue, since its memory controller is on the chipset. I think someone alluded to a possibility of higher clock speed scaling on the other end, with the voltages separated.

Intel also has the best technology for P-states and low latency power management. I remember seeing one review showing that it takes AMD 100x longer to switch between sleep states than it does Intel. That's a hard thing to overcome.

Do you have a link? Is it something applicable to NGA or Yonah? I am curious because I thought the opposite was the case, up to Dothan. I have not seen anyone explore the issue. I just went by some benchmarks where Turion did better than Dothan on power consumption (which was surprising). Those benchmarks seemed to me, at the time, to be the kind where a lot of rapid switching could have put Turion over the top.

Joe