
wbmw

02/25/04 6:45 PM

#27320 RE: HailMary #27313

HM, Re: I can see that, but x86 will also be moving to larger caches and multicores using this same rule, and probably be adding more execution units to each core, a new way of doing floating point, etc.

By all accounts, Tukwila will be a brand new IPF micro-architecture with the innovations of the ADTers built inside. I don't see much that can be added to x86 that can stand up to that.

Re: I'm wondering when an IPF part would perform better than a similarly sized x86 part in all aspects. Is there a crossover point where more cache for an x86 part doesn't make much difference, but it still does for an IPF part?

I think so. Compare a Madison and a Gallatin, each with 1MB of L3 cache, and the Gallatin seems to outperform. However, I think comparing both CPUs with 6MB of cache would show Madison to be the victor. We are already seeing diminishing returns for x86 CPUs with large caches. Outside of SPEC, the Pentium 4 Extreme Edition has few applications where it can boast more than a few percent gain over a Northwood at the same clock speed. The scaling characteristics of x86 and IPF are simply different, but most people assume they are the same, and that assumption doesn't hold.
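As a rough illustration of why those returns diminish, here is a minimal sketch using the old square-root rule of thumb (miss rate scales roughly with 1/sqrt(cache size)); the baseline miss rate is an assumed figure, not a measurement, and this ignores the x86-vs-IPF differences that are the real point above:

```python
# Illustrative sketch: diminishing returns from growing a cache,
# using the rough rule of thumb miss_rate ~ 1/sqrt(cache_size).
# The 10% baseline miss rate is an assumed figure for illustration.

def relative_miss_rate(cache_mb, base_mb=1.0, base_miss=0.10):
    """Estimated miss rate relative to a 1 MB baseline cache."""
    return base_miss * (base_mb / cache_mb) ** 0.5

for size_mb in [1, 2, 4, 6, 24]:
    print(f"{size_mb:>2} MB cache -> ~{relative_miss_rate(size_mb):.3f} miss rate")
```

Going from 1MB to 2MB buys a bigger absolute drop in miss rate than going from 4MB to 6MB, which is the diminishing-returns effect in a nutshell.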

Re: You could design a multicore x86 with a shared cache and still be under the Madison (6MB) die size, and it would likely outperform it in every benchmark, and in some cases kill it.

Doubtful. Gallatin has 2MB of cache on a die close to 240mm^2. A dual-core implementation with 4MB of shared cache would be around 480mm^2, and you wouldn't want to put less than double the cache on a dual-core CPU. Madison is ~370mm^2. At 90nm, when dual core becomes more feasible, Intel is already designing a dual-core IPF chip with 24MB of cache - Montecito. It will probably be more than 500mm^2 unless they somehow find a way to shrink the core or the cache quite a bit from where they are now. I'm sure you could do many things to an x86 core with that kind of area budget, but the argument is somewhat moot, since the best way to use die area right now is to add more cache, and 24MB of cache is not going to help an x86 processor perform better.
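To make that die-size arithmetic explicit, here is a minimal sketch using only the figures above; the assumption that a dual-core part is simply twice the single-core die (ignoring shared-cache savings and layout overhead) is mine:

```python
# Back-of-the-envelope die-area comparison from the figures above.
# Assumes doubling both core and cache simply doubles the area.

gallatin_die_mm2 = 240              # single core + 2 MB cache
madison_die_mm2 = 370               # Madison 6 MB part, approximate

dual_gallatin_mm2 = 2 * gallatin_die_mm2   # 2 cores + 4 MB shared cache
print(f"Dual-core Gallatin estimate: {dual_gallatin_mm2} mm^2")
print(f"Madison (6 MB):              {madison_die_mm2} mm^2")
print(f"Ratio:                       {dual_gallatin_mm2 / madison_die_mm2:.2f}x")
```

Even with this crude doubling, the hypothetical dual-core x86 comes out roughly 30% larger than Madison.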

Re: I think the metric for mainstream parts is performance per unit of die size. IPF has to at least be somewhat competitive in this metric for broad adoption, don't you think?

As long as it is going into a market with four-figure ASPs, the die size is irrelevant. By the 2007/8 time frame, 45nm features will allow far more to be done with the design, and I think the added area required for IPF implementations will be small relative to the allowable budget, which would open the door to more mainstream uses.

chipguy

02/25/04 7:24 PM

#27331 RE: HailMary #27313

Re: I think the metric for mainstream parts is performance per unit of die size. IPF has to at least be somewhat competitive in this metric for broad adoption, don't you think?

First of all, the cost that matters to the real customer is the system cost, not the cost of any one component. In the server market that IPF currently resides in, IPF processors go up against CPUs like POWER4+, USIII, and US-IV, which require big off-CPU caches (128 MB per MCM for POWER4+, and 8 or 16 MB per chip for USIII and US-IV respectively). With the exception of the IBM x455, all IPF systems ship with no cache except what is on the I2's die. That is a radical departure from the last 15 years of server design and contributes to system-level cost savings.

Looking at the microprocessor die itself, you have to remember that IC cost goes up with size for two different reasons. The first is that you get fewer chips per wafer, and it costs the same to produce a wafer with lots of little parts as a wafer with a few big parts. The second, and often more important, reason is that there is a greater chance of a killer point defect in a large chip than in a small chip. But some chip elements, like cache, can be designed with redundant elements that can be substituted for areas rendered useless by a point defect. With proper design, a 400 mm2 processor that's 75% cache by area need not have worse defect-limited yield than a 150 mm2 chip that is 33% cache by area. Let's call the portion of a chip that is not redundancy-protected the "critical area", since defects in that area are almost always fatal, while defects in redundancy-protected regions are almost always fixable.
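A standard way to put numbers on this is a Poisson defect-yield model applied only to the critical area, Y = exp(-D0 x A_critical); the defect density below is an assumed value for illustration, not a fab figure:

```python
import math

# Poisson defect-limited yield applied only to the critical
# (non-redundancy-protected) area: Y = exp(-D0 * A_critical).
D0 = 0.005  # defects per mm^2 (assumed for illustration)

def yield_estimate(die_mm2, cache_fraction):
    """Yield when only the non-cache area is defect-sensitive."""
    critical_mm2 = die_mm2 * (1.0 - cache_fraction)
    return math.exp(-D0 * critical_mm2)

# The two examples from the text:
print(f"400 mm^2 die, 75% cache: {yield_estimate(400, 0.75):.1%}")
print(f"150 mm^2 die, 33% cache: {yield_estimate(150, 0.33):.1%}")
```

Both dies end up with about 100 mm^2 of critical area, so their defect-limited yields come out essentially equal, which is exactly the point.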

The 374 mm2 Madison I2 has a somewhat smaller critical area than a 131 mm2 Northwood P4. Although a Madison will still cost about 3x more to make than a P4 (say $120 vs $40) simply from wafer area usage, that is not the 5x or 10x more it would cost if it were simply a scaled-up x86 design like the P4, which is over 75% critical area.
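As a rough sanity check on those ratios, here is a sketch combining chips-per-wafer with the yield model above; the wafer cost, defect density, and critical-area splits are all assumptions, so only the ratios are meaningful, not the dollar figures:

```python
import math

WAFER_COST = 4000               # dollars per 300 mm wafer (assumed)
WAFER_AREA = math.pi * 150**2   # ~70,686 mm^2, ignoring edge loss
D0 = 0.005                      # defects per mm^2 (assumed)

def cost_per_good_die(die_mm2, critical_mm2):
    dies = WAFER_AREA / die_mm2                 # crude gross die count
    good = dies * math.exp(-D0 * critical_mm2)  # Poisson yield
    return WAFER_COST / good

madison = cost_per_good_die(374, critical_mm2=100)    # mostly cache
northwood = cost_per_good_die(131, critical_mm2=105)  # >75% critical
scaled_x86 = cost_per_good_die(374, critical_mm2=374 * 0.75)

print(f"Madison vs Northwood:        ~{madison / northwood:.1f}x")
print(f"75%-critical 374 mm^2 die:   ~{scaled_x86 / northwood:.1f}x")
```

With nearly equal critical areas the yield terms cancel and the cost ratio tracks raw die area (~2.8x here); make the big die 75% critical instead, and the ratio balloons to roughly 7x, into the 5x-10x range described above.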