That process shrinks will eventually allow larger caches to take up less die area.
I can see that, but x86 will also be moving to larger caches and multicore designs under this same rule, and will probably be adding more execution units to each core, a new way of doing floating point, etc.
I'm wondering when an IPF part would perform better than a similarly sized x86 part in all aspects. Is there a crossover point where more cache for an x86 part doesn't make much difference, but it still does for an IPF part?
Today there is a large disparity in performance per unit of die size with current IPF implementations. Going from x86 to IPF, you double the die size but only gain 30% in FP performance, and possibly lose integer performance. You could design a multicore x86 with a shared cache and still be under the Madison (6MB) die size, and it would likely outperform Madison in every benchmark, in some cases by a wide margin.
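To put rough numbers on that, here is a minimal sketch of the performance-per-die-area arithmetic. It uses only the figures above (2x die size, 1.3x FP performance) and treats the x86 part as a baseline of 1.0 for both. The function name `perf_per_area` is just something I made up for illustration, and these are back-of-the-envelope ratios, not measured data.

```python
def perf_per_area(relative_perf: float, relative_area: float) -> float:
    """Performance per unit of die area, relative to a 1.0 / 1.0 baseline."""
    if relative_area <= 0:
        raise ValueError("relative_area must be positive")
    return relative_perf / relative_area


# Baseline: the x86 part is 1.0x performance on 1.0x die area.
x86_fp = perf_per_area(1.0, 1.0)

# Figures from the post: IPF at 2x the die size with a 30% FP gain.
ipf_fp = perf_per_area(1.3, 2.0)

# FP performance per unit area of IPF relative to x86.
ratio = ipf_fp / x86_fp
print(f"IPF FP perf per unit die area vs x86: {ratio:.2f}")

# FP gain IPF would need at 2x area just to break even on this metric.
break_even_perf = 2.0 * x86_fp
print(f"FP perf needed at 2x area to break even: {break_even_perf:.1f}x")
```

By these numbers IPF delivers about 0.65 of the x86 part's FP performance per unit of die area, and it would need roughly 2x the FP performance at that die size just to match it.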
I think the metric for mainstream parts is performance per unit of die size. IPF has to at least be somewhat competitive in this metric for broad adoption, don't you think?