Beamer,
> AMD claims that their L3 uses this policy. From what I have seen, these are easier to design, since the L2s simply write back to the L3 instead of memory, but they really don't improve locality all that much.
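A victim cache is MORE complex than an inclusive cache, not less. Your write-back example is faulty: in an inclusive cache, the L3 already holds a copy of every line in the L2, so the L2 doesn't have to write back anything unless the line has been modified. With a victim cache, every line evicted from the L2 has to be moved into the L3, modified or not.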
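To make the write-back point concrete, here is a rough sketch in Python of what the L2 has to send down on an eviction under each policy. The function name and the policy labels are made up for illustration, and the model ignores coherence traffic and any hybrid policies a real chip might use.

```python
def l2_eviction_needs_transfer(policy, dirty):
    """Return True if a line evicted from the L2 must be sent to the L3.

    Simplified model; names are illustrative, not taken from any real design.
    """
    if policy == "victim":
        # The L3 holds no copy of the line, so every evicted line,
        # clean or dirty, has to be moved down into it.
        return True
    if policy == "inclusive":
        # The L3 already holds a copy; only modified data has to go down.
        return dirty
    raise ValueError(f"unknown policy: {policy}")
```

Under this model the victim cache generates traffic on every L2 eviction, which is part of why it is the more complex design.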
A victim cache makes the most sense when the cache is small relative to the level above it, because an inclusive cache of that size would spend most or all of its space on duplicates. That was the case with Duron, because the L1 cache was larger (128K) than the L2 cache (64K). That is also going to be the case with Barcelona, since the total size of all the L2 caches combined equals the size of the L3 cache.
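The size argument can be put into numbers with a small Python sketch. The helper below is hypothetical and deliberately crude: it only counts how much distinct data two adjacent levels can hold together. The Duron sizes are the ones quoted above; the Barcelona figure of 2048K for both the combined L2s and the L3 is my assumption (4 x 512K L2 and a 2 MB L3).

```python
def unique_capacity_kb(upper_kb, lower_kb, policy):
    """Distinct data (in KB) that two adjacent cache levels can hold together."""
    if policy == "inclusive":
        # Everything in the upper level must also be in the lower level,
        # so the lower level's size caps the total.
        return lower_kb
    if policy == "exclusive":
        # A victim (exclusive) cache holds no duplicates, so the sizes add up.
        return upper_kb + lower_kb
    raise ValueError(f"unknown policy: {policy}")


# Duron: 128K L1, 64K L2 (sizes from the post above).
duron_inclusive = unique_capacity_kb(128, 64, "inclusive")
duron_exclusive = unique_capacity_kb(128, 64, "exclusive")

# Barcelona: combined L2 and L3 assumed to be 2048K each.
barcelona_inclusive = unique_capacity_kb(2048, 2048, "inclusive")
barcelona_exclusive = unique_capacity_kb(2048, 2048, "exclusive")

print(duron_inclusive, duron_exclusive)          # 64 192
print(barcelona_inclusive, barcelona_exclusive)  # 2048 4096
```

In both cases an inclusive lower level would add nothing beyond what the level above it already holds, while a victim cache adds its full size.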
By the way, I just noticed that AMD is REDUCING the size of the L1 cache from 128K to 64K. That's interesting; the relatively large size of the K7/K8 L1 cache was a defining characteristic of that architecture. I guess AMD had no choice but to go back to a more conventional L1 cache size if they are to implement reasonably-sized L2 and L3 caches.
Tenchu