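The short-short version is that the higher the associativity, the more places a given block of memory can occupy in the cache, so blocks that map to the same set are less likely to evict one another. The disadvantage is that higher associativity means more tags to compare on every lookup, which results in higher latency.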
AMD's Athlon XP has a 16-way set-associative L2 cache.
The L2 caches in the P4 and Banias are 8-way. The Celeron version of the P4 is 4-way.
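
To make the associativity trade-off concrete, here is a rough sketch of a set-associative cache that only counts misses. The 256 KB size, 64-byte lines, LRU replacement and the access pattern are all assumptions picked for illustration, not a model of any of the chips above, and it says nothing about latency.

```python
from collections import OrderedDict

def count_misses(addresses, cache_bytes, line_bytes, ways):
    """Count misses in a toy LRU set-associative cache (illustrative only)."""
    num_sets = cache_bytes // (line_bytes * ways)
    # One ordered dict per set; insertion order tracks recency of use.
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        block = addr // line_bytes   # which cache-line-sized block this is
        index = block % num_sets     # which set the block maps to
        tag = block // num_sets      # identifies the block within its set
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)   # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) >= ways:
                lines.popitem(last=False)  # set is full: evict the LRU line
            lines[tag] = True
    return misses

# Assumed geometry for the example.
CACHE_BYTES = 256 * 1024
LINE_BYTES = 64

# Eight blocks spaced exactly one cache size apart, touched round-robin.
# They all land in the same set no matter how many ways there are.
trace = [i * CACHE_BYTES for i in range(8)] * 100

for ways in (1, 2, 4, 8, 16):
    print(ways, "way:", count_misses(trace, CACHE_BYTES, LINE_BYTES, ways), "misses")
```

With fewer than 8 ways, the eight conflicting blocks keep evicting each other and every access misses. With 8 or 16 ways they all fit in one set and only the first touch of each block misses. Real workloads are nowhere near this clean, so treat it as a worst-case picture.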
Two additional points:
- Intel uses a 4-way L1 (data) cache, whereas AMD's are all 2-way.
- Intel's L2 cache is "inclusive", meaning it contains all the data of the L1 cache as well. AMD's L2 is "exclusive", which (oddly enough) means it doesn't contain information found in the L1 cache.
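
One practical consequence of inclusive vs. exclusive is how much distinct data the two levels can hold together. The arithmetic below is a minimal sketch. The function name and the 64 KB / 256 KB sizes are made up for the example and are not the specs of any particular CPU.

```python
def unique_capacity_kb(l1_kb, l2_kb, policy):
    """Distinct data (in KB) that L1 and L2 can hold between them."""
    if policy == "inclusive":
        # Everything in L1 is duplicated in L2, so L1 adds nothing new.
        return l2_kb
    if policy == "exclusive":
        # No line lives in both levels, so the capacities add up.
        return l1_kb + l2_kb
    raise ValueError("policy must be 'inclusive' or 'exclusive'")

# Hypothetical sizes, chosen only to show the difference.
print(unique_capacity_kb(64, 256, "inclusive"))  # 256
print(unique_capacity_kb(64, 256, "exclusive"))  # 320
```

The bigger the L1 is relative to the L2, the more an exclusive design gains from this.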