What's the advantage of 2 independent parallel memory controllers vs. 1 more complex one?
More overall memory transactions per unit time, given requests from four separate CPUs. Because each DIMM channel is treated separately, a cache fill uses a data burst twice as long as it would if the two channels were ganged together. Like DDR, DDR2 has turnaround delays, so longer bursts give you better bandwidth utilization at a slight cost in latency. Each channel also operates independently, so two different requests can be serviced at the same time if they map to different channels.
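To put rough numbers on the burst-length argument, here is a toy model in Python. It is only a sketch: the 64-byte cache line, the 64-bit channel width, the fixed 2-clock turnaround penalty per burst, and the cache-line interleave used to pick a channel are all assumptions chosen for illustration, not figures from any particular chip or datasheet.

```python
from fractions import Fraction

# All constants below are illustrative assumptions, not datasheet values.
CACHE_LINE_BYTES = 64     # assumed cache line size
CHANNEL_WIDTH_BYTES = 8   # one 64-bit DIMM channel
OVERHEAD_CLOCKS = 2       # assumed idle (turnaround) clocks charged per burst


def burst_clocks(channels_ganged):
    """Clocks the data bus is busy moving one cache line."""
    beats = CACHE_LINE_BYTES // (CHANNEL_WIDTH_BYTES * channels_ganged)
    return beats // 2  # DDR moves two beats per clock


def utilization(channels_ganged):
    """Fraction of bus time spent moving data rather than idling."""
    busy = burst_clocks(channels_ganged)
    return Fraction(busy, busy + OVERHEAD_CLOCKS)


def channel_for(address):
    """Hypothetical mapping: alternate cache lines between two channels."""
    return (address // CACHE_LINE_BYTES) % 2


print("ganged pair:  burst", burst_clocks(2), "clocks, utilization", utilization(2))
print("independent:  burst", burst_clocks(1), "clocks, utilization", utilization(1))
print("channels for lines at 0, 64, 128:", [channel_for(a) for a in (0, 64, 128)])
```

Under these assumptions the independent channel spends 4 clocks per fill instead of 2 (the latency cost), but the bus is busy 2/3 of the time instead of 1/2, and consecutive cache lines land on different channels, so two fills can proceed at once.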
BTW, the EV7 has two separate memory controllers on chip, each using half of the Rambus channels. It is a single-CPU chip, but the dual memory controllers helped system performance scale to 128 CPUs without undue interference between local and incoming memory requests.