One thing I think needs to be considered is throwing out the data from 1999 and 2000, and possibly 1998 and 2001. Those years look like clear outliers in the data set. I'll be running a regression analysis on all of this, and I'm almost certain that those years, or parts of them, will need to get the boot.
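
For anyone who wants to try the same kind of check, here is a rough sketch of one way to flag outlier years from a regression. It is not necessarily the analysis described above. It assumes one value per year and a straight-line trend, and the series in it is a made-up placeholder, not the data from this thread. The names fit_line and flag_outlier_years and the cutoff of 2.0 are my own choices for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def flag_outlier_years(years, values, threshold=2.0):
    """Return the years whose residual exceeds `threshold` residual standard errors."""
    slope, intercept = fit_line(years, values)
    residuals = [v - (intercept + slope * y) for y, v in zip(years, values)]
    # Residual standard error with n - 2 degrees of freedom (two fitted parameters).
    scale = math.sqrt(sum(r * r for r in residuals) / (len(years) - 2))
    if scale == 0:
        return []  # perfect fit, nothing to flag
    return [y for y, r in zip(years, residuals) if abs(r) / scale > threshold]


# Placeholder series (hypothetical): a steady trend with a bump in 1999 and 2000.
years = list(range(1990, 2010))
values = [100 + 2 * (y - 1990) + (40 if y in (1999, 2000) else 0) for y in years]

print(flag_outlier_years(years, values))
```

The cutoff is a judgment call. Large outliers also inflate the residual scale and can partly hide themselves or their neighbors, so borderline years like 1998 and 2001 may be worth checking by refitting without the worst offenders and comparing the two fits.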