I would also wager that if a similar analysis were done with trials using PFS or OS as an endpoint, the conclusions would be analogous.
You'd be hard pressed to come up with enough examples where, on top of the article's requirement for "identical therapeutic regimens," the Phase II trial was also randomized and had the same endpoint as the Phase III trial.
Of the explanations for the response rate difference, the two that seemed most compelling to me were the single-arm trial problem, and changes in how "response rates" were measured based on improved technology. An interesting analogy (explained to me by a studious friend of mine) is the PET staging issue with trials in advanced NSCLC. The new technology is moving folks from Stage III to Stage IV, and mucking up comparisons with earlier trials. We're not necessarily getting better survival in newer trials -- it's just effectively a different definition of Stage III that's being used.
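To make the stage-migration point concrete, here's a rough sketch with made-up numbers. The survival times (in months) and the choice of which patients PET would upstage are purely hypothetical, not taken from any trial. The idea is only to show that reclassifying the worst-prognosis Stage III patients as Stage IV can raise the average in both groups without a single patient living longer.

```python
from statistics import mean

# Hypothetical survival times in months, invented for illustration only.
stage_iii_before = [10, 12, 24, 26, 28]
stage_iv_before = [4, 6, 8]

# Assumption: better imaging finds occult metastases in the two
# worst-prognosis "Stage III" patients, so they are now called Stage IV.
upstaged = [10, 12]

stage_iii_after = [m for m in stage_iii_before if m not in upstaged]
stage_iv_after = stage_iv_before + upstaged

# Same patients, same outcomes; only the stage labels changed.
overall_before = mean(stage_iii_before + stage_iv_before)
overall_after = mean(stage_iii_after + stage_iv_after)

print("Stage III mean:", mean(stage_iii_before), "->", mean(stage_iii_after))
print("Stage IV mean: ", mean(stage_iv_before), "->", mean(stage_iv_after))
print("Overall mean:  ", overall_before, "->", overall_after)
```

Both stage-specific averages go up while the overall average stays put, which is why a newer "Stage III" trial can look better than an older one even if nothing about the treatment improved.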
Again, I'm not denying the validity of program survival bias. I do think that there are a LOT of qualifiers in terms of the utility of this particular article as support for the theory.