Regarding program-survival bias: phase 3 trials often face a similar problem internally, between arms, and it is usually handled by adjusting the results for stratification factors. I'm not saying this would resolve every potential discrepancy between phase 2 and phase 3 trials, but some due diligence (DD) into it could be a positive move.
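
For anyone who wants to see what "adjusting for stratification factors" can look like in practice, here is a rough sketch using a Mantel-Haenszel pooled odds ratio. That is one common way to do it, not necessarily what any particular trial used. The counts are made up purely for illustration, and the function names and the good/poor prognosis split are my own assumptions.

```python
# Illustrative sketch only: all counts below are invented, not trial data.
# Each stratum is a 2x2 table given as
# (treated responders, treated non-responders,
#  control responders, control non-responders).

def crude_odds_ratio(strata):
    """Odds ratio after pooling all strata into a single 2x2 table."""
    a = sum(s[0] for s in strata)
    b = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    d = sum(s[3] for s in strata)
    return (a * d) / (b * c)

def mantel_haenszel_odds_ratio(strata):
    """Stratum-adjusted odds ratio (Mantel-Haenszel estimator)."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d          # total patients in this stratum
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

# Hypothetical example: the treated arm happens to contain more
# good-prognosis patients than the control arm.
hypothetical_strata = [
    (80, 20, 40, 10),  # good-prognosis stratum (made-up counts)
    (10, 40, 20, 80),  # poor-prognosis stratum (made-up counts)
]

print(f"crude OR: {crude_odds_ratio(hypothetical_strata):.2f}")
print(f"stratum-adjusted OR: {mantel_haenszel_odds_ratio(hypothetical_strata):.2f}")
```

In this invented example the pooled numbers make the treatment look better than control, but within each stratum the odds are identical, so the apparent benefit comes entirely from the imbalance between arms and disappears once the stratum is accounted for.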