A P2 trial is not evaluated like a P3, where missing the primary endpoint is a bright-line failure.
A P2 provides data on whether and how to run a P3. If OS was trending well, it is very possible that they would go ahead with a P3 based on PFS and RR.
My point is that they have the exact same data (stat sig PFS/RR, but not OS) that they would have had if PFS had been the primary. The only exception is for the stat purists who insist the secondaries do not matter when the primary fails.
So the decision to go should be the same as if the trial had PFS as the primary and the news was "it worked".