There is evidence to support the belief that 2-73 is better.
The issue is that it is not conclusive. I classify it as suggestive because of the small numbers involved and the lack of more complete data that would allow independent analysis.
I hope P2B/P3 will give a more definitive answer to the question.
I understand all of your criticisms of the data and the possible ways it could be misrepresented or cherry-picked. That is why I say the evidence is suggestive, not conclusive.
Still, I see the balance of available information as being positive for 2-73.