Some of that software is pretty bad and for some reason still in use.
Nowadays AI (for example, the open-source package called "whisper", available on GitHub) does it so well. I've been using it for a while now, and it rarely makes a mistake, even in much tougher settings than a conference call (e.g. videos with background music, lots of noise, or multiple people talking simultaneously).
I will post the next RVNC conference transcript transcribed by whisper, so you can get a sense of the improvement over the "standard" technology. Unless it becomes the standard by then and everyone starts using it.
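
For anyone who wants to try it on a call recording, here is a minimal sketch of how that could look in Python. It assumes the `openai-whisper` package is installed and `ffmpeg` is on the PATH. The function names, the `"base"` model size, and the file name in the usage note are placeholders picked for illustration, not a recommended setup.

```python
def format_segments(segments):
    """Turn Whisper-style segments into '[MM:SS] text' lines."""
    lines = []
    for seg in segments:
        # Each segment carries a start time in seconds and the recognized text.
        minutes, seconds = divmod(int(seg["start"]), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {seg['text'].strip()}")
    return "\n".join(lines)


def transcribe_call(audio_path, model_size="base"):
    """Transcribe an audio file and return a timestamped transcript (hypothetical helper)."""
    # Imported here so format_segments can be used without the package installed.
    import whisper  # pip install openai-whisper

    model = whisper.load_model(model_size)   # downloads the model on first use
    result = model.transcribe(audio_path)    # dict with "text", "segments", "language"
    return format_segments(result["segments"])
```

Calling `print(transcribe_call("conference_call.mp3"))` would print one timestamped line per segment. Larger model sizes are generally slower but tend to do better on noisy audio.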