Bringing out the voice in a recording

Hi and sorry for my English!

When I need to transcribe a recording, I use YouTube's automatic captions: the auto-generated subtitles are added to the video and only need light editing. But some recordings that seem to have no noise and normal sound are still transcribed with many omissions and errors. Apparently something about the audio in those recordings prevents YouTube from recognizing the speech correctly.
Question: how can I bring out the speech in such recordings as much as possible in the program, so that all the components of the voice are evened out to a more or less acceptable level?

Google provide some advice for auto captioning: https://support.google.com/youtube/answer/6373554?hl=en-GB
If you are starting with poor quality audio, I doubt that it will be possible to achieve accurate captioning. You need to start with a good, clear recording, without overlapping voices.
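
If the recording is already reasonably clear, light cleanup before uploading might help a little, though it will not rescue poor audio and is not something the YouTube advice prescribes. As a rough sketch of the idea only: the Python below removes low-frequency rumble with a simple first-order high-pass filter and then raises the level by peak normalization. The function names, the 80 Hz cutoff and the 0.9 target peak are my own assumptions for illustration, and the samples are assumed to be mono floats in the range -1.0 to 1.0.

```python
import math


def high_pass(samples, sample_rate, cutoff_hz=80.0):
    """First-order (RC-style) high-pass filter to reduce low-frequency rumble.

    cutoff_hz=80.0 is an assumed starting point, not a recommended value.
    """
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        # Each output follows the change in the input, so steady offsets decay.
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out


def normalize_peak(samples, target_peak=0.9):
    """Scale the samples so the loudest one reaches target_peak (assumed 0.9)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silent input: nothing to scale.
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


def clean_for_captioning(samples, sample_rate):
    """Hypothetical helper: filter first, then normalize the result."""
    return normalize_peak(high_pass(samples, sample_rate))
```

Called as `clean_for_captioning(samples, 16000)`, this returns a new list of samples and leaves the input untouched. Filtering comes before normalization so that rumble does not determine the peak level. Whether this changes the captioning result at all will depend on the recording.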