The metallic sounds are file-size-compression artefacts (made consspicuouss by excesssive ssibilance).
https://audio.com/anonymous-audio/sssurvive-before-after
De-ess first (before exporting)… see "Updated De-Clicker and new De-esser for speech - #199 by Trebor",
then, when exporting, avoid low-bitrate (low-quality) settings.
The artefacts are in the range 4.5 kHz to 7.5 kHz. Depending on the frequency response of the transducers on the playback device (earphones, speakers), the volume of the artefacts could be enhanced or diminished.
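
If you want a rough number for how much of a recording's energy sits in that 4.5 kHz to 7.5 kHz range (e.g. to compare before and after de-essing), something like the sketch below could be used. It is only an illustration, not what the De-esser plug-in does: the function name `band_energy_fraction` is made up for this example, it assumes NumPy is installed, and it assumes the audio is already loaded as a mono array of samples.

```python
import numpy as np


def band_energy_fraction(samples, sample_rate, low_hz=4500.0, high_hz=7500.0):
    """Return the fraction of total spectral energy between low_hz and high_hz.

    Hypothetical helper for illustration only; the default band edges are the
    4.5 kHz to 7.5 kHz range mentioned in the post.
    """
    samples = np.asarray(samples, dtype=np.float64)
    # One-sided FFT of the whole clip, with the frequency of each bin.
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    power = np.abs(spectrum) ** 2
    total = power.sum()
    if total == 0.0:
        return 0.0  # silent clip: nothing to measure
    in_band = (freqs >= low_hz) & (freqs <= high_hz)
    return float(power[in_band].sum() / total)


if __name__ == "__main__":
    # Synthetic stand-ins (assumptions): a 300 Hz tone for the voice and a
    # quieter 6 kHz tone for the sibilance/artefact band.
    rate = 44100
    t = np.arange(rate) / rate
    voice_like = np.sin(2 * np.pi * 300 * t)
    hiss_like = 0.5 * np.sin(2 * np.pi * 6000 * t)
    print(band_energy_fraction(voice_like, rate))
    print(band_energy_fraction(voice_like + hiss_like, rate))
```

A lower fraction after de-essing would suggest less sibilant energy for the encoder to mangle, though what you actually hear still depends on the playback device.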