Why Does Audio Sound Different on YouTube (Compared to Elsewhere)?

I’ve noticed that YouTube’s compression and audio player are different from what’s used elsewhere.
What exactly does YouTube use?
I’ve also noticed subliminals lose their potency when downloaded, even as WAV/FLAC, regardless of format.

Maybe it’s something Android-related or media-card related?
I’ve listened to subs on Android devices via Samsung Music and VLC…with VLC being the better quality (using the OpenSL ES audio output method instead of AudioTrack).
Then I’ll switch back to PC and get better results there, regardless of media player.

But someone may ask, “What’s so special about YouTube?”
I’ll tell you…when I test subs on Android via YouTube, the results are significantly better than with their downloaded counterparts.

Maybe there’s no other way to say it…other than information is lost from conversions?
Not necessarily true, because I’ve noticed the exact same phenomenon with subs I’ve made.

All replies welcome for discussion. :grin:

YouTube transcodes the video you upload. That process can discard the higher frequencies.
(Your video editor may also be guilty of this, depending on its settings.)
A before/after-upload comparison of the audio spectra will show how much is discarded.
(The losses are negligible for normal audio, but it may remove ~ultrasonic magic spells.)
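
A rough sketch of what such a before/after comparison measures, in case it helps. Everything here is made up for illustration: the two test tones, the 15 kHz cutoff, and the "after" signal, which is only a stand-in for a transcode that dropped the high tone (it is not real YouTube output). It uses a slow, naive DFT so it needs nothing beyond the standard library.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Naive O(n^2) DFT; returns magnitudes for bins 0..n/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc))
    return mags

def high_band_fraction(samples, sample_rate, cutoff_hz):
    """Fraction of spectral energy at or above cutoff_hz."""
    n = len(samples)
    energy = [m * m for m in dft_magnitudes(samples)]
    total = sum(energy)
    high = sum(e for k, e in enumerate(energy)
               if k * sample_rate / n >= cutoff_hz)
    return high / total if total else 0.0

SAMPLE_RATE = 48000  # assumed sample rate
N = 480              # 480 samples -> bins are 100 Hz apart

def tone(freq_hz, amp):
    """Sine tone of N samples at the assumed sample rate."""
    return [amp * math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE)
            for t in range(N)]

# "Before": a 1 kHz tone plus a quieter 18 kHz tone (hypothetical source).
before = [a + b for a, b in zip(tone(1000, 1.0), tone(18000, 0.5))]
# "After": stand-in for a transcode that discarded the 18 kHz part.
after = tone(1000, 1.0)

before_frac = high_band_fraction(before, SAMPLE_RATE, 15000)
after_frac = high_band_fraction(after, SAMPLE_RATE, 15000)
print(f"energy >= 15 kHz  before: {before_frac:.3f}  after: {after_frac:.3f}")
```

With real files you would load the decoded samples of both versions and use an FFT (Audacity's spectrum plot does the same job); the point is only that the share of energy above the cutoff is the number to compare.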


Hmmm, interesting. Thank you for replying.
Is it possible to run a spectrum analysis on a YouTube video’s audio without downloading it?
Just to compare and whatnot.

I don’t think this image answers my questions…but according to this, there is definitely an issue with generation loss.
Meaning, if a maker uploads a track, chances are it has already lost information, and downloading it again would make it like a pseudo-Version-3.
Version 1: Exported audio that was mixed down from the editor
Version 2: Exported version, then uploaded to the website
Version 3: Website version, which then becomes the downloaded version, obtained as MP3 or WAV (16-bit, 48 kHz)
Version 4: Upscaled version, WAV (32-bit, 44.1 kHz)
Conclusion: Audio degradation with each version (maybe a 50%-100% change with each transcoding).
Note: Raw files exist before V1, and by the time someone gets to V4 there’s been a 200%-400% shift in audio information, making it virtually unrecognizable. Therefore, it would be fair to call it…“an entirely different audio track”.
Even attempts at upscaling will result in digital information “being estimated”, meaning it could have the same format but still wouldn’t regenerate the original data.
The only way to avoid this is for the original version to be properly uploaded, with a direct download of Version 1…
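
To illustrate the upscaling point, here is a minimal sketch. It assumes a bit-depth reduction as a simple stand-in for any lossy step (real codecs lose information differently), and the sample values are made up. Converting the reduced version back to the larger format gives it the same format as the original, but not the same data.

```python
def quantize(samples, bits):
    """Round samples in [-1.0, 1.0] to signed integers of the given bit depth."""
    scale = 2 ** (bits - 1) - 1
    return [round(s * scale) for s in samples]

def upscale(ints, from_bits, to_bits):
    """Widen the format by padding the low bits with zeros (a left shift)."""
    return [v << (to_bits - from_bits) for v in ints]

# Hypothetical sample values, for illustration only.
original = [0.0, 0.1234567, -0.7654321, 0.5, 0.999]

as_16 = quantize(original, 16)   # direct 16-bit export
as_8 = quantize(original, 8)     # lossy stand-in: only 8 bits kept
up_16 = upscale(as_8, 8, 16)     # "upscaled" back to a 16-bit range

print("direct 16-bit:", as_16)
print("8-bit, upscaled:", up_16)
print("identical:", as_16 == up_16)
```

Every upscaled value is a multiple of 256, so the fine detail that the 8-bit step rounded away stays gone; the wider format just stores the coarse values with more zeros.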

Does this make sense, or is there something I’m missing? Would there be a Version 5 if I transferred it from my PC to my phone (device jumping), potentially rewriting the code and adding up to somewhere between 250%-500% information changes?

Plus…perceptual compression works (partly) by trying to throw away sounds that you can’t hear, including sounds that are masked (drowned out) by other sounds.

If it’s working properly, it should throw away any subliminal information.
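
A toy sketch of the masking idea, heavily simplified. This is not how any real codec's psychoacoustic model works; the `neighbor_hz` and `margin_db` values and the example components are invented purely to show the principle that a quiet sound close in frequency to a much louder one gets dropped.

```python
def drop_masked(components, neighbor_hz=500.0, margin_db=20.0):
    """Toy masking rule (an assumption, not a real codec model).

    components: list of (freq_hz, level_db) pairs.
    A component is treated as masked when another component within
    neighbor_hz of it is at least margin_db louder.
    """
    kept = []
    for freq, level in components:
        masked = any(
            abs(freq - other_freq) <= neighbor_hz
            and other_level - level >= margin_db
            for other_freq, other_level in components
        )
        if not masked:
            kept.append((freq, level))
    return kept

# Hypothetical mix: a loud 1 kHz tone, a quiet tone right next to it,
# and a quiet tone far away in frequency.
mix = [(1000.0, -10.0), (1200.0, -45.0), (9000.0, -40.0)]
kept = drop_masked(mix)
print(kept)
```

In this made-up mix the quiet 1.2 kHz component is discarded because the loud 1 kHz tone sits right beside it, while the equally quiet 9 kHz component survives because nothing nearby drowns it out.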


You can use Audacity to record YouTube playback rather than use YouTube-to-MP3 websites, which could be of lower quality (and downloading from them risks malware infection).


Woah, interesting stuff.
This essentially means some people who download subs (depending on which subs) are actually working off of placebo, then either obtain results slowly…or simply complain their results never happen. :open_mouth:

I mean…I always knew that “compression usually works by attempting to get rid of unnecessary information”…but hmmm…I gotta dive into this.

Basically, you’re saying this is a far better acquisition method than downloading…and the audio obtained from internal recording is absolutely viable for listening, spectrum analysis, etc.? :face_holding_back_tears:

Okay, here are some images for spectrum analysis. Granted, these are photos, and they don’t fully resemble what you’d see observing it in person (IRL).

As you can see, the top image is the subliminal Internally Recorded, and the bottom image is the Downloaded Version.
I don’t know if I analyzed correctly…but here I go.

The perceived sound is similar, meaning a person could theoretically listen to both versions and say “It’s the same thing.”, but I digress.
The Internal Recording is more data-concentrated, with fullness, whereas the Downloaded Version is less data-concentrated, with some kind of information shift (plus loss) as a result of compression. However, all frequency ranges are presumably compromised and affected (to whatever extent) when the two are compared. Each range is so different that it wouldn’t matter which one was pointed out.
The only artifact that may be present in the Internal Recording comes from the frequency rate it was recorded at, as you can clearly see the colors on the image are brighter and more vibrant. What else could it be?
It appears as though the Internal Recording has very minimal artifacts added, if any, and minimal loss of audio information from the website, if any, resulting in a high-quality feed.

Conclusion: I don’t know if the Internal Recording is as good as the original source material…but it might as well be. For the sake of conversation, you can casually determine (by all accounts) which one is more powerful and feels stronger just by looking at the two.

Any feedback welcome. :hugs:

Evidently the >15kHz magic has survived YouTube.

The louder the sound, the brighter it is represented on the spectrogram.
For a fair comparison, apply loudness normalization to both the before and after versions.
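
A minimal sketch of why that matters, using plain RMS matching as a stand-in. RMS is not the same thing as LUFS-style perceived loudness (which adds frequency weighting and gating), and the two signals and the target level here are made up; the idea is just that both versions get scaled to the same level before their spectrograms are compared.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def normalize_rms(samples, target_rms):
    """Scale samples so their RMS equals target_rms (silence is left as-is)."""
    current = rms(samples)
    if current == 0:
        return list(samples)
    gain = target_rms / current
    return [s * gain for s in samples]

# Hypothetical example: the same tone captured at two different levels.
tone = [math.sin(2 * math.pi * t / 50) for t in range(1000)]
recorded = [0.8 * s for s in tone]    # louder capture
downloaded = [0.2 * s for s in tone]  # quieter copy

TARGET_RMS = 0.1  # arbitrary common level, chosen for illustration
recorded_norm = normalize_rms(recorded, TARGET_RMS)
downloaded_norm = normalize_rms(downloaded, TARGET_RMS)

print(f"before: {rms(recorded):.3f} vs {rms(downloaded):.3f}")
print(f"after:  {rms(recorded_norm):.3f} vs {rms(downloaded_norm):.3f}")
```

Once both are at the same level, any brightness difference left on the spectrogram comes from the content itself rather than from one version simply being louder.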

Again, Top is Internally Recorded and Bottom is the Downloaded Version.
Perceived loudness normalization was applied at -4 LUFS, stereo channels independently, to both audio versions.



Overall, loudness normalization noticeably improved the Downloaded Version, whereas the Internal Recording remained the same.
The Downloaded Version still displays significant changes compared to the Internal Recording.
The Top images are “cleaner” and sharper (still brighter), whereas the Bottom images are more dispersed and “blurry”, although previously I referred to this as “data-concentration”. This is a result of compression via transcoding.
Note: The reason I didn’t normalize loudness earlier was that I wanted to show both versions as unedited as possible.
If I normalize loudness, then information that’s supposed to be present becomes visible again, enlarged from its shrunken state (though this cannot revive data that’s been permanently removed).

Conclusion: The comparison itself is “more fair” after the edits, even though my intention was for both versions to remain untampered with.
Both comparisons still show that downloaded subs are compromised, unfortunately.
I was pleasantly surprised by the improvements; however…the Internal Recording is still superior.
The signals in the Top images are stronger and more present, which is what’s wanted from high-quality sound.

A popular question may be, “To what extent are downloads less potent?”
Answer: Overall, it depends on sound design. But apparently it’s enough that sub results are affected.