Discussion of higher sample rates

Well, whether you use Ardour or Cakewalk you might want to watch this:

Completely off-topic, but since you bring it up (and ignoring use cases like archival mastering, streaming services like Tidal, and things of that nature): audio works similarly to video in that if you record at a higher resolution, you increase perceived quality when it’s downsampled. That is why, for example, your favourite YouTubers film their videos in 4K even if they export at 1080p (although 4K cameras weren’t really a common thing 10 years ago when your link was posted, but that’s even farther off-topic).

Anyway, if you do have any Ardour performance suggestions, let me know!

The key word here is perceived: as in “your brain believes it sounds better because it’s told that the audio was downsampled from 960kHz/256bit, as opposed to a measly 48kHz/24bit”.

Here’s the relevant xiphmont link regarding 192kHz audio
https://people.xiph.org/~xiphmont/demo/neil-young.html

YouTubers nowadays seem to do the opposite of what you’re saying: film in 1080p and then upsample to 4K before uploading.
Why? Because YouTube re-encodes at higher bitrates when the source appears to be 4K, so instead of encoding your 1080p at, say, 500kbps it encodes it at 2000kbps.
So it’s a hack to trick the encoder into giving you higher bitrates.

/end of off-topic

So there are actual physics that support downconverting from higher resolutions for video that don’t exist in the same way for audio. Specifically, when downconverting video, depending on the method used, you get an increased color depth per pixel and don’t lose as much detail information due to debayering of the sensor.

These don’t exist in audio. What you can get is less ringing from the low-pass filter, because it doesn’t have to be as steep, but the chances of you hearing this in reality are very slim. There are also corner cases where high frequencies outside the range of human hearing alias down into the audible range and create artifacts, but again, the chances of this making an audible difference are very limited.
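To put a rough number on the "not having to be so steep" point, here is a small sketch. It assumes a 20 kHz passband edge (my assumption, taken from the usual upper limit of hearing) and simply measures how much room the low-pass filter has between that edge and the Nyquist frequency. The function name is made up for illustration, and this says nothing about how any particular converter actually designs its filter.

```python
import math

def transition_band_octaves(sample_rate_hz, passband_edge_hz=20_000.0):
    """Width, in octaves, of the gap between the passband edge and Nyquist.

    passband_edge_hz defaults to 20 kHz, an assumed upper limit of hearing.
    """
    nyquist_hz = sample_rate_hz / 2.0
    if nyquist_hz <= passband_edge_hz:
        raise ValueError("Nyquist must lie above the passband edge")
    return math.log2(nyquist_hz / passband_edge_hz)

for rate in (44_100, 48_000, 96_000, 192_000):
    octaves = transition_band_octaves(rate)
    print(f"{rate:>7} Hz: {octaves:.2f} octaves between 20 kHz and Nyquist")
```

At 44.1k the filter has to go from pass to stop in a small fraction of an octave, while at 192k it has a bit over two octaves, which is why it can be much gentler.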

Where higher sample rates do come into play is when studying things outside of human hearing, such as animal communication, echolocation, etc. They can also make a difference in generative processes specifically, such as reverb and some synths, but there are generally better ways to handle this, like oversampling in the plug-in.
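As a rough illustration of why generative or nonlinear processing can care about the rate: anything the process creates above Nyquist folds back down. The sketch below only does the folding arithmetic. The 15 kHz tone and its third harmonic are example values I picked, and `alias_frequency` is a hypothetical helper, not part of any plug-in API.

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a tone shows up after sampling at sample_rate_hz."""
    folded = freq_hz % sample_rate_hz
    # Anything above Nyquist mirrors back into the 0..Nyquist range.
    return min(folded, sample_rate_hz - folded)

fundamental_hz = 15_000.0
third_harmonic_hz = 3 * fundamental_hz  # 45 kHz, e.g. created by a saturation stage

print(alias_frequency(third_harmonic_hz, 48_000))   # 3000.0, lands in the audible range
print(alias_frequency(third_harmonic_hz, 192_000))  # 45000.0, stays ultrasonic
```

This is the case oversampling inside the plug-in addresses: the harmonic is generated at the higher internal rate, filtered out, and only then brought back to the session rate, so the whole session does not need to run at 192k.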

All that being said, is it hurting to do 192k? Nope, but it probably isn’t helping much either honestly. So work the way you want of course, just understand the actual benefits vs marketing benefits of different workflows. And of course have fun!

 Seablade

As for Tidal, it’s possible that they have improved their system in the last two years but it seems they used to deliver something that wasn’t exactly the “hi-fidelity” it said on the tin.

The audio case cannot be compared with the video case (4K - 8K recording).

In video there is more detailed information in the light than the eye can see. In audio there is no audible information to record above 20 kHz.

If one wants to make a fair comparison one should record ultraviolet and/or infrared light along with visible light and claim that downconverting the video to include only visible light brings more fidelity to the visible picture.

So downconverting from 192 kHz to 48 kHz brings no benefits: there is no information that could be brought down to the audible range from above 20 kHz. There is a use case though, if you are planning to slow down the playback of audio.

These parameters are approximately the same in video and audio:
Resolution in video = bit depth in audio
Color space in video = sample rate in audio

Yes, my brain hurts reading all this :slight_smile:

That being said, it does add overhead. You do need more disk-space and more CPU to process at higher sample-rates. Also downsampling to 44.1k or 48k does introduce artifacts. They may or may not be inaudible.
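For a feel of the disk-space side, a back-of-the-envelope sketch. It assumes uncompressed PCM, stereo, and 24-bit samples (all my assumptions for the example; the actual on-disk format depends on the session settings), and `bytes_per_minute` is just an illustrative helper.

```python
def bytes_per_minute(sample_rate_hz, bit_depth, channels=2):
    """Raw PCM size of one minute of audio, ignoring file headers."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * 60

for rate in (48_000, 96_000, 192_000):
    megabytes = bytes_per_minute(rate, 24) / 1_000_000
    print(f"{rate:>7} Hz, 24-bit stereo: {megabytes:.2f} MB per minute per track")
```

Storage scales linearly with the sample rate, so 192k costs four times what 48k does, per track, before any plug-in processing is counted.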

Sorry, that isn’t a completely accurate description of the issue at hand.

Video has discrete points/pixels generated from a bayered sensor in most cases. But the resolution of video still has not quite outresolved our eyes in most cases (note this depends not only on captured resolution, but also on display resolution and the viewing distance/size we are talking about), and in fact our eyes are better at accommodating the dynamic range of light than most sensors are yet capable of.

The difference is that a pixel in video is not analogous to a sample in audio in the same way. Audio samples are used by the math described in the Nyquist-Shannon sampling theorem to mathematically recreate the sounds and curves that fit the track of the samples. In video this isn’t really the same: rather than regenerating analog curves between points, we are recreating the points themselves to make the pixels, and those pixels still aren’t as good as our eyes, whereas the curves generated via the Nyquist theorem are as good as our ears can resolve (we can’t hear beyond certain frequency thresholds, and it is already capable of resolving the dynamic range from the lower limits of our hearing to the threshold of pain).
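A minimal sketch of the "recreate the curve between the samples" idea, using Whittaker-Shannon (sinc) interpolation. The 1 kHz tone, the 48 kHz rate, and the window of 2001 samples are example values I chose. A real converter uses a finite reconstruction filter rather than this direct sum, so treat it as an illustration of the theorem and not of any actual DAC or resampler.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, sample_rate_hz, t_seconds):
    """Whittaker-Shannon interpolation, truncated to the samples given."""
    position = t_seconds * sample_rate_hz  # time expressed in sample periods
    return sum(s * sinc(position - n) for n, s in enumerate(samples))

RATE_HZ = 48_000
TONE_HZ = 1_000.0

# Sample a sine that is well below Nyquist.
samples = [math.sin(2 * math.pi * TONE_HZ * n / RATE_HZ) for n in range(2001)]

# Ask for the value halfway between samples 1000 and 1001.
t = 1000.5 / RATE_HZ
estimate = reconstruct(samples, RATE_HZ, t)
exact = math.sin(2 * math.pi * TONE_HZ * t)
error = abs(estimate - exact)
print(f"reconstructed {estimate:.5f}, exact {exact:.5f}, error {error:.2e}")
```

The value between the samples is recovered from the samples alone. The small remaining error comes from truncating the sum to a finite window, not from the sample rate being too low.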

Seablade

I think he was talking about pixels on a screen, which are conceptually infinitesimally small.
You can also dither and scale images.

Hehheh, yea this discussion does circle around a few times it seems.

Theoretically this should not introduce any more artifacts than the LPF needed for this purpose, I believe, correct? Meaning it would be comparable to the artifacts from the LPF needed to record at 44.1/48k?

Your points about additional disk space and CPU are absolutely correct, though we are rapidly reaching a point where that is less of an issue these days.

Seablade

I’ll just point to https://src.infinitewave.ca/

Sadly I see the opposite happening.

Single CPU cores are already at the physical limit, and parallelism does not help to run plugins in series. The seeming abundance of computing resources has resulted in an overall decline of DSP code quality. For pro-audio performance, a modern M2 with modern software is on par with a Thinkpad from 2012 with software from that era.

True, but in general I would hope we use decent resampling these days for such a thing, but of course that could be a bad assumption on my part.

Not sure I agree with your assessment, honestly. You obviously aren’t wrong about running single-threaded plugins in series, but the sheer number of plugins I can run took a significant jump forward with the transition to M1. While I haven’t tested my M2 mini for this yet, the M1 was nearly at the point with audio plugins where I just stopped thinking about DSP in general, and I wonder how much longer I will have to answer those questions (obviously not completely done yet, just looking ahead).

Compared to 2012, where I definitely had to think about DSP usage to an extent still (In terms of multiple convolution plugs, or other DSP heavy resources).

Now the tradeoff in terms of quality… yea, I can’t argue that more resources have made people a bit lazier in terms of lean programming. I don’t believe it is enough (yet) to counteract the benefit I saw in going to M1, and hopefully it doesn’t get there (yet my mind tells me it is definitely possible).

Seablade