Experimental support for 64-bit double-precision floating point (IEEE 754) in JACK/Ardour?

Most applications do not exceed 24 bits of precision, so 32-bit float (with its 24-bit effective mantissa) does not degrade quality.
Some applications, such as the venerable Csound, can use 64-bit double precision.
Furthermore, several modern A/D and D/A converters support 32-bit integer resolution (in addition to DSD), in particular the AKM AK557x and AK449x converters.
It may be useful to read an article by the founder of Mytek that explains the reasons for going beyond 24-bit PCM: https://mytekdigital.com/download_library/Beyond_24_bit_Michal_Jurewicz_Resolution_2014.pdf

There are probably few applications that support 64-bit double precision. What are JACK's and Ardour's intentions in this regard?

An intermediate, transport-only measure could be to use 32-bit integer instead of 32-bit float. Is that perhaps already possible as an experimental configuration of JACK/Ardour?

1 Like

Given that the thermal noise floor at room temperature is about -127.1dBu (at 48kHz bandwidth), it makes little to no sense to be more precise at the ADC/DAC. Most sound cards only have 18-19 valid bits to begin with.

With float operation, the noise added due to rounding errors after N worst-case operations is 20 * log10 (N * (2^-23)) dB. That is also not significant for any audio application.
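As a back-of-the-envelope check of that formula (my own sketch, not anything from the JACK or Ardour sources), the snippet below just evaluates it for a few operation counts; for example, 100 worst-case operations put the rounding noise around -98 dB:

```c
/* Evaluate the worst-case rounding-noise formula 20 * log10(N * 2^-23)
 * for single-precision float processing. Compile: cc round_noise.c -lm */
#include <math.h>
#include <stdio.h>

int main(void)
{
    const long ops[] = { 100, 1000, 10000, 100000 };
    for (size_t i = 0; i < sizeof(ops) / sizeof(ops[0]); ++i) {
        double noise_db = 20.0 * log10((double)ops[i] * pow(2.0, -23.0));
        printf("%7ld worst-case operations -> rounding noise around %.1f dB\n",
               ops[i], noise_db);
    }
    return 0;
}
```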

DSD is a nice concept, but it’s not really suitable for editing, nor useful during production.

Most people who argue for the need of 64bit (or 192kHz) just want you to purchase new equipment.

That being said, the datatype in JACK is a compile-time define. People have changed JACK to use different datatypes; e.g. video-jack uses a raw image type. You could change it to “double”, however that would make libjack incompatible with existing applications.
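For reference, this is roughly what the relevant compile-time definitions in <jack/types.h> look like (paraphrased from memory, so check your installed headers rather than taking this as gospel):

```c
/* Excerpt in the spirit of <jack/types.h> -- the sample type is fixed
 * at compile time: */
#define JACK_DEFAULT_AUDIO_TYPE "32 bit float mono audio"
typedef float jack_default_audio_sample_t;

/* Changing the sample type would be as "simple" as:
 *
 *   typedef double jack_default_audio_sample_t;
 *
 * but every client built against the stock headers still treats the
 * pointer returned by jack_port_get_buffer() as float*, so a rebuilt
 * libjack would be ABI-incompatible with existing applications. */
```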

In theory Ardour could be changed, but it’s not very likely to happen. Ardour development is not driven by “larger numbers are better” sales marketing.

PS. 32-bit integer is worse than 32-bit float for various reasons, especially on modern CPUs.
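To illustrate the headroom part of that point (a toy sketch of my own, not Ardour code): summing two near-full-scale samples wraps around with int32, while float simply goes above full scale and can be pulled back down by a later gain stage.

```c
/* Toy illustration: intermediate sums with int32 vs. float samples.
 * Compile: cc headroom.c -o headroom */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* two near-full-scale samples in each representation */
    int32_t a_i = INT32_MAX - 10, b_i = INT32_MAX - 10;
    float   a_f = 0.999f,         b_f = 0.999f;

    /* signed overflow is undefined behaviour in C, so sum via uint32_t
     * just to show what the wrapped 32-bit result looks like */
    int32_t sum_i = (int32_t)((uint32_t)a_i + (uint32_t)b_i);
    float   sum_f = a_f + b_f;

    printf("int32 sum wraps to %d (garbage)\n", (int)sum_i);
    printf("float sum is %.3f -- over full scale, but a later gain stage can bring it back\n",
           sum_f);
    return 0;
}
```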

1 Like

What about the old Harrison XRange series? Those used 64-bit processing, the argument at the time being that a huge number of channels may be summed for film soundtrack work, so keeping quantization noise ultra-low could be a benefit. I’m pretty sure XRange was built on an early version of Ardour. I looked around the Ardour source but never could find any indication that a 64-bit audio option was included in the standard source. I emailed Ben at one point to ask about that, but never got a response. It was mostly idle curiosity on my part, so I didn’t pursue it any further.

Don’t confuse the converter resolution with the processing resolution of the DAW. There are very different needs and limitations involved.

I should clarify that I never really bought that argument. In the absolute best case (as in test equipment, no way to reproduce with “real” sources) the noise would be about 120dB below max level. If you summed 1000 channels then yes, the quantization noise even at -144dB would become pretty large, but if there were any signal on that channel then at best the analog noise would still be over 20dB above the quantization noise, so making the quantization noise 100dB below the analog noise instead of “only” 20dB doesn’t buy anything. The real solution, which film mixers have known for decades, is you mute the channels which are not in use at the time, so then it doesn’t matter what the noise is on those channels.
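For anyone who wants to check the arithmetic, here is a rough sanity check of that 1000-channel argument, assuming uncorrelated noise sources add in power (+10*log10(N) dB); the per-channel figures are the ones quoted above, not measurements:

```c
/* Sanity-check the "1000 summed channels" argument.
 * Compile: cc sum_noise.c -lm -o sum_noise */
#include <math.h>
#include <stdio.h>

int main(void)
{
    const int    channels        = 1000;
    const double quant_noise_db  = -144.0; /* ~24-bit quantization noise, per channel */
    const double analog_noise_db = -120.0; /* best-case analog noise per channel, as quoted above */
    const double sum_gain_db     = 10.0 * log10((double)channels); /* ~30 dB */

    printf("summed quantization noise: about %.0f dB\n", quant_noise_db + sum_gain_db); /* ~-114 dB */
    printf("summed analog noise:       about %.0f dB\n", analog_noise_db + sum_gain_db); /* ~-90 dB */
    return 0;
}
```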

2 Likes

I’ve reached an age now where I would much rather there was some new music I actually wanted to listen to, than a signal chain which can exceed the limits of physics (and my hearing) :slight_smile:

2 Likes

If you listen to pure analog sound then you may be surprised by the (bad) converter you will be adding to the chain. Even a good audio interface like the Apogee Element may not be enough; you have to go to mastering-grade converters. If the converter is very good, you could also be satisfied with 192kHz/24bit or 96kHz/24bit and the problem does not arise.
However, some ADC/DAC chip manufacturers are moving towards 32-bit integer PCM, and such chips are increasingly popular and low cost. That 32-bit integer stream is the information the converter needs to work at its best.
Whether you like it or not, you will have to deal with the hardware that will be there, just as happened with operating systems that now exist only as 64-bit.

I quote from the 2014 article linked above:

“…DSD offered additional depth and resolution, although increasing the sample rate of 24-bit PCM to 192k brought the formats closer. 5.6 MHz DSD is better sounding still. We continued our experiments trying to understand if and when PCM can challenge DSD. While 384k brought the quality closer, it was not until we started toggling between 24 and 32-bit depth that we heard a major improvement. It was clear that in a clean digital chain with 130dB dynamic range, 24-bit, even dithered, is the bottleneck.
Further analysis of the architecture of modern A-DC and D-AC chipsets shows that 32-bit decimation output is the information subset of a very high performance quasi-DSD front end. This allows the capturing of another 48dB of additional detail depth. Going back to 24-bit decimation reduces detail. 32-bit is needed to clean up the modern high performance digital recording PCM chain. 32-bit at a minimum of 352.8kHz (DXD32) PCM would be needed to compete with DSD sound quality…”

“The next generation of PCM/DSD convertors from Mytek will have 32-bit integer output and at least 384kHz FS in addition to 11.2MHz DSD. Almost all current digital interfaces can only transfer 24-bit. The current AES-EBU and SPDIF digital audio interface is capable of 24bit, 192kHz only.
Fortunately with USB, Firewire, Thunderbolt and upcoming AVB network protocols, AES-EBU style streaming interfaces are becoming less relevant, as computer interfaces can shuffle any amounts of bits at any speed if programmed to do so. So despite most computer audio drivers being capable of transferring only 24-bit, they can be easily amended to feed 32-bit integer audio into 64-bit floating point OS environments. We are currently developing such an implementation for Mytek drivers and have begun collaborating with some high-end DAW software companies to implement it on the DAW side”

  1. Robin has already pointed out that DSD cannot be edited, which makes it dead on arrival as a format within a DAW (and by implication, as the output of an ADC).

  2. Most of what you’ve cited from that 2014 article is a good demonstration of pro-audio people starting to sound like the idiot audiophiles. They never double-blind test anything, and when someone does and demonstrates a failure to discriminate, they insist that double-blind testing, the gold standard in every other domain of human sensory evaluation, is somehow “wrong” or inappropriate for audio.

  3. Robin already pointed out that 32 bits of information exceeds the physical limit. Nobody wants to record Brownian motion. Using an integer format for samples means that in the event that you do in fact overflow the range of the integer (large scale summing, for example), you’re screwed. That’s why historically every DAW (more or less) has used either fixed point or floating point. There might be some CPU/bus related reasons to use a 32 bit-sized entity when moving samples between the CPU and the converter, but once you’re inside an application, any integer format has issues that were solved decades ago.

  4. The notion that it is somehow challenging to feed 32 bit integer audio into a 64 bit floating point environment on the CPU is just absurd, and fundamentally bullshit (the entire “challenge” amounts to a one-line scaling operation; see the sketch after this list). Alternatively, it’s a reflection of a closed-source world where you have to take what the manufacturers give you and nothing else. From a technical perspective, it isn’t interesting. Your converters are never going to provide more than 20 bits (if that) of useful information. You can package that however you want (DSD, PCM, 24 bits, 64 bits…) but it doesn’t change the amount of information that’s available from/to the conversion process.

  1. Changing Ardour to use 64 bit floating point is relatively easy to do. We’re not going to do that, because there is simply no good reason to do so.

  2. As Mike said, go make some (good|better) music. The technology is just fine.
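For completeness, here is the one-line scaling referred to in point 4 above (a hypothetical helper of my own, not code from any particular driver or DAW):

```c
#include <stdint.h>

/* Convert one 32-bit integer sample to double; full scale maps to [-1.0, 1.0). */
static inline double int32_to_double(int32_t s)
{
    return (double)s / 2147483648.0; /* 2^31 */
}
```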

6 Likes

Really… You have personally done this, I suppose? Under which conditions? Using what test equipment?
Be careful spouting advertising copy designed to sell overpriced coffee, err, audio equipment. Using 48k/24bit for recording pretty much assumes the recording is peaking at -18dB-ish to make sure of no clipping ever, while still retaining at least 16 bits of real information (well, probably 14 bits more realistically, considering the noise floor of most studios). 96k/24bit means half the tracks, half the disk space, half of everything, for no gain at all. 64-bit float just cuts the usefulness in half as well, for no gain whatsoever. So 192k/64f means I have a system that is 1/8th as useful in every way but sounds exactly the same as if I had used 48k/24bit… well, that is not exactly true: the 96/192k version may create more distortion in playback and of course will include twice or four times the noise. All of this looks like lose, lose, lose for high sample rate, high bit depth recording.

1 Like

Hello Len,
at least the chain consists of a microphone, a microphone preamplifier and headphones with their amplifier, all in the analog domain… then add an AD/DA converter to evaluate the analog reconstruction. Use your voice and your ears.

microphones:
SE Electronics Gemini II, RN17,
Earthworks QTC50, Blue Hummingbird, Neumann KM131

preamp:
Blue Robbie

headphone:
Beyerdynamic DT880 / DT770

headphone amp:
Violectric HPA V200

The Apogee Element AD/DA goes well at 48kHz, and at 192kHz it’s better, but the analog sound is more fluid and dimensional. It’s a qualitative difference.

Two converters are coming that should improve transparency compared to pure analog sound:
RME ADI-2 PRO FS and Mytek Brooklyn ADC / DAC

It’s likely that I will use DSD256 for archiving, while for post-production editing I will do a PCM conversion with HQPlayer.

What you are hearing is the difference in the conversion process and the quality of the converters, not a difference in bit depth in the conversion process, which, as has been mentioned several times, would make no difference whatsoever.

I personally like Apogee’s converters especially, but they have a characteristic to them that I enjoy; it is not the same as RME etc.

Of course, as mentioned before, part of the issue is your testing procedure. It isn’t that I doubt you are hearing a difference: you are adding a processor to the chain, and every processing operation will have some effect. But whether you can quantify the difference as good/bad/etc. might surprise you when doing a true double blind, where neither you nor the tester is aware of whether you are going through the analog or digital chain. It would also help to use a recording (analog, in your case, I suppose) instead of a live microphone, as that removes multiple variables from the equation.

      Seablade

I don’t see any mention of a double-blind setup in your equipment list. Until you add that, no testing has yet been done.
Also remember that ADCs and DACs both have a large percentage of analog components inside that also contribute to what you hear. Just as analog mixing consoles are chosen for how they change the sound rather than how transparent they are, converters are also chosen for a “pleasing sound”. This pleasing sound is not given by sample rate or bit depth but rather by the analog components at the input and output ends.

So the only measure here is the ears, and the standard is that it “sounds” pleasant, not some measurement that shows input and output are identical (inversion and mix, for example). There is no step to ensure psychological stuff like “the number is higher so it must be better” or “I paid more so it’s better” is removed. I’m not sold. Also be aware that the analog parts of a converter may not be the same from one sample rate to another, so you may be hearing the different analog parts rather than the rate difference. Even the digital handling inside the converter will be different for each rate.
Considering all these variables, from a manufacturer’s point of view it would make sense to make sure all of these things lined up, so that the higher sample rate the customer pays for sounds at least different, but also more pleasing.
I would also note that the analog mics, preamps etc. are not instrumentation gear, but rather recording gear chosen for the way they change the sound in a pleasing way rather than for a perfect rendering from acoustic sound to analog signal. Headphones never sound like speakers, as they lack range, particularly at the bottom end, where hearing may come from the soles of the feet as much as the ears. And speakers… I have never seen a well-defined frequency chart (where all the information is actually there) with something approaching flat, just like the mics we use.
Still smells like snake oil.

at 192khz it’s better but the analog sound is more fluid and dimensional. it’s a qualitative difference.

When citing perceived differences between sample rates, you also need to be meticulous in making sure all of your signal chain can cope adequately with the entirety of the frequency range being reproduced. Otherwise you might (reliably) be able to tell the difference between the same material rendered at different rates, but perhaps only because the higher sample rate material contains ultrasonic frequencies high enough to provoke audible limitations elsewhere in the (analogue) signal chain. Conventional wisdom is that you should design systems which can cope with a wider than necessary frequency range, precisely so that you can restrict their behaviour to an appropriate range in a controlled and predictable manner. For the most part that is already possible for audio at 48kHz and even 44.1kHz rates.