Calculating RMS in digital audio

abyss · January 30, 2024, 11:27pm

Ah. I needed the baby steps, Robin: (X * X)/(Y * Y) = (X/Y)*(X/Y).

ccaudle · January 30, 2024, 11:38pm

No, the audio measurement standard AES-17 is explicit that dB FS is defined for amplitude, and is defined as 20log(signal_rms/full_scale_sine_rms).

Any measurement given in dB is a measurement of the signal relative to a reference value. The reference value for full scale in this case is a sine wave which just reaches full scale. As lherg pointed out, the RMS value of a sine wave with a peak amplitude of 1 is 1/sqrt(2). Dividing by 1/sqrt(2) is of course the same as multiplying by sqrt(2), so you can rewrite it as 20log(signal_rms*sqrt(2)).

The factor of 2 pointed out in the first post is just algebraic simplification of where the “divide by 1/sqrt(2)” factor is used in the calculations.

x42 · January 30, 2024, 11:38pm

Now for the practical part.

One advantage of moving this factor of 2 inside the log as sqrt(2) is that (sum/rms_buffer_size) will also have a range of [0…1]. That value can be transmitted reliably (no -inf for silence).

Furthermore, a common function coefficient_to_db(v) → 20 log(v) can be used throughout the codebase.

x42 · January 30, 2024, 11:49pm

you guys are saying exactly the same thing

ccaudle · January 31, 2024, 12:34am

Yes, but I find the derivation of the 10log(power) vs 20log(amplitude) obscures the point that dB values always have a reference, and in this case the reference is explicitly the RMS of the amplitude of a full scale sine, so you need to divide the signal value by the reference sine value and take the log of that.

x42 · January 31, 2024, 1:19am

One of us has this backwards

From a physics point of view, the reference is a signal power ratio, and only by applying Ohm’s law can you derive the log of squared voltage ratios (if R is constant). This way you actually calculate the RMS of a sine wave to be 1/sqrt(2).

And only then can the AES come along and tell you to put sqrt(2) in there to normalize it.
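The step from power to amplitude can be written out as below. This is a sketch using symbols that do not appear in the thread: P for power, V for RMS voltage, R for a constant resistance, A for the peak amplitude of the sine and T for its period.

```latex
\begin{align*}
L &= 10\log_{10}\frac{P}{P_\mathrm{ref}}
   = 10\log_{10}\frac{V^2/R}{V_\mathrm{ref}^2/R}
   = 20\log_{10}\frac{V}{V_\mathrm{ref}} \\
V_\mathrm{sine} &= \sqrt{\frac{1}{T}\int_0^T A^2\sin^2\!\left(\frac{2\pi t}{T}\right)dt}
   = \frac{A}{\sqrt{2}} \\
L_\mathrm{FS} &= 20\log_{10}\frac{V}{1/\sqrt{2}}
   = 20\log_{10}\!\left(\sqrt{2}\,V\right)
   = 10\log_{10}\!\left(2\,V^2\right) \qquad (A = 1)
\end{align*}
```

The last line shows why the factor can appear either as sqrt(2) next to the RMS value or as 2 next to the mean square.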

In other cases however it’s not as simple. e.g. for LUFS calculation you have to actually integrate and cannot just move a constant normalization factor into the log(). In LUFS loudness DSP you’ll find 10 log (…) for that reason (page 3 https://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1770-5-202311-I!!PDF-E.pdf).

–

All that being said, for simple RMS, I agree that from an engineering point of view the AES definition is easier to grasp for casual people implementing it.

abyss · January 31, 2024, 3:18am

I will stoke the flames of confusion.

for (unsigned int i = 0; i < rms_buffer_size; ++i) {
    sum += 2.0*rms_buffer[i]*rms_buffer[i];
}
rms = sqrt(sum/rms_buffer_size);
rms_db = 20*log10(rms);

So let’s consider a mathematical transformation (if I did it right):

for (unsigned int i = 0; i < rms_buffer_size; ++i) {
    true_sum += rms_buffer[i]*rms_buffer[i];
}
rms = sqrt(2) * sqrt(true_sum/rms_buffer_size);
rms_db = 20*log10(rms);

What is the sqrt(2) for? I am not familiar with AES-17, but even if I were, does it make clear what the sqrt(2) is for? As Robin said, I believe, it is a factor one can apply TO A SINE WAVE’s maximum amplitude to get the RMS amplitude.

Now look at the algorithm. Are we sampling a sine wave? No, we are not. If we have a sine wave, we use exact math in the style of integration of calculus. Audio signals are much more irregular. I don’t think one may ‘normalize’ or convert from maximum amplitude to RMS amplitude with the constant sqrt(2) when one is sampling some generic audio signal.

Others may be a by-the-book person if they wish. I’ve read enough books and don’t read academic stuff any more than necessary. The question as I understand it was one of understanding the why of how it works or would work correctly, not in how to follow an algorithm given from on high. I realize all too well I am the oddball that way. I am not going to assess AES anything, so of course my opinion is just some questions that I think are relevant to a solid understanding of the original question. Good luck to everyone implementing audio signal processing.

But now I realize something. This from Chris is interesting:

No, the audio measurement standard AES-17 is explicit that dB FS is defined for amplitude, and is defined as 20log(signal_rms/full_scale_sine_rms).

What if we let full_scale_sine_rms = 1 / sqrt(2) ?
Then we get 20log(sqrt(2)*signal_rms) or:

rms = sqrt(2) * sqrt(true_sum/rms_buffer_size);
rms_db = 20*log10(rms);

READ THIS, lherg:
It would seem the repetitive ‘2’ is the unclear forward reference of the reference RMS amplitude of a maximum sized sine wave.

ccaudle · January 31, 2024, 4:09am

A value in dB is not absolute, it is the logarithm of the ratio of a measured value to a reference value.
In the case of FS measurement, the “FS” refers to full-scale digital, and dB FS is defined to be the logarithm of the ratio of the RMS signal value to the RMS value of a sine wave whose peak amplitude equals the maximum digital value (the maximum numeric value will vary depending on how many bits the audio samples contain).

Not quite. When you calculate the RMS value of a sine wave with peak amplitude of +1 and -1 the value is 1/sqrt(2).
That is the reference value, so when you calculate the ratio of the measured signal RMS value to the reference sine wave RMS value you calculate signal_rms/(1/sqrt(2)).
When you simplify that fraction you would write it as signal_rms*sqrt(2).

You are not converting max amplitude to RMS of the audio signal, you are calculating the ratio of the signal RMS value to the RMS value of a maximum amplitude sine wave.

When you calculate dB values it is always a comparison to a reference value. dB FS happens to use the RMS amplitude of a full scale sine wave as the reference, dBm uses 1 mW as the reference, dBu uses the voltage which dissipates 1 mW into 600 ohms as the reference, dB SPL is the sound pressure level referenced to 20 micropascals. There always has to be a reference value against which you are measuring.

Well, yes, that is explicitly what I wrote in my previous comment.

abyss · January 31, 2024, 9:36am

2^(1/2) is indeed a magical number of the RMS gods. RMS is an energy field created by all living digital audio sources. It surrounds us and penetrates us; it binds the DAW galaxy together. Those who can channel that energy are luminous beings, not crude matter such as myself. Its high priest knows the holy specs and vouchsafes to provide many corrective words with bureaucratic authority. His shoe’s latchet I am not worthy to unloose any more than my little mind can deliberate upon the magic of 2^(1/2). Socrates was a devil! I can only revere the grand ordering principles of the galaxy that passeth all my understanding. Who am I to suppose causality by reason and context?

seablade · January 31, 2024, 11:37am

See my comment very early on in the thread:

What you are discussing are ways to get an approximation of RMS, when you can’t really calculate RMS of a non-repetitive waveform easily. As I mentioned before, RMS is a bit of a BS term in terms of complex waveforms, and is used as a marketing term for some audio manufacturers as well (it was used for speaker ratings for a long time, for instance). Thankfully there is a bit of a move away from this in larger format sound systems, as there were too many unknown assumptions in how such a number was calculated, but I still see it pop up.

    Seablade

lherg · January 31, 2024, 5:11pm

Thanks for your comments, I removed the factor of two from my loop to reduce algorithmic complexity. Why carry out several multiplications when just one is enough?

  for (unsigned int i = 0; i < rms_buffer_size; ++i) {
    sum+= rms_buffer[i]*rms_buffer[i];
  }
  rms_dB_FS = 20*log10(sqrt(sum/rms_buffer_size) * sqrt(2.0));

lherg · January 31, 2024, 5:39pm

Seablade, these RMS calculation algorithms are nevertheless used in most audio software.

Even if this does not represent a physical reality, they are very practical and inexpensive in CPU to have an overview of the acoustic power of the signal (The ears are quadratic sensors). Doing an FFT, which is done in LUFS, unless I am mistaken, is certainly interesting in the mastering phase, but for monitoring multiple inputs it does not seem necessary to me and is, above all, heavy in terms of calculation. The display of peaks and RMS value (even with an imperfect calculation) is enough for me to see if the signal requires compression, if I need to increase the gain of my preamps…

All of these remain tools that must be understood in order to be mastered well. In analog the meters are not perfect either: if you have a DC component, it is filtered by the capacitors and you therefore do not see it in your display. Sometimes you have to use your ears to mix!!

seablade · January 31, 2024, 5:52pm

Correct, my point was that in some cases there may not be a reason that can be well defined, as much as that it is ‘most reflective’ of what is expected.

Seablade

abyss · January 31, 2024, 7:48pm

lherg wrote:

Even if this does not represent a physical reality, they are very practical and inexpensive in CPU to have an overview of the acoustic power of the signal (The ears are quadratic sensors).

This sort of social ‘justification’ is ridiculous, like the social oneupmanship. I’ll understand my way because that is what works for me. I realize I am the outlier everywhere I go. Don’t care. I’m right for me. I cannot and am not trying to relate to the intellectuals here who, in my opinion, can’t even recognize much less articulate first principles. The first principles of this thread should be those needed to answer the original question given the ostensible function of this forum.

Here’s a first principle. The digital audio signal to be analyzed does not come with a known function f(t) or f(x) or whatever like the referential sin(t) or sin(x) does. Hence, there is no way to apply exact mathematical integral calculus. What we are doing when we sample and compute off of the digital marks that are the digital data is the approximation of an exact integral calculation that is the basic theoretical idea of integral calculus to derive the area under a curve. There is no f(x), but there are points on the otherwise unknown curve at regular intervals. The approximation is not nonsense or digital audio processing would not be amazing when done skillfully. The approximation at 192 kHz is damned great. It is integral calculus in essence. I doubt your human ears would recognize the difference from some ‘perfect’ function value we can’t determine and what we normally calculate. If close enough were not an engineering principle, a first principle, then there would not be engineering at all.

Furthermore, if someone who does that crazy sort of algorithmic manipulation with the early factor of two would just comment the code properly on the difficult stuff, this whole discussion would not be necessary. I don’t think most people know the important choices and the important details for competency from the other ones. It’s always lookie what I can do. Of course, Robin is a very clear and insightful writer. Nevertheless, I did not understand anyone to have explained the answer. You can point to your stuff. It did not make sense to me.

First principles, people. If social rank is your first principles, go work at Boeing. There is no need to justify approximate empirical integral calculus, which is fundamental to the derivation of mathematical and exact integral calculus. Furthermore, I know what a man has between his legs, and I don’t give a damn about conventional wisdom such as exists in what I do not recognize as my heritage. I suggest to those capable, know thyself. First principles, Clarice. First principles. (Youtube video " First Principles / Simplicity by Dr Hannibal Lecter to Clarice / The Silence of the Lambs (1991)").

lherg · January 31, 2024, 8:26pm

Is this a reference to Harry Potter?

The answer has been provided many times in this discussion thread: It is in the definition of dB FS by the AES: Align the 0 dB FS with the RMS value of a sine wave of maximum amplitude.

https://en.wikipedia.org/wiki/DBFS
The unit dB FS or dBFS is defined in AES Standard AES17-1998,[13] IEC 61606,[14] and ITU-T Recs. P.381[15] and P.382,[16] such that the RMS value of a full-scale sine wave is designated 0 dB FS. This means a full-scale square wave would have an RMS value of +3 dB FS.[17][18] This convention is used in Wolfson [19] and Cirrus Logic [20] digital microphone specs, etc.
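The +3 dB FS figure for a square wave follows directly from the definition. A full-scale square wave sits at +1 or -1 on every sample, so its RMS value is 1, while the reference sine has an RMS of 1/sqrt(2). As a worked line:

```latex
20\log_{10}\frac{1}{1/\sqrt{2}} = 20\log_{10}\sqrt{2} = 10\log_{10}2 \approx 3.01\ \text{dB FS}
```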

ccaudle · January 31, 2024, 10:13pm

That is a misuse of the term. The dB FS measurement is specifically defined as an RMS measurement.
AES 17-2020 sections 3.12.2 and 3.12.3 have this note:
“NOTE 2 Levels reported in FS are always rms. It is invalid to use FS for non-rms levels.”

That is somewhat like saying you do not believe addition and multiplication work on a non-repeating set of numbers. RMS is just a way to calculate a particular kind of mathematical average (quadratic mean, as the Wiki page you linked points out), the math doesn’t care if the numbers are periodic or not.

In fact the wikipedia page that you linked shows both the continuous time and discrete time formulas for calculating RMS, and it is pretty clear that there is no assumption of periodicity, since there is a section specifically describing simplifications for common periodic signals.

You have to start adding fine print when using RMS with non-repetitive signals, and of course it is possible for people to misinterpret what the numbers mean.
Since it is a type of average the period over which the signal is “averaged” (using that term loosely) will affect the result, so you always should specify the time period over which the RMS is calculated.
That is the same thing which is explicit in the EBU standards for loudness metering where EBU Tech-3341 calls out momentary, short-term, and integrated loudness using essentially the same calculation but over different time periods.

That seems like one of the places where RMS is actually useful. RMS is the thermal equivalent voltage, and low frequency drivers are often thermally limited. The RMS rating of a low frequency driver should indicate the long term RMS signal level it can tolerate without overheating the voice coil due to dissipation in the wire resistance. Especially with modern speaker controllers which are fast enough to monitor instantaneous amplitude (to make sure excursion stays within limits), and RMS values at various time periods to keep track of thermal overload, it seems like you should be able to have a much better understanding of how close to the various limits the drivers are at any particular moment.

Was that because sound system vendors did not reference back to the relevant standards for measuring loudspeaker system components, or just because too many customers were not familiar with the relevant standards?

abyss · February 1, 2024, 12:54am

lherg, if that because-the-standard-says-so answer is an answer for you, then great that you can understand it and appreciate it. I do not understand or appreciate it as a solution.

An insightful solution would have been to show you the correct and easy to understand algorithm and why it is the correct algorithm and then to show how the math can be manipulated into the form you gave, which is either a stupid form or a genius form stupidly lacking documentation of its genius.

Here goes, because I like the failure to communicate what I mean:

for (unsigned int i = 0; i < rms_buffer_size; ++i) {
    sum += rms_buffer[i]*rms_buffer[i];
}

signal_rms = sqrt(sum/rms_buffer_size);

// The reference signal for the sample
// signal is a sine wave of maximum amplitude.
// The amplitude is 1 generic unit to fit the
// capacity of the signal representation,
// i.e the amplitude ranges from -1 to 1, inclusive.
//
// Spec whatever says whatever if you like (but
// do you know what it means or do you trust
// technocrats to think for you? Say yes.)…
ref_rms = 1/sqrt(2);

// The factor 20 is 2 * 10. The 10 is to convert bels
// to decibels. The 2 is to square the amplitudes/voltages
// according to the logarithmic power rule, i.e
// log(x**y) = y * log(x).
dbfs = 20*log10(signal_rms/ref_rms);

for (unsigned int i = 0; i < rms_buffer_size; ++i) {
    sum += rms_buffer[i]*rms_buffer[i];
}

// The reference signal for the sample
// signal is a sine wave of maximum amplitude.
// The amplitude is 1 generic unit to fit the
// capacity of the signal representation,
// i.e the amplitude ranges from -1 to 1, inclusive.
//
// The factor 20 is 2 * 10. The 10 is to convert
// bels to decibels. The 2 is to square the amplitudes/voltages
// according to the logarithmic power rule, i.e
// log(x**y) = y * log(x).
//
// Spec whatever says whatever if you like (but
// do you know what it means or do you trust
// technocrats to think for you? Say yes.)…

amplitude_quotient = sqrt(sum/rms_buffer_size) * sqrt(2);
dbfs = 20*log10(amplitude_quotient);

for (unsigned int i = 0; i < rms_buffer_size; ++i) {
    sum += rms_buffer[i]*rms_buffer[i];
}

// The reference signal for the sample
// signal is a sine wave of maximum amplitude.
// The amplitude is 1 generic unit to fit the
// capacity of the signal representation
// i.e the amplitude ranges from -1 to 1, inclusive.
//
// The factor 20 is 2 * 10. The 10 is to convert
// bels to decibels. The 2 is to square the amplitudes/voltages
// according to the logarithmic power rule, i.e
// log(x**y) = y * log(x).
//
// Spec whatever says whatever if you like (but
// do you know what it means or do you trust
// technocrats to think for you? Say yes.)…

amplitude_quotient = sqrt(sum/rms_buffer_size*2);
dbfs = 20*log10(amplitude_quotient);

for (unsigned int i = 0; i < rms_buffer_size; ++i) {
    sum += rms_buffer[i]*rms_buffer[i];
}

// Spec whatever says whatever if you like (but
// do you know what it means or do you trust
// technocrats to think for you? Say yes.)…
//
// Code like a beginner Perl programmer.
sum *= 2;
amplitude_quotient = sqrt(sum/rms_buffer_size);
dbfs = 20*log10(amplitude_quotient);

for (unsigned int i = 0; i < rms_buffer_size; ++i) {
    // Be the greatest Perl programmer in history
    // after Larry Wall himself. That’s why the
    // factor of two is pre-included.
    rms_titanic_sum += 2*rms_buffer[i]*rms_buffer[i];
}

// Love me some clean and understandable variable names.
sum = rms_titanic_sum;

// Spec whatever says do whatever and that’s what
// this code does step by step, lmao.

rms_titanic = sqrt(sum/rms_buffer_size);
rms_titanic_db_special = 20*log10(rms_titanic);

lherg · February 1, 2024, 7:04am

This must be in the first movie, when Hermione meets the troll in the toilet

It’s just a question of standardization: dBFS is normalized to this reference. Another could have been chosen (a full-amplitude square wave, the effective value of a maximum sinusoid… whatever); we had to choose one. This only shifts the reference value in the dB conversion. The important thing is that all software or hardware that uses this standard is based on the same reference level.

PS:
The advantage of this choice is that the peak of a 0 dB FS sine wave will also be 0. This therefore avoids confusion when you align inputs with outputs, since in general you use a sine wave signal for that. If the peak and rms values were different, we would necessarily ask ourselves whether we should align the input according to the peak or the rms…

seablade · February 1, 2024, 12:19pm

Again keeping in mind I only glanced over the ITU document that started this, what it seems to be saying is that when they say dBFS they are referencing the RMS value that would have a peak value of 0dBFS. I suspect we are saying the same thing, and you could be correct in that it is a misuse of the terminology as defined by AES, but I cannot think of a better way to describe it in this context at the moment, open to suggestions?

I will agree with all of this. In the end my point is that assumptions are made in RMS measurements in general.

Among other things, it was because what signal are you defining RMS in? What are the assumptions about measurement window, frequency content, etc. that are made? These were not standardized (And really still aren’t) across manufacturers and the assumptions that most benefited the manufacturer in question were often used in the marketing material including spec sheets.

Keep in mind almost all large format sound systems are multiple driver systems, and what would happen is that manufacturer marketing departments would advertise the ‘RMS value’ of a signal that most benefited their marketing (SPL, etc.). So if a 1k sine wave generated the highest output SPL off a 1W RMS input to measure sensitivity, they would advertise this and utilize the RMS measurement and label it as RMS. Same thing applies to peak output, etc. There is a huge difference between outputting a single loud sine wave, and a full bandwidth signal evenly and clearly, and that drastically affects the actual performance of the speaker that is often made up of multiple drivers to handle different frequency ranges. RMS on its own was far too ‘generic’ a term and the assumptions made were not often clearly stated.

There is a bit of a push now to other ways of referencing input signal, for instance Meyer developed the M-Noise input as even pink noise was a bit generic and didn’t reflect the actual usage of speakers, but 1W RMS of M-Noise vs Pink Noise would result in different speaker cabinet performance (Assuming all other assumptions being equal) as it would utilize the drivers in different ratios of power distribution. Likewise there is a push to define continuous, average, etc. in better terms and clearly and explicitly state those assumptions (Length of time of measurement, etc.) in the terms rather than just labeling it as ‘RMS’ without labeling those assumptions.

Could you utilize RMS? Yes, but as you mentioned you have to clearly define how it was measured in all regards, not something that many manufacturers did, and less that users understood completely when they competed. So instead we are getting a push to better define this and hopefully move to a more common set of standards utilized by all reputable manufacturers rather than taking the set of assumptions they like and labeling it RMS on a sheet.

   Seablade

abyss · February 2, 2024, 3:36am

@Robin, I’d like your opinion on this statement by lherg:

The advantage of this choice [max sized sine wave for dBFS] is that the peak of a 0 dB FS sine wave will also be 0.

Is that true? I thought the range of amplitude was 1 to -1. I would expect zero amplitude to be in the middle. Zero dBFS refers to (nearly?) maximum power not maximum amplitude, as I understand it. Does the power peak at zero? I don’t want to argue in circles any more. Many of the putative key statements here make no sense to me. This one I can grok a bit. I just want to know if that statement I just quoted makes sense to you. Thanks in advance, and if you don’t see this request any time soon, that’s okay. I know you are usually around, and if you happen by, great.