Calculating RMS in digital audio

L_Pro · January 29, 2024, 5:33pm

Hi,
I think that when Robin says…

It should be…
log(sqrt(2 * x)) = 0.5 * log(2*x)

abyss · January 29, 2024, 5:51pm

I too was confused about RMS. I have a Western philosopher’s mind. This is an extract from my notes from my research.

Audio Signal Power and Loudness

Electrical power = voltage**2 / electrical resistance

A decibel is 1/10 of a bel. A level in bels is the base-10 logarithm of
the ratio of two powers, according to the formula
log10( Power A / Power B)
= [by power proportion] log10( Voltage A ** 2 / Voltage B ** 2 )
= [by log rule transform] 2 * log10( Voltage A / Voltage B ).

Because signal voltage is easy to determine, decibels are typically
calculated as 10 * 2 * log10( Voltage A / Voltage B ).

As discovered by Fourier, all periodic waves can be represented by
combinations of sine waves. Sine waves are essentially atomic waves
as far as I can tell. In theory, if all the sine waves are known, they
can be mathematically evaluated individually and summed. In practice it seems a numerical approach is used on the arbitrary digital signal, i.e. the approximate calculus is done sample by sample at the sampling rate of the audio signal.

What is the absolute average voltage of a sine wave?
Circumference = 2 * pi radians (a one-radian arc is as long as the radius).
Radians make integral math easiest.
Sine wave equation: y = sin( r * t ), where r is the angular frequency in
radians per unit time; the maximum amplitude here is 1.
Area under one curve: S[integral] from t=0 to t=pi/r of sin( r * t ) dt.
Derivative of sin(x) is cos(x).
Derivative of sin(rt) is cos(rt) * r
Use substitution. Let u = rt and du = r dt.
Get: S[integral] of sin(u) * 1/r * du = 1/r * S[integral] of sin(u) du
= difference of -cos(rt) / r at each limit
= -cos( pi )/r - -cos( 0 )/r
= 1/r + 1/r = 2/r
Average height = area under one curve / (pi/r) = (2r)/(pi*r) = 2/pi
If the sine wave is scaled by a multiplicative constant, i.e. VMax * sin (rt), then the constant simply moves unchanged out of the integral to get an area under one curve of VMax * 2 / r and an average amplitude/voltage of VMax * 2 / pi.

However, audio engineers do NOT care about average voltage or amplitude. They care about the voltage of average power because power correlates to loudness, which humans perceive on an approximately logarithmic scale. Humans also perceive brightness in an approximately logarithmic scale; hence, gamma encoding and decoding.

Average power is proportional to the average of the voltage squared over
the time of a sine wave, and the voltage of average power is the square
root of the area under the curve of the voltage squared divided by the
time of one curve of its associated sine wave.

Area under one curve: S[integral] from t=0 to t=pi/r of (sin( r * t ))**2 dt.
Derivative of x ** n = n * x ** (n-1).
Antiderivative of x ** n is 1/(n+1) * x ** (n+1).
Use substitution. Let u = rt and du = r dt.
Get: S[integral] of (sin(u))**2 * 1/r * du.
Use substitution. Let v = sin(u) and dv = cos(u) du = sin(u + pi/2) du.
Get: S[integral] of v**2 * 1/r * 1/cos(u) * dv, which still contains u.
NOT SURE HOW TO PROCEED
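
A standard way forward from here, offered as a sketch rather than as part of the original notes, is to skip the second substitution and apply the half-angle identity sin(u)**2 = (1 - cos(2u)) / 2 to the integral in u:

```latex
\begin{aligned}
\int_{0}^{\pi/r} \sin^{2}(rt)\,dt
  &= \frac{1}{r}\int_{0}^{\pi} \sin^{2}u\,du
   = \frac{1}{r}\int_{0}^{\pi} \frac{1-\cos 2u}{2}\,du \\
  &= \frac{1}{r}\left[\frac{u}{2}-\frac{\sin 2u}{4}\right]_{0}^{\pi}
   = \frac{\pi}{2r} \\[4pt]
\text{mean of } \sin^{2}
  &= \frac{\pi/(2r)}{\pi/r} = \frac{1}{2},
\qquad
V_{\mathrm{RMS}} = V_{\mathrm{Max}}\sqrt{\tfrac{1}{2}}
                 = \frac{V_{\mathrm{Max}}}{\sqrt{2}}
\end{aligned}
```

Dividing the area by the half-period pi/r gives a mean square of 1/2, and scaling the sine by VMax scales the square root of that mean by VMax.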

However, it is asserted that the voltage of average power = VMax / (2 ** (1/2)). The voltage of average power is called the root-mean-square voltage; the name lists the steps of the numerical (sampled) calculation in reverse order. Each voltage sample is squared, the squared voltages are added and divided by their number to get a mean/average, and that mean is raised to the 1/2 power, i.e. input into the square root function.

Loudness Units relative to Full Scale (LUFS) are defined by a sample processing algorithm related to dBFS, and come in variants named for the duration of the measurement window: Momentary LUFS, Short-Term LUFS, and Integrated LUFS.

lherg · January 29, 2024, 7:17pm

My initial question was: where does this factor of 2 come from in the calculation of the RMS value on the dB FS scale. I will therefore stick to endolith’s response in this post cited above.

This definition of dBFS is explicitly designed such that the dBFS value of a full-scale sine wave equals 0 (and in consequence, that of a full-scale square wave is +3 dBFS).
Since the RMS of the full-scale sine wave is 1/sqrt(2), multiplying rms(signal) by sqrt(2) ensures that the formula evaluates to 0 for the full scale sine wave:
20*log10(rms(signal) * sqrt(2)) = 20*log10((1/sqrt(2)) * sqrt(2)) = 20*log10(1) = 0

It can be found in point 3.4 of this AES document.

It is therefore in the definition of dB FS.

abyss · January 30, 2024, 10:34pm

@lherg, You said:

My initial question was: where does this factor of 2 come from in the calculation of the RMS value on the dB FS scale.

I am no expert, but I don’t think the relative decibel scale position matters for the use of the factor of 2 with ‘power’ samples. The critical point is that LOUDNESS is proportional to ELECTRICAL POWER is proportional to VOLTAGE * VOLTAGE, which is the AMPLITUDE * AMPLITUDE.

Bels and Decibels are on a logarithmic scale and relate to relative POWER, that is POWER A divided by POWER B, with the ratio input into a log base 10 function to put the relative POWER on a logarithmic scale.

POWER is not easy to measure, but AMPLITUDE, that is VOLTAGE, is easy to measure. The relative VOLTAGES squared, divided against each other, and converted to a logarithmic scale give the same result as POWER and LOUDNESS in Bels or Decibels, because the ELECTRICAL RESISTANCE (in Ohms) cancels out when one POWER is divided by the other.

The extra multiplication of the squared AMPLITUDE or VOLTAGE is to make the VOLTAGE squared before the division before the log function. But with the sampling of VOLTAGES over time, the VOLTAGE sample is both squared and multiplied by 2. That’s weird in my non-expert opinion.

The sampling of (relative?) POWER over time (like area under the curve in Calculus) requires the sampling of VOLTAGE ** 2 (with constant factor per RESISTANCE not specified, but decibels are relative to some reference POWER value of the same resistance and that should cancel out with any decibel value).

The RMS VOLTAGE is the VOLTAGE of average POWER. The RMS VOLTAGE ** 2 / RESISTANCE = AVERAGE POWER.

If you draw a sine wave and also draw a horizontal line at VOLTAGE = AVERAGE VOLTAGE, then between any two adjacent 0-VOLTAGE intersects (the points where the wave crosses the horizontal line of zero volts in the middle of the sine wave) the area under the curve and the area under the line of AVERAGE VOLTAGE are the same.

If you draw a modified wave that is the square of the sine wave, you have a power wave. I think that if you draw a horizontal line at RMS VOLTAGE * RMS VOLTAGE, the areas would be equal because the area under the curve represents POWER and LOUDNESS.

The area under a curve is part of the Fundamental Theorem of Calculus. The idea of approximating the area under (or over) a smooth curve with vertical slivers of the same width is not difficult. Make the number of slivers infinite, and the approximation goes to the exact value (if, I suppose, everything is ‘well defined’).

After that, you need to understand what curve. Use the POWER curve not the VOLTAGE curve.

I don’t see a factor of 2 in the integral expression. The factor of 2 is for the Bel calculation between two unsquared VOLTAGES because it is logarithmic.

I am not convinced the times 2 is needed, but what do I know?

for (unsigned int i = 0; i < rms_buffer_size; ++i) {
    sum += 2.0 * rms_buffer[i] * rms_buffer[i];
}
rms = sqrt(sum/rms_buffer_size);
rms_db = 20*log10(rms);

Why not just multiply the final sum by two? The computation is repetitive. How do you calculate RMS decibels without a reference VOLTAGE in the denominator? Apparently, it’s the value 1 volt or at least 1 (suggests full scale to me going by what Robin said next). If several open source calculations all include the factor of 2, then I don’t know where the misunderstanding is, but I know the factor of 2 does not fit my understanding of the theory of integral calculus for relative electrical POWER, and the webpage “Calculating the power of a signal”, linked above, is consistent with my understanding. I think it’s spurious, but I could be wrong.

x42 · January 30, 2024, 11:07pm

Key is dBFS - decibel relative to Full Scale.

Digital audio signal level is represented as floating point value in the range [-1 … +1] where an absolute value of 1.0 is the max possible amplitude (values above that will be clipped).

This signal level corresponds to Voltage.

Yep, and since the rms_sum value is always positive it can be even taken out of the sqrt() or the log().

Assuming constant impedance:

Power ratio = 10 log (P1/P2) = 10 log (V1/V2)^2 = 20 * log (V1/V2)

abyss · January 30, 2024, 11:17pm

@Robin, just to be clear for everyone, I think you mean V1^2/V2^2 and not the square of the log. This makes no sense to me:

Yep, and since the rms_sum value is always positive it can be even taken out of the sqrt() or the log()

RMS is the procedure in reverse order: (1) sum, (2) divide by the count, (3) take the square root. Sample values are summed first and can’t be taken out. The final RMS voltage can be manipulated, I suppose. I am just expressing my understanding, not giving a definitive opinion. I don’t have one and don’t wish to do the research and study to get one. Maybe my comments are useful, but they come with no warranty.

x42 · January 30, 2024, 11:20pm

indeed. I should have added a bracket 10 log ( (V1/V2)^2 )

(though log has a lower precedence and square of the log would be notated log^2 (…))

abyss · January 30, 2024, 11:27pm

Ah. I needed the baby steps, Robin: (X * X)/(Y * Y) = (X/Y)*(X/Y).

ccaudle · January 30, 2024, 11:38pm

No, the audio measurement standard AES-17 is explicit that dB FS is defined for amplitude, and is defined as 20log(signal_rms/full_scale_sine_rms).

Any measurement given in dB is a measurement of the signal relative to a reference value. The reference value for full scale in this case is a sine wave which just reaches full scale. As lherg pointed out the RMS value of a sine wave is 1/sqrt(2). Dividing by 1/sqrt(2) is of course the same as multiplying by sqrt(2), so you can re-write as 20log(signal_rms*sqrt2).

The factor of 2 pointed out in the first post is just algebraic simplification of where the “divide by 1/sqrt(2)” factor is used in the calculations.

x42 · January 30, 2024, 11:38pm

Now for the practical part.

One advantage of moving this factor of 2 inside the log as sqrt(2) is that the (sum/rms_buffer_size) will also have a range of [0…1]. That value can be easily and reliably transmitted (no -inf for silence).

Furthermore a common function coefficient_to_db(v) → 20 log(v) can be used throughout the codebase.

x42 · January 30, 2024, 11:49pm

you guys are saying exactly the same thing

ccaudle · January 31, 2024, 12:34am

Yes, but I find the derivation of the 10log(power) vs 20log(amplitude) obscures the point that dB values always have a reference, and in this case the reference is explicitly the RMS of the amplitude of a full scale sine, so you need to divide the signal value by the reference sine value and take the log of that.

x42 · January 31, 2024, 1:19am

One of us has this backwards

From a physics point of view, the reference is signal power ratio, and only by applying Ohm’s law can you derive the log of squared voltage ratios (if R is constant). This way you actually calculate the RMS of a sine wave to be 1/sqrt(2).

And only then can the AES come along and tell you to put sqrt(2) in there to normalize it.

In other cases however it’s not as simple. e.g. for LUFS calculation you have to actually integrate and cannot just move a constant normalization factor into the log(). In LUFS loudness DSP you’ll find 10 log (…) for that reason (page 3 https://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1770-5-202311-I!!PDF-E.pdf).

–

All that being said, for simple RMS, I agree that from an engineering point of view the AES definition is easier to grasp for casual people implementing it.

abyss · January 31, 2024, 3:18am

I will stoke the flames of confusion.

for (unsigned int i = 0; i < rms_buffer_size; ++i) {
    sum += 2.0 * rms_buffer[i] * rms_buffer[i];
}
rms = sqrt(sum/rms_buffer_size);
rms_db = 20*log10(rms);

So let’s consider a mathematical transformation (if I did it right):

for (unsigned int i = 0; i < rms_buffer_size; ++i) {
    true_sum += rms_buffer[i] * rms_buffer[i];
}
rms = sqrt(2) * sqrt(true_sum/rms_buffer_size);
rms_db = 20*log10(rms);

What is the sqrt(2) for? I am not familiar with AES-17, but even if I were, does it make clear what the sqrt(2) is for? As Robin said, I believe, it is a factor one can apply TO A SINE WAVE’s maximum amplitude to get the RMS amplitude.

Now look at the algorithm. Are we sampling a sine wave? No, we are not. If we have a sine wave, we use exact math in the style of integration of calculus. Audio signals are much more irregular. I don’t think one may ‘normalize’ or convert from maximum amplitude to RMS amplitude with the constant sqrt(2) when one is sampling some generic audio signal.

Others may be a by-the-book person if they wish. I’ve read enough books and don’t read academic stuff any more than necessary. The question as I understand it was one of understanding the why of how it works or would work correctly, not in how to follow an algorithm given from on high. I realize all too well I am the oddball that way. I am not going to assess AES anything, so of course my opinion is just some questions that I think are relevant to a solid understanding of the original question. Good luck to everyone implementing audio signal processing.

But now I realize something. This from Chris is interesting:

No, the audio measurement standard AES-17 is explicit that dB FS is defined for amplitude, and is defined as 20log(signal_rms/full_scale_sine_rms).

What if we let full_scale_sine_rms = 1 / sqrt(2) ?
Then we get 20log(sqrt(2)*signal_rms) or:

rms = sqrt(2) * sqrt(true_sum/rms_buffer_size);
rms_db = 20*log10(rms);

READ THIS, lherg:
It would seem the repetitive ‘2’ is the unclear forward reference of the reference RMS amplitude of a maximum sized sine wave.

ccaudle · January 31, 2024, 4:09am

A value in dB is not absolute, it is the logarithm of the ratio of a measured value to a reference value.
In the case of FS measurement, the “FS” refers to full-scale digital, and dB FS is defined to be the logarithm of the ratio of the RMS signal value to the RMS value of a sine wave whose peak amplitude equals the maximum digital value (the maximum numeric value will vary depending on how many bits the audio samples contain).

Not quite. When you calculate the RMS value of a sine wave with peak amplitude of +1 and -1 the value is 1/sqrt(2).
That is the reference value, so when you calculate the ratio of the measured signal RMS value to the reference sine wave RMS value you calculate signal_rms/(1/sqrt(2)).
When you simplify that fraction you would write it as signal_rms*sqrt(2).

You are not converting max amplitude to RMS of the audio signal, you are calculating the ratio of the signal RMS value to the RMS value of a maximum amplitude sine wave.

When you calculate dB values it is always a comparison to a reference value. dB FS happens to use the RMS amplitude of a full scale sine wave as the reference, dBm uses 1 mW as the reference, dBu uses the voltage which dissipates 1 mW into 600 ohms as the reference, dB SPL is the sound pressure level referenced to 20 micropascals. There always has to be a reference value against which you are measuring.

Well, yes, that is explicitly what I wrote in my previous comment.

abyss · January 31, 2024, 9:36am

2^1/2 is indeed a magical number of the RMS gods. RMS is an energy field created by all living digital audio sources. It surrounds us and penetrates us; it binds the DAW galaxy together. Those who can channel that energy are luminous beings, not crude matter such as myself. Its high priest knows the holy specs and vouchsafes to provide many corrective words with bureaucratic authority. His shoe’s latchet I am not worthy to unloose any more than my little mind can deliberate upon the magic of 2^1/2. Socrates was a devil! I can only revere the grand ordering principles of the galaxy that passeth all my understanding. Who am I to suppose causality by reason and context?

seablade · January 31, 2024, 11:37am

See my comment very early on in the thread:

What you are discussing are ways to get an approximation of RMS, when you can’t really calculate RMS of a non-repetitive waveform easily. As I mentioned before, RMS is a bit of a BS term when applied to complex waveforms, and is used as a marketing term by some audio manufacturers as well (it was used for speaker ratings for a long time, for instance). Thankfully there is a bit of a move away from this in larger format sound systems, as there were too many unknown assumptions in how such a number was calculated, but I still see it pop up.

    Seablade

lherg · January 31, 2024, 5:11pm

Thanks for your comments. I removed the factor of two from my loop to reduce algorithmic complexity. Why carry out several multiplications when just one is enough?

  for (unsigned int i = 0; i < rms_buffer_size; ++i) {
    sum+= rms_buffer[i]*rms_buffer[i];
  }
  rms_dB_FS = 20*log10(sqrt(sum/rms_buffer_size) * sqrt(2.0));

lherg · January 31, 2024, 5:39pm

Seablade, these RMS calculation algorithms are however used in most audio software.

Even if this does not represent a physical reality, they are very practical and cheap in CPU terms for getting an overview of the acoustic power of the signal (the ears are quadratic sensors). Doing an FFT, which is done in LUFS, unless I am mistaken, is certainly interesting in the mastering phase, but for monitoring multiple inputs it does not seem necessary to me and is above all heavy in terms of calculation. The display of peaks and RMS value (even with an imperfect calculation) is enough for me to see if the signal requires compression, if I need to increase the gain of my preamps…

All that remains are tools that must be understood in order to master them well. In analog the meters are not perfect either: if there is a DC component, it is filtered out by the capacitors and you therefore do not see it in your display. Sometimes you have to use your ears to mix!!

seablade · January 31, 2024, 5:52pm

Correct, my point was that in some cases there may not be a well-defined reason so much as that it is ‘most reflective’ of what is expected.

Seablade