Calculating RMS in digital audio

Hello everyone,
I’m trying to understand how RMS audio levels are calculated, and I’ve been stuck for several days on how most audio software (Ardour, jkmeters, x42.meters, Adobe Audition) does this calculation. When I calculate the RMS value of a sine wave with a peak at 0 dBFS (max amplitude = 1.0f) myself, I get -3 dB, which is consistent with the definition of RMS: peak / square root of two for a sine wave.

But in Ardour (and in all the software I tested), I get 0 dB. Why?

When I look at the calculation of the rms algorithm of open source software, it looks like this:
Sum of (squares of the amplitude of the samples) multiplied by 2

for (unsigned int i = 0; i < rms_buffer_size; ++i) {
           sum += 2.0 * rms_buffer[i] * rms_buffer[i];
}
rms = sqrt(sum / rms_buffer_size);
rms_db = 20 * log10(rms);

Why double the amplitude of each sample*sample?

Thanks in advance

I don’t know jack about programming… but is it because it’s a stereo signal? That is, the calculation is done on both L and R, so L+R = 2(-3dB) = 0dB?


Hello Saam, thank you for your response. I used a mono signal to do my testing, so I don’t think that’s it.

If I refer to this ITU document, page 9:
"All output signal levels specified in this clause are relative to decibels relative to full scale (dBFS),
where 0 dBFS represents the root mean square (RMS) level of a full-scale sinusoidal signal"

If I understand correctly, a sinusoid at maximum level (max sample = 1) must have an RMS level of 0 dBFS.

So the RMS calculation that I put in my initial post is consistent with this definition, but I don’t understand what we are monitoring, it is no longer the amplitude…


Keeping in mind, I just took a quick glance at that document:

No, it must have a peak level of 0dBFS. In that document (which really only covers the analog IO of mobile telephones for the most part) they are saying that level references will be relative to the RMS value of a sine wave with a peak of 0dBFS. This is not really the same as an RMS value of 0dBFS, which would result in peaks greater than 0dBFS and significant distortion as a result, since 0dBFS is the greatest value you can have without clipping.

By the way, in actuality I believe you cannot calculate the RMS of a complex audio waveform in a simple fashion, as you would need to break down the complex waveform into each of its components, which are constantly changing for audio, and then calculate the RMS of each component’s frequency and add them together. The measurements you see that are “RMS” of a complex waveform are either a bit of BS (as they make some assumptions to get the value) or just completely mislabeled, as some have taken RMS to mean ‘average’. Worth a read here:


Usually that factor would not be factored in the loop, but just once on the result. That saves computation.

Alternatively you can also just factor it out completely and use 10 * log (rms_sum) for signal power (and 20 * log(peak) for amplitude).


As for the factor two: see the definition of dB (Decibel - Wikipedia)

RMS is signal power (dB represents a power ratio), while digital peak is an amplitude (dB is an amplitude ratio).

A factor of two in amplitude is ~6dB, while a factor two of signal power is about 3dB.


Thank you for your answers. I found an interesting thread here.

So I’m going to correct my algorithm like this:

for (unsigned int i = 0; i < rms_buffer_size; ++i) {
           sum += rms_buffer[i] * rms_buffer[i];
}
rms = sqrt(sum / rms_buffer_size);
rms_dbfs = 20 * log10(rms * sqrt(2.0));

= 10 * log10 (rms * 2)

(see above Calculating RMS in digital audio - #5 by x42)

I haven’t looked at my logarithms for a long time but it seems to me that:
Hard for the end of the weekend… I’m starting to have a headache!!! :wink:

The output of my program for a sine wave of maximum amplitude :

20*log10(rms * sqrt(2.0)) : -0,041925
10*log10(rms * rms * 2.0) : -0,041925
10*log10(rms * 2.0) : 1,484188

Thanks for all.

rms = \sqrt{\frac{1}{n}\sum_{n}{sig_n^2}}

and then convert to dB by taking the log10() of it.
For signal power relative to full scale the convention is 10 * log10(rms)

PS. Note that this is identical to the calculation in very first post; just the factor of two is moved outside: log(sqrt(2 * x)) = 0.5 * log(sqrt(x))

Thanks, I think I’ll have to learn the conventions used by audio developers…

It’s not the same:

ln (x^n) == n ln (x) , and here n = 1/2
for base 10, log(x^n) = n log (x) / log(10), but log(10) is one.

I think that when Robin says…

It should be…
log(sqrt(2 * x)) = 0.5 * log(2*x)
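Written out with base-10 logs, the step looks like this. It is only a restatement of the power rule, with x standing for the mean square as in the posts above:

```latex
\log_{10}\sqrt{2x} = \log_{10}\!\left((2x)^{1/2}\right) = \tfrac{1}{2}\log_{10}(2x)
\quad\Longrightarrow\quad
20\log_{10}\sqrt{2x} = 10\log_{10}(2x) = 10\log_{10}(x) + 10\log_{10}(2) \approx 10\log_{10}(x) + 3.01
```

So moving the factor of two outside the logarithm amounts to adding about 3.01 dB to the plain mean-square level.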

I too was confused about RMS. I have a Western philosopher’s mind. This is an extract from my notes from my research.

Audio Signal Power and Loudness

Electrical power = voltage**2 / electrical resistance

Decibels are 1/10 of a Bel, which is a logarithmic
relative proportional difference according to the formula
log10( Power A / Power B)
= [by power proportion] log10( Voltage A ** 2 / Voltage B ** 2 )
= [by log rule transform] 2 * log10( Voltage A / Voltage B ).

Because signal voltage is easy to determine, decibels are typically
calculated as 10 * 2 * log10( Voltage A / Voltage B ).

As discovered by Fourier, all periodic waves can be represented by
combinations of sine waves. Sine waves are essentially atomic waves
as far as I can tell. In theory, if all the sine waves are known, they
can be mathematically evaluated individually and summed or something like that.
It seems a graphical approach on the random digital signal is used, i.e. the
almost calculus is done at the sampling rate of the audio signal.

What is the absolute average voltage of a sine wave?
Circumference = 2 * pi radians (radian arc segment equals radius).
Radians make integral math easiest.
Sine wave equation: y = sin( r * t ), where r is the angular frequency
(it scales time; the maximum amplitude here is 1).
Area under one curve: S[integral] from t=0 to t=pi/r of sin( r * t ) dt.
Derivative of sin(x) is cos(x).
Derivative of sin(rt) is cos(rt) * r
Use substitution. Let u = rt and du = r dt.
Get: S[integral] of sin(u) * 1/r * du = 1/r * S[integral] of sin(u) du
= difference of -cos(rt) / r at each limit
= -cos( pi )/r - -cos( 0 )/r
= 1/r + 1/r = 2/r
Average height = area under one curve / (pi/r) = (2r)/(pi*r) = 2/pi
If the sine wave is scaled by a multiplicative constant, i.e. VMax * sin (rt), then the constant simply moves unchanged out of the integral to get an area under one curve of VMax * 2 / r and an average amplitude/voltage of VMax * 2 / pi.

However, audio engineers do NOT care about average voltage or amplitude. They care about the voltage of average power because power correlates to loudness, which humans perceive on an approximately logarithmic scale. Humans also perceive brightness in an approximately logarithmic scale; hence, gamma encoding and decoding.

Average power is proportional to the voltage squared over the time of
a sine wave, and the voltage of average power is the square root
of the area under the curve of the voltage squared over the time of
one curve of its associated sine wave.

Area under one curve: S[integral] from t=0 to t=pi/r of (sin( r * t ))**2 dt.
Derivative of x ** n = n * x ** (n-1).
Antiderivative of x ** n is 1/(n+1) * x ** (n+1).
Use substitution. Let u = rt and du = r dt.
Get: S[integral] of (sin(u))**2 * 1/r * du.
Use substitution. Let v = sin(u) and dv = cos(u) du = sin(u + pi/2) du.
Get: S[integral] of v**2 * 1/r * 1/sin(u + pi/2) * dv.

However, it is asserted that the voltage of average power = VMax / (2 ** (1/2)). The voltage of average power is called the root-mean-square voltage, which is the graphical method of almost calculus calculation in reverse order. Each voltage sample is squared, a sequence of squared voltages is added and divided by their number to get a mean/average, and that mean is raised to the 1/2 power, i.e. input into the square root function.
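The asserted value can be derived without the second substitution by using the identity sin^2(x) = (1 - cos(2x)) / 2. A sketch, where the symbol omega plays the role of r in the notes above and T is one full period (both are notation introduced here, not from the original notes):

```latex
V_{rms}^2 = \frac{1}{T}\int_0^T V_{max}^2 \sin^2(\omega t)\,dt
          = \frac{V_{max}^2}{T}\int_0^T \frac{1 - \cos(2\omega t)}{2}\,dt
          = \frac{V_{max}^2}{2},
\qquad T = \frac{2\pi}{\omega}
\quad\Longrightarrow\quad
V_{rms} = \frac{V_{max}}{\sqrt{2}}
```

The cosine term integrates to zero over a full period, which leaves only the constant 1/2.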

Loudness Units Full Scale (LUFS) are defined by a sample processing algorithm derived from dBFS that is typed according to the duration of the sample period: Momentary LUFS, Short-Term LUFS, and Integrated LUFS.


My initial question was: where does this factor of 2 come from in the calculation of the RMS value on the dB FS scale. I will therefore stick to endolith’s response in this post cited above.

This definition of dBFS is explicitly designed such that the dBFS value of a full-scale sine wave equals 0 (and in consequence, that of a full-scale square wave is +3 dBFS).
Since the RMS of the full-scale sine wave is 1/sqrt(2), multiplying rms(signal) by sqrt(2) ensures that the formula evaluates to 0 for the full-scale sine wave:
20*log10(rms(signal) * sqrt(2)) = 20*log10((1/sqrt(2)) * sqrt(2)) = 20*log10(1) = 0

It can be found in point 3.4 of this AES document.

It is therefore in the definition of dB FS.


@lherg, You said:

My initial question was: where does this factor of 2 come from in the calculation of the RMS value on the dB FS scale.

I am no expert, but I don’t think the relative decibel scale position matters for the use of the factor of 2 with ‘power’ samples. The critical point is that LOUDNESS is proportional to ELECTRICAL POWER is proportional to VOLTAGE * VOLTAGE, which is the AMPLITUDE * AMPLITUDE.

Bels and Decibels are on a logarithmic scale and relate to relative POWER as POWER A divided by POWER B but input into a log base 10 function to make the relative POWER logarithmic relative POWER.

POWER is not easy to measure, but AMPLITUDE, that is VOLTAGE, is easy to measure. The relative VOLTAGES squared, divided against each other, and converted to a logarithmic scale are the same as relative POWER and LOUDNESS in Bels or Decibels, because the ELECTRICAL RESISTANCE (in Ohms) cancels out when one POWER is divided by another.

The extra multiplication of the squared AMPLITUDE or VOLTAGE is to make the VOLTAGE squared before the division before the log function. But with the sampling of VOLTAGES over time, the VOLTAGE sample is both squared and multiplied by 2. That’s weird in my non-expert opinion.

The sampling of (relative?) POWER over time (like area under the curve in Calculus) requires the sampling of VOLTAGE ** 2 (with constant factor per RESISTANCE not specified, but decibels are relative to some reference POWER value of the same resistance and that should cancel out with any decibel value).


If you draw a sine wave and also draw a horizontal line at VOLTAGE = AVERAGE VOLTAGE, the area under the curve from any 0-VOLTAGE intersect to any 0-VOLTAGE intersect (that horizontal line of zero volts in the middle of the sine wave) and the area under the line of AVERAGE VOLTAGE are the same.

If you draw a modified wave that is the square of the sine wave, you have a power wave. I think that if you draw a horizontal line at RMS VOLTAGE * RMS VOLTAGE, the areas would be equal because the area under the curve represents POWER and LOUDNESS.

The area under a curve is part of the Fundamental Theorem of Calculus. The idea of approximating the area under (or over) a smooth curve with vertical slivers of the same width is not difficult. Make the number of slivers infinite, and the approximation goes to the exact value (if, I suppose, everything is ‘well defined’).

After that, you need to understand what curve. Use the POWER curve not the VOLTAGE curve.

I don’t see a factor of 2 in the integral expression. The factor of 2 is for the Bel calculation between two unsquared VOLTAGES because it is logarithmic.

I am not convinced the times 2 is needed, but what do I know?

for (unsigned int i = 0; i < rms_buffer_size; ++i) {
           sum += 2.0 * rms_buffer[i] * rms_buffer[i];
}
rms = sqrt(sum / rms_buffer_size);
rms_db = 20 * log10(rms);

Why not just multiply the final sum by two? The computation is repetitive. How do you calculate RMS decibels without a reference VOLTAGE in the denominator? Apparently, it’s the value 1 volt or at least 1 (suggests full scale to me, going by what Robin said next). If several open source calculations all include the factor of 2, then I don’t know where the misunderstanding is, but I know the factor of 2 does not fit my understanding of the theory of integral calculus for relative electrical POWER, and the webpage “Calculating the power of a signal”, linked above, is consistent with my understanding. I think it’s spurious, but I could be wrong.


Key is dBFS - decibel relative to Full Scale.

Digital audio signal level is represented as floating point value in the range [-1 … +1] where an absolute value of 1.0 is the max possible amplitude (values above that will be clipped).

This signal level corresponds to Voltage.

Yep, and since the rms_sum value is always positive it can be even taken out of the sqrt() or the log().

Assuming constant impedance:

Power ratio = 10 log (P1/P2) = 10 log (V1/V2)^2 = 20 * log (V1/V2)

@Robin, just to be clear for everyone, I think you mean V1^2/V2^2 and not the square of the log. This makes no sense to me:

Yep, and since the rms_sum value is always positive it can be even taken out of the sqrt() or the log()

RMS is the procedure in reverse order: (1) sum, (2) divide by the count, (3) take the square root. Sample values are summed first and can’t be taken out. The final RMS voltage can be manipulated, I suppose. I am just expressing my understanding, not giving a definitive opinion. I don’t have one and don’t wish to do the research and study to get one. Maybe my comments are useful, but they come with no warranty.


indeed. I should have added a bracket 10 log ( (V1/V2)^2 )

(though log has a lower precedence and square of the log would be notated log^2 (…))