FLAC, OGG, MP3, CDs - setting recording, mixing, compression and printing levels

There are some concepts in there I believe you are confusing. This may get somewhat technical, but hopefully I won’t end up typing an entire book in itself…

Normalization in an ideal digital audio system applies only gain. That means you are mathematically increasing every value in a 16-, 24-, 32-bit, etc. signal by a set amount. In floating-point systems your range of values will typically be between -1.0 and 1.0. If your audio signal peaks at .5 and -.5 and you normalize it, you are taking the entire signal and increasing all of it by an identical amount, so that the points in time that previously reached .5 or -.5 now reach 1.0 and -1.0 (simplifying slightly). The side effect is that you are also increasing the quietest parts of the signal, which, assuming you are talking about real-world recordings (and even if you aren’t, in many cases), will NOT be equal to 0, but instead minutely above or below zero. In raising the level of the entire signal, you also raise that minute level, otherwise known as the noise floor.
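To make that concrete, here is a minimal Python sketch of peak normalization; the signal, noise level, and sample rate are invented purely for illustration:

```python
import numpy as np

# Toy "recording": a sine peaking at 0.5 plus a tiny noise floor.
rng = np.random.default_rng(0)
signal = 0.5 * np.sin(2 * np.pi * 440 * np.arange(48000) / 48000)
noise = 1e-3 * rng.standard_normal(48000)       # roughly -60 dBFS noise floor
recording = signal + noise

# Peak normalization: one gain value applied to every sample.
gain = 1.0 / np.max(np.abs(recording))
normalized = recording * gain

peak_db = lambda x: 20 * np.log10(np.max(np.abs(x)))
print(f"gain applied: {gain:.4f}")
print(f"peak before:  {peak_db(recording):6.2f} dBFS")
print(f"peak after:   {peak_db(normalized):6.2f} dBFS")
# The noise floor comes up by exactly the same 20*log10(gain) dB,
# so the signal-to-noise ratio is unchanged.
print(f"noise floor rises by {20 * np.log10(gain):.2f} dB")
```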

In a 16-bit signal, that noise floor can only be so quiet, so normalizing can create a very noticeable increase in the noise. This is why, when recording at 16 bit, you need to make sure your levels peak just below full scale. In a 24-bit signal the quietest values can be much better defined and much quieter, so normalizing does not raise the noise floor as noticeably, which lets people get more usable signal out of a take recorded with more headroom. I really don’t want to dwell on this topic as much as I probably should, as it is a conversation in itself.
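As a rough back-of-the-envelope illustration of why the bit depth matters (the 20 dB of recording headroom is an assumed example figure, and 6.02 dB per bit is the usual approximation):

```python
# Approximate quantization noise floor for an N-bit fixed-point signal:
# about 6.02 * N dB below full scale.
def floor_dbfs(bits):
    return -6.02 * bits

headroom_db = 20   # assume the take peaked 20 dB below full scale

for bits in (16, 24):
    floor = floor_dbfs(bits)
    print(f"{bits}-bit: floor ~{floor:.0f} dBFS, "
          f"after normalizing up {headroom_db} dB -> ~{floor + headroom_db:.0f} dBFS")
# 16-bit: a ~-96 dBFS floor rises to ~-76 dBFS (potentially noticeable);
# 24-bit: a ~-144 dBFS floor rises to ~-124 dBFS (still far below audibility).
```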

So when people say normalization adds to the noise, they are technically correct in that it will raise the noise floor. However, the ratio of signal to noise will not change, and because of that no additional noise or distortion is actually introduced in the process.

Now all this aside…

When you algorithmically generate or modify sound, you can end up with values that cannot be represented precisely at a fixed bit depth. For instance, multiply .12 by 1.2 and you end up with .144, which needs more precision than either of those values. This tends to be the biggest issue with processing more involved than a simple multiplication, e.g. reverb and other DSP. You can always increase the bit depth of the saved value, and in fact many systems use a much larger data path internally than the source would traditionally need, Ardour included (digital audio only gets recorded at about 20 usable bits by the best A/D converters to my knowledge; Ardour IIRC uses a 32-bit floating-point data path in its mix engine, and Harrison I believe uses 64-bit data paths in its large-format consoles). This allows a tremendous amount of precision for the results to grow into. However, you can still lose a slight (literally inaudible) amount of information depending on the processing. So yes, in those cases you will suffer a slight degradation of the signal IF you are dealing with a destructive process that does not provide an increased bit depth to hold the additional needed precision. Otherwise you will always have the untouched source signal.
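A small illustration of that precision point, using a toy 16-bit quantizer standing in for a destructive process (the function and values here are mine, not Ardour’s):

```python
import numpy as np

def quantize(x, bits=16):
    """Round to the nearest step of a signed fixed-point grid in [-1, 1)."""
    step = 1.0 / (2 ** (bits - 1))
    return np.round(x / step) * step

a, b = 0.12, 1.2
exact = a * b                       # 0.144 needs more digits than either factor
stored = quantize(exact, bits=16)   # a destructive process would keep only this
print(exact, stored, abs(exact - stored))
# The difference is tiny (well below the 16-bit noise floor), but it is a real
# loss -- unless the system widens the data path (e.g. 32/64-bit float) first.
```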

OK, all this said, let’s look at Ardour and why this really isn’t an issue in a properly designed and implemented non-destructive workflow. When you normalize in Ardour, the audio file is analyzed and the gain needed to normalize it is stored in the region description in the session file. For instance, one value from a normalized region I have right now is 1.8104865551. That value is then combined with the gain from the fader, the region gain, and any other gain applied in that way, to come up with the total gain applied to the region at any given point in time: a single gain operation is applied to the audio, not several. Thus normalization, even in the extremely minute cases mentioned above, does not actually affect the audio at all by itself; it is combined with the other gain stages before the audio data is modified at all.
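Here is a rough sketch of that “combine the gains first, multiply once” idea; the variable names and sample values are hypothetical, not Ardour’s actual code or API:

```python
import numpy as np

# Hypothetical gain factors, as linear multipliers.
normalization_gain = 1.8104865551   # stored in the session, not baked into the file
region_gain = 0.8
fader_gain = 0.5

samples = np.array([0.1, -0.25, 0.4], dtype=np.float32)   # untouched source audio

# Non-destructive approach: combine the factors, then do ONE multiply at playback.
total_gain = normalization_gain * region_gain * fader_gain
out = samples * np.float32(total_gain)

print(total_gain, out)
# The source samples on disk never change; only the single combined gain does.
```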

So… how much of that makes sense? :wink:

     Seablade

I appreciate the very informative replies. Yes, Seablade, your post makes sense. :slight_smile:

And of course while I was typing, you get two other very qualified answers on the topic ;)

 Seablade
Here’s a little number theory. 0.04523000000 has as much absolute precision as 1.86000000000 … and the result, 0.08412780000, has exactly the same absolute precision (using the decimal analogy that you started). For exactly that answer, there was no round-off error if the storage mechanism was the 11-decimal-place system I just proposed. That is how such numbers are stored in the computer. The real problem, as Paul pointed out, is that the intermediate result of a long string of operations is what really needs the larger bit depth of storage. The Pentium class of processors uses a really big accumulator for some of its floating point, so Ardour, Cubase, ProTools, Logic, all of the plug-in developers, etc. take advantage of that before they store back out to a newly generated intermediate track.

You are correct; however, you also didn’t demonstrate any floating-point operations that would require more precision, which, as you and Paul both stated, tends to happen with multiple operations stacked on top of each other. That is ALL I was trying to explain. Now, if you want to be technical, in none of the cases you mentioned do the numbers NEED as much precision as 11 decimal places in a base-10 system. In a base-2 system, which is what computers work with, they still don’t need as much as you gave, but that isn’t the point. My point was entirely to give an example of how a destructive normalization process COULD cause a loss of precision, in response to the poster directly above my post. If you read my points above you would see I was not saying it would be noticeable, and in fact in proper non-destructive systems (such as Ardour) it isn’t an issue at all.

The reason that @kvk’s mix changes radically between listening venues … both Fletcher-Munson and patterning interference are IMHO the most likely culprits.

There are many possible reasons, none of which have anything to do with the topic of normalization degrading signal quality, which was all I was focused on, and very few I could even guess at without knowing a lot more about the reproduction systems and acoustics in question. I could name likely candidates of course (and yours are quite possible, but I would add room acoustics as a major contributor as well), but giving an accurate answer would require much more information.

   Seablade

You guys are so silly…

Normalize amplifies all the samples equally; it is a simple find-the-peak, what’s-it-take-to-get-it-to-100%, use-that-multiplier-for-all-sample-points operation.

Guys, it’s like:
2 times 2 = 4
2 times 3 = 6

The ratio is still the same; that is why the dynamic range is preserved and not affected. Want your old 2 and 3 back? Cut the level by 50%.

From Adobe, and a very smart CoolEdit author:

Now when we choose the “normalize” effect, the software looks for the loudest point of the waveform, and then raises (or lowers) the amplitude of the entire waveform until the volume reaches a particular percentage of the clipping point. Every point in the waveform is amplified equally, so that the original dynamics of the piece are preserved. The default figure of 98% is a typical percentage figure for normalization.
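In code, that 98% target just means scaling the peak to 0.98 of full scale instead of 1.0; a minimal sketch (the sample values are invented):

```python
import numpy as np

def normalize(x, target=0.98):
    """Scale so the largest absolute sample lands at `target` of full scale."""
    peak = np.max(np.abs(x))
    return x * (target / peak) if peak > 0 else x

x = np.array([0.05, -0.30, 0.22])
y = normalize(x)
print(np.max(np.abs(y)))   # 0.98
# Every sample got the same multiplier, so the relative dynamics are untouched.
```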

Sure, the levels from track to track are lost and the noise comes up, but isn’t it the role of the mixing and mastering processes to define what those levels should really be, relatively speaking? YES, you want your quiet passage preserved, so just back the fader down for that track.

In one case, you’re pushing the level up because it is too low - no normalization.
In the other, you’re backing the fader/level down because it’s too high - attenuation.

It’s a matter of preference, and experience. If you have a bad experience, then that tends to affect your approach to workflow.

Don’t be ascared of the big bad gain, it’s your friend, or can be if you treat it right.

Hey, after all… you’re gonna pump it through some non-linear process to “normalize by ear” to a final master using discrete frequency bands of compression/expansion. That is why Katz gets paid the big money: his ability to hear how to make the final work the best it can be, in his mind, is why artists are drawn to his way of thinking (sonically, not technically).

Compression, now there is a word that carries more fear in my mind than linear transforms, and requires the most work to get it right…

sha!

Zzeon…

Multiply .04523 by 1.86.

You get .0841278, a number that requires more bits to store precisely than either of the original numbers. You only have finite storage in a 32-bit float, and the same is true of 64-bit. So eventually, in some math, you will get a result that requires more precision than you have available. That is what they were referring to when talking about normalization being a process that removes information, and it is the basis of Bob Katz’s statements. Your post ignores this completely.
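You can see the rounding directly; a small, hedged demonstration (nothing Ardour-specific, just NumPy and Python’s decimal module):

```python
import numpy as np
from decimal import Decimal

a32, b32 = np.float32(0.04523), np.float32(1.86)
product32 = a32 * b32                            # rounded to the nearest float32

exact = Decimal("0.04523") * Decimal("1.86")     # exactly 0.0841278 in decimal
error = abs(Decimal(float(product32)) - exact)
print(product32, exact, error)
# The float32 result differs from 0.0841278 by within a few parts in ten
# million -- harmless on its own, but a destructive process keeps only the
# rounded value.
```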

However, as I pointed out above, normalization as done in Ardour does not suffer to the same degree as it would via a destructive process. And in either case the actual difference is VERY minimal.

   Seablade

@seablade

Here’s a little number theory. 0.04523000000 has as much absolute precision as 1.86000000000 … and the result, 0.08412780000, has exactly the same absolute precision (using the decimal analogy that you started). For exactly that answer, there was no round-off error if the storage mechanism was the 11-decimal-place system I just proposed. That is how such numbers are stored in the computer. The real problem, as Paul pointed out, is that the intermediate result of a long string of operations is what really needs the larger bit depth of storage. The Pentium class of processors uses a really big accumulator for some of its floating point, so Ardour, Cubase, ProTools, Logic, all of the plug-in developers, etc. take advantage of that before they store back out to a newly generated intermediate track.
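A sketch of that “wide accumulator” idea, assuming 100 invented gain stages: round to single precision after every step versus keeping the intermediate math in double precision and rounding once at the end:

```python
import numpy as np

rng = np.random.default_rng(1)
gains = rng.uniform(0.9, 1.1, size=100)   # 100 stacked gain stages (made up)
x = 0.12345678

# Narrow path: round to float32 after every single operation.
narrow = np.float32(x)
for g in gains:
    narrow = narrow * np.float32(g)

# Wide path: keep intermediates in float64, round to float32 once at the end.
wide = x
for g in gains:
    wide = wide * g
wide = np.float32(wide)

print(narrow, wide, abs(float(narrow) - float(wide)))
# Both errors are tiny, but the wide path typically accumulates far less
# rounding noise, since it rounds once instead of a hundred times.
```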

Really, the lost precision in a single-precision floating-point system is that the mantissa has only 24 bits, which puts the quantization noise roughly 144 dB below a 0 dBFS signal. So the processing noise comes up by these nano-scale amounts with each and every multiply. Multiplies happen in any mix, and rendering to a master-ready set of tracks has to do them whether you use non-destructive or destructive editing tools. It’s literally nano-bits of noise addition, so it isn’t heard.
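Roughly, in numbers (a back-of-the-envelope estimate that assumes each multiply contributes an independent rounding error of about one part in 2^24):

```python
import math

mantissa_bits = 24   # effective mantissa of IEEE 754 single precision
per_op_floor = 20 * math.log10(2 ** -mantissa_bits)   # ~ -144.5 dB per rounding
print(f"per multiply: {per_op_floor:.1f} dB")

for n_ops in (1, 100, 10000):
    # Independent errors add in power, i.e. 10*log10(N) dB of growth.
    print(f"after {n_ops:5d} ops: {per_op_floor + 10 * math.log10(n_ops):7.1f} dBFS")
# Even after 10,000 roundings the floor sits around -105 dBFS -- far below
# the noise of any microphone, preamp, or listening room.
```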

The only hearable (my word) processing noise comes from the tweaky filters we love to put in. They turn out to subtract two numbers from each other which are very close in value. The result is left with only a small number of valid mantissa bits, and those lost bits can’t be recovered when the floating-point hardware scales the mantissa back up to full range. When I say tweaky, I mean a radically high-Q filter. The gentle filtering of a brightness/darkness or similar correction doesn’t add this kind of noise. A high-Q filter has really obvious noise generation, and it is frequency-shaped on top of that; it isn’t white.
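That cancellation is easy to demonstrate; a generic float32 illustration, not taken from any particular filter:

```python
import numpy as np

a = np.float32(1.0000001)
b = np.float32(1.0000000)

diff = a - b                 # the leading mantissa bits cancel
print(diff)                  # ~1.1920929e-07, not exactly 1e-07

# Only a handful of significant bits survive the subtraction; the relative
# error of the result is enormous compared to the inputs, and no later
# re-scaling of the mantissa can bring the lost bits back.
print(abs(float(diff) - 1e-7) / 1e-7)   # roughly 19% relative error
```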

As a matter of course, I normalize all tracks immediately after capturing, then set levels in the mix. Although when I have done a good job of level-setting during capture, there are sometimes a few peaks that went over 0 dBFS, and the normalizer gives up.

Now for the punch-line.

The reason that @kvk’s mix changes radically between listening venues … both Fletcher-Munson and patterning interference are IMHO the most likely culprits.

The apparent loudness of program material with different frequency content will vary dramatically because of the non-linear loudness perception our ears impose. The mix in the monitors will not sound the same as the mix in the car. The only way to fix this is to make your engineering monitor room shaped like a car, with motor and highway noise to boot. Or learn what gets boosted and ducked in the car compared to the studio.

Patterning interference is a nasty problem of multiple sources. You mentioned that you have two sets of monitor speakers. If they aren’t very carefully managed surround channels, then the two lefts will interfere and create frequency nulls that change as you move around the room (likewise, of course, the two rights). Get rid of the second set of speakers (keep the Rokits!) and you will have better listening in the monitors.
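For intuition, here is a rough sketch of where the interference nulls land for a given path-length difference between the two “left” sources (the 0.5 m figure is an invented example):

```python
# Destructive interference occurs where the path-length difference between the
# two left speakers equals an odd number of half wavelengths.
SPEED_OF_SOUND = 343.0   # m/s, at roughly room temperature

def null_frequencies(path_diff_m, count=5):
    return [(2 * k + 1) * SPEED_OF_SOUND / (2 * path_diff_m) for k in range(count)]

print([round(f) for f in null_frequencies(0.5)])
# -> [343, 1029, 1715, 2401, 3087] Hz for a 0.5 m difference; move your head
# and the path difference changes, so the nulls slide around the spectrum.
```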

Finally, I personally agree with the stereo sourcing for a fatter, sweeter sound. I record classical guitar and have found a stereo arrangement for two source positions to give me the best ability to image the sound.

I would say, however, that for a layered, highly-tracked, dubbed, produced-till-it-screams session this technique will get lost in the juicy sauce being cooked. Mono source material is MUCH easier to keep clean while still providing panning. There are some really great panning-reverb plugins that make image placement downright fun from a mono source.

my 2 ¢.

b.t.w. Paul, 1/3 is quite by definition rational … sqrt(2) is irrational.

catraeus

Right on @seablade. Thanks for the corrections and expansions.

The list of possible causes of the original problem is really long, and the number-theory issues are both many and arcane and not @kvk’s problem.

My point is that normalization and mixing in general are not a sonic problem in any modern DAW. The only problems I ever see with sonics due to number systems come from tight filtering or other radical processing. Even after 100 destructive operations in IEEE 32-bit float, the number system alone only puts the noise floor at around -124 dBFS. Starting off with normalized inputs, healthy record levels, etc. keeps that from ever becoming audible.

Finally, won’t the 32-float vs. 32-int wars be just as fun as the analog vs. digital wars? 8-o

Catraeus