Optimize for YouTube playback?


I do video for small businesses and non-profits. Most clients want their videos posted to YouTube. I have two main audio sources, a shotgun and a lav mic. I also tend to add background music or field sound effects (for example, if in a restaurant I will capture audio of the food cooking). How would you mix the two primary audio sources, and what effects plugins would you run them through to optimize for YouTube playback? I would like an answer for 1) Linux environs and 2) Windows. I run Ardour on both platforms.


Hate to say it but do you have a year?

Your question is rather broad. ‘Optimizing’ for YouTube playback is either a misnomer misapplied by many people, or refers to a specific style of mastering that supposedly optimizes the final mix for encoding to a lossy format. The latter is still not a common practice to my understanding, and would need a discussion of how lossy formats work, which audio areas the particular formats YouTube uses focus on, etc. Very detailed stuff; it is far more likely that the term is being misapplied.

Along with this, when talking about plugins, it again depends on the nature of the beast. I would suspect EQ and compression in some form to start with, but what else you might need, for instance to clean up audio or create SFX, is an entirely different topic.


@Seablade Forgive my ignorance here, but my impression is that you want to upload to YouTube at high resolution and then let them handle the compression. You will have many viewers/listeners playing across a multitude of devices, from computers to tablets to phones. A mix has to be done to satisfy as many as possible. One person showed me a way with Adobe Audition, using a parametric EQ. This guy knew what he was doing and did a great job. He did something called ‘sweeping the Q’, where he scanned across a range until he heard a whistling noise and then lowered a narrow band. I do not understand it fully. He also chopped off a chunk of the low end, which to me was counter-intuitive until he explained that too much low end leads to distortion. His mixes sounded great on YouTube. I no longer have Adobe Audition, but figure there are other tools that can do the job.

Yeah, I got a year.

Oh yeah, compression. Not to confuse the two: one is dynamic-range compression for an optimised listening experience, the other is YouTube's data compression for smooth playback across devices and resolutions. There are loads of videos on compression for dialogue with the likes of Reaper, Audition, iZotope and so on, but woefully few on the Linux side of things, where we have the Calf and x42 plugins for starters.

The last time I investigated this, the main points were:

  • Make the sound as good quality as possible to start with.
  • Make the video as high resolution as you can.
    YouTube will process your sound, but one thing it does is adjust the audio quality and bandwidth (i.e. data compression) to match the video's quality and resolution. So if you only upload 640x480 video you'll only get mediocre sound, but at 720p or 1080p you'll get better sound, provided you upload good sound with it.

There are guidelines on the YouTube site, and I know the rules used to keep changing, but I don’t know if that’s been the case recently.

I wouldn’t advise messing with EQ unless the sound has an audible frequency spectrum problem that you can fix.


Yeah, so what you are describing is kind of the ‘misapplied’ terminology, mixed with a bit of confusion between mastering and EQ.

So pretty much what you are looking for is to export at as high a quality as you can, which for audio Ardour does by default (WAV is completely uncompressed audio, for instance). For video it is a bit more complex, as it is unusual to have uncompressed audio muxed into a video file these days, but not completely out of the question either. The end result is the same: higher-bitrate, higher-resolution exports (so as high a resolution as the source footage allows, and as high a bitrate as you could reasonably upload), and then upload that high-quality file to YouTube or any other service.

Sweeping the filter (not really sweeping the Q, as any high-Q filter is going to have some resonance) is a typical trick I teach my students, especially when they are starting off and haven't yet trained their ears to hear and identify frequencies. Generally the process is to raise the gain on an EQ band to +15 dB or so, sweep the frequency until it sounds as bad as possible, then lower the gain to cut that frequency, being careful not to overdo it and destroy the sound further (this takes practice and time, as you have to listen while you do it).
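To make the sweep trick concrete, here is a minimal numpy sketch (illustrative values only, not any particular plugin) of the boost stage: a peaking (bell) EQ biquad built from the widely used RBJ cookbook formulas, set to the narrow +15 dB boost you slide across the spectrum before turning it into a cut.

```python
import numpy as np

def peaking_eq(f0, gain_db, q, fs):
    """Peaking (bell) EQ biquad coefficients, RBJ cookbook formulas."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 + alpha * amp, -2 * np.cos(w0), 1 - alpha * amp])
    a = np.array([1 + alpha / amp, -2 * np.cos(w0), 1 - alpha / amp])
    return b / a[0], a / a[0]

def db_at(b, a, f, fs):
    """Magnitude response of the biquad in dB at frequency f."""
    z = np.exp(-1j * 2 * np.pi * f / fs)
    h = (b[0] + b[1] * z + b[2] * z ** 2) / (a[0] + a[1] * z + a[2] * z ** 2)
    return 20 * np.log10(abs(h))

# The "sweep" stage: a narrow +15 dB boost parked at 2.5 kHz for now
b, a = peaking_eq(f0=2500, gain_db=15, q=8, fs=48000)
boost = db_at(b, a, 2500, 48000)   # exactly +15 dB at the swept band
flat = db_at(b, a, 200, 48000)     # essentially 0 dB well away from it
```

Once the offending frequency is found, you would flip `gain_db` negative (and usually lower `q` a little) for the actual cut.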

Too much of any frequency can lead to distortion or an unbalanced mix. Many lossy compression formats also remove LF content, as few playback systems can reliably reproduce as low as we generally mix for, so it doesn't really do a lot for many listeners.

On the topic of compression and the videos you mentioned, the same principles apply. Most compressors have a ‘standard’ set of controls affecting Threshold, Ratio, Attack, and Release times. If you watch the videos and pay attention to those, then attempt the same settings with the plugins you have access to, many of the same principles will apply. It won't sound exactly identical, but it will likely be close enough that you can start to learn those principles and what to listen for.
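As a sketch of how those four controls interact, here is a toy feed-forward compressor in numpy (assumed parameter values, not a model of any specific plugin): the detector smooths the rectified signal with separate attack and release coefficients, and the threshold/ratio pair then decides the gain reduction.

```python
import numpy as np

def compress(x, fs, threshold_db=-20.0, ratio=4.0,
             attack_ms=5.0, release_ms=100.0):
    """Toy feed-forward compressor: envelope detector + static gain curve."""
    a_att = np.exp(-1.0 / (fs * attack_ms / 1000.0))
    a_rel = np.exp(-1.0 / (fs * release_ms / 1000.0))
    env = 0.0
    out = np.empty_like(x)
    for n, s in enumerate(x):
        level = abs(s)
        # Attack coefficient while the level rises, release while it falls
        coeff = a_att if level > env else a_rel
        env = coeff * env + (1 - coeff) * level
        level_db = 20 * np.log10(max(env, 1e-9))
        over = level_db - threshold_db
        # Above threshold, output level rises at only 1/ratio the input rate
        gain_db = -over * (1 - 1 / ratio) if over > 0 else 0.0
        out[n] = s * 10 ** (gain_db / 20)
    return out

fs = 48000
loud = 0.9 * np.sin(2 * np.pi * 440 * np.arange(fs) / fs)  # ~-1 dBFS tone
squeezed = compress(loud, fs)  # settles roughly 14 dB quieter
```

The same four knobs in the Calf or x42 compressors behave the same way, so the videos transfer even if the GUIs differ.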

One of these days I need to start a video series on mixing with Ardour and Mixbus for beginners, based on what I teach in my classes. I need the time to do it though, which means I need to make money so I can live, which in turn eats the time to do it… a fun circle I suppose.


@anahata If I am not mistaken, different people speak in different ways. Some speak louder or quieter than others, and you have to boost gain for some and attenuate for others. You want to add multi-band compression and/or normalisation, but all this at the very end. One engineer says he always goes straight to the parametric equalizer at the start to remove frequencies, and then adds things like compression at the very end. You want consistency among the audio in a video where many people speak.
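As a crude stand-in for that per-speaker levelling step (hypothetical names, and a plain RMS match rather than true perceptual loudness), a numpy sketch that scales each speaker's clip to a common target before any final compression:

```python
import numpy as np

def match_level(clip, target_rms_db=-20.0):
    """Scale a clip so its RMS lands on a common target level."""
    rms = np.sqrt(np.mean(clip ** 2))
    rms_db = 20 * np.log10(max(rms, 1e-9))
    return clip * 10 ** ((target_rms_db - rms_db) / 20)

rng = np.random.default_rng(1)
quiet_speaker = 0.05 * rng.standard_normal(48000)  # the mumbler
loud_speaker = 0.5 * rng.standard_normal(48000)    # the close talker
quiet_matched = match_level(quiet_speaker)
loud_matched = match_level(loud_speaker)
# Both clips now sit at the same average level going into the bus
```

In practice you would do this per region or with fader automation in Ardour, then let a gentle bus compressor catch the remaining variation.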

@Seablade I shoot with a Panasonic GH3 in .MOV format, and I think the audio is 16-bit PCM. One mic goes straight to camera; another mic (recorded at 24-bit WAV) is connected to my Zoom H2, and I mix the two in post. Now, the engineer did as you described, but once he found what he was looking for, he really narrowed that band to a fine point and lowered it. A video on the basics with Ardour would be great, because these are libre versions of software and are accessible. Do you teach your students with the Ardour DAW, or are you compelled to use proprietary software? Thanks for the comments though. I am learning.
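Since you mix the camera track and the H2 track in post, one step worth automating is finding their relative offset. Here is a small numpy sketch of the standard cross-correlation approach (illustrative signal lengths; in practice you would run it on the decoded WAV data):

```python
import numpy as np

def find_offset(reference, other):
    """Return k such that reference[n] ~= other[n - k].
    A negative k means `other` has extra lead-in: trim |k| samples
    from its start to line it up with `reference`."""
    corr = np.correlate(reference, other, mode="full")
    return int(np.argmax(corr)) - (len(other) - 1)

rng = np.random.default_rng(0)
scene = rng.standard_normal(4800)                  # 0.1 s of scene audio at 48 kHz
camera = scene                                     # straight-to-camera track
recorder = np.concatenate([np.zeros(96), scene])   # H2 file with 2 ms of lead-in

offset = find_offset(camera, recorder)  # -96: trim 2 ms off the recorder track
```

You would then nudge the region in Ardour by that many samples. Note that `np.correlate` is O(N·M), so for long takes an FFT-based correlation is the usual upgrade.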

different people speak in different ways. Some speak louder or quieter than others, and you have to boost gain for some and attenuate for others
Agreed, but for me that's part of "make the sound as good quality as possible to start with" - i.e. it's not a YouTube-specific thing but basic mixing skills. You'd do it for radio, video, broadcast, CD, or anything.

You might get better answers if you said you wanted expertise in balancing dialogue and other sounds (compression vs. gain riding, for example) for videos generally. Once you’ve got that right though, it should transfer to YouTube without a problem.

OK, one theoretical problem: it's probably inadvisable to make a video with excessive dynamic range, and that would be worse on YouTube; but if you do, YouTube will compress it anyway.

@anahata There we go, teach me how to ask better questions. “You might get better answers if you said you wanted expertise in balancing dialogue and other sounds (compression vs. gain riding, for example) for videos generally. Once you’ve got that right though, it should transfer to YouTube without a problem.” That is a good one, how would you approach this task via Ardour and attendant plugins?