IMO you should just set theory aside for a moment and go all in on practice.
Do the upsampling, listen to the result, and pay attention to what you actually hear; trust your ears on this, not graphs. Use as many monitoring sources as you can, including headphones, and do some A/B comparisons between the two files. Try to put psychoacoustics aside too. What I mean by that is: don’t hear what you want to hear, or what you’ve read about on the web or in other forums. Many forums are full of the weirdest posts, and they can affect how you hear things later, because psychology really plays a huge role.
Maybe there is artifacting or degradation, but the real question is, can YOU hear it?
I think today’s technology is far better than it was 10-20 years ago, when these types of conversions were more of an issue; even some of today’s cheaper audio interfaces and converters outperform some of the high-end digital gear from that era.
is there an advantage to recording at 48kHz if the final product is going to be 44.1?
If you are using any non-linear processing (distortion, compression/limiting), a higher sampling rate will reduce aliasing artifacts.
Distortion is intended to produce harmonics, but if some of the higher harmonics have significant levels in the 22-44kHz range, their aliases will not be harmonically related to the fundamental and won’t sound right.
Obviously you have the same problem with 48kHz sampling, but a harmonic has to be about 4kHz higher to alias down to the same audible frequency.
Similarly, compressors and limiters with very fast attack times produce transients with content extending to very high frequencies, and their aliasing products may be audible; pushing the sampling rate up reduces their level.
The effect is slight, but it’s enough for the makers of some plugins to do oversampling internally to reduce the aliasing.
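To make that aliasing concrete, here’s a rough numpy sketch (my own toy illustration, not any plugin’s actual code; the tone frequency, tanh waveshaper, and drive amount are all arbitrary choices): a 15kHz sine pushed through a distortion stage generates odd harmonics, and at a 44.1kHz rate the 3rd harmonic at 45kHz folds back to an inharmonic 900Hz, while at 96kHz nothing lands there.

```python
import numpy as np

F0 = 15000          # fundamental of the test tone (Hz)
DUR = 1.0           # one second -> 1 Hz FFT bins, so every tone lands on-bin

def level_near(fs, target_hz, bw=50):
    """Peak spectral magnitude within +/- bw Hz of target_hz after distortion."""
    t = np.arange(int(fs * DUR)) / fs
    x = np.tanh(5 * np.sin(2 * np.pi * F0 * t))      # waveshaper -> odd harmonics
    mags = np.abs(np.fft.rfft(x)) / len(x)
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    band = (freqs > target_hz - bw) & (freqs < target_hz + bw)
    return mags[band].max()

# The 3rd harmonic sits at 45 kHz. Sampled at 44.1 kHz it aliases down to
# 45000 - 44100 = 900 Hz, an inharmonic tone; at 96 kHz nothing falls there.
print(level_near(44100, 900))   # substantial alias component
print(level_near(96000, 900))   # essentially numerical noise
```

The same fold-back arithmetic explains why oversampling inside the plugin helps: at the higher internal rate the harmonics sit below Nyquist, and the final decimation filter removes them before they can alias.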
out of curiosity; is there an advantage to recording at 48kHz if the final product is going to be 44.1?
My best guess would be that the math would be more precise in the mixing and summing of tracks (though I get that bit-depth is the real math variable, if you will).
Just curious, and I have a hunch that at least a few people around here know the answer
Edit: Now I’m thinking that it only affects the time (frequency) domain. And how could up-sampling degrade the signal if it’s inserting values that weren’t there to begin with? (I’m assuming the simplest algorithm would just insert a sample value based on something like an average of the neighboring values.) I don’t know much about DSP, but I’d like to understand more.
And how could up-sampling degrade the signal if it’s inserting values that weren’t there to begin with?
Precisely because it’s inserting values that weren’t there to begin with. In simple terms, upsampling is achieved by ‘stuffing’ or padding the existing data with zeros, and then filtering. The design of the filter dictates how close the new values come to the values you would have obtained had the original waveform been sampled at the higher rate. What you are describing, taking the average of two samples, is simple linear interpolation, which performs very badly for audio.
(By taking the average of two known points you are assuming the intervening samples lie on a straight line between them, which in most cases is a naive assumption.)
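Here’s a small numpy sketch of the difference (my own toy example; the rates, tone frequency, and 129-tap filter length are arbitrary choices): 4x upsampling of a near-Nyquist sine done the naive way with straight lines, versus zero-stuffing followed by a windowed-sinc low-pass, with both compared against what sampling at the higher rate would actually have given.

```python
import numpy as np

FS, RATIO, F0 = 8000, 4, 3000     # toy rates; 3 kHz is close to the 4 kHz Nyquist
N = 4000

t_lo = np.arange(N) / FS
t_hi = np.arange(N * RATIO) / (FS * RATIO)
x_lo = np.sin(2 * np.pi * F0 * t_lo)
x_true = np.sin(2 * np.pi * F0 * t_hi)   # ground truth: sampled at the high rate

# Naive approach: straight lines between neighbouring samples.
x_lin = np.interp(t_hi, t_lo, x_lo)

# Zero-stuff, then low-pass filter at the old Nyquist frequency.
stuffed = np.zeros(N * RATIO)
stuffed[::RATIO] = x_lo
k = np.arange(129) - 64                   # 129-tap Hamming-windowed sinc
h = np.sinc(k / RATIO) * np.hamming(129)
h *= RATIO / h.sum()                      # passband gain RATIO restores amplitude
x_sinc = np.convolve(stuffed, h, mode='same')

trim = slice(200, -200)                   # ignore filter edge effects
rms = lambda e: np.sqrt(np.mean(e ** 2))
print(rms(x_lin[trim] - x_true[trim]))    # large error this close to Nyquist
print(rms(x_sinc[trim] - x_true[trim]))   # far smaller
```

Linear interpolation both droops the tone itself and leaves an audible image of it behind, which is why real converters use long filters instead of averaging neighbours.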
Recording at higher sample rates raises the maximum frequency that can be recorded (although since even at 44.1kHz this lies outside the average human hearing range, the actual benefit is a perennial subject of contention).
However, if you are processing the signal e.g. with an EQ, there can be other benefits, for example:
A simple EQ peak filter’s response will tend to 0dB at the Nyquist limit set by the sample rate, which has the effect of ‘cramping’ the response of the filter at high frequencies. The same EQ run at a higher sample rate will obviously not cramp the response as noticeably in the same frequency range. This is one of the reasons some EQs run at higher sample rates internally (although this is a tradeoff, since the upsampling/downsampling involved in the processing adds its own artefacts, as mentioned previously).
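A quick sketch of the cramping effect (my own illustration, using the widely published RBJ audio-EQ-cookbook peaking formulas and arbitrarily chosen settings): the same +6dB peak at 15kHz with Q=1, evaluated at 20kHz, comes out much lower when the filter runs at 44.1kHz, because its upper skirt is forced back to 0dB at 22.05kHz.

```python
import numpy as np

def peaking_eq(fs, f0, gain_db, q):
    """RBJ audio-EQ-cookbook peaking filter coefficients (b, a)."""
    A = 10 ** (gain_db / 40)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A])
    a = np.array([1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
    return b / a[0], a / a[0]

def gain_db_at(b, a, f, fs):
    """Magnitude response in dB at frequency f, evaluated on the unit circle."""
    zi = np.exp(-2j * np.pi * f / fs)     # z^-1
    H = np.polyval(b[::-1], zi) / np.polyval(a[::-1], zi)
    return 20 * np.log10(np.abs(H))

for fs in (44100, 96000):
    b, a = peaking_eq(fs, 15000, 6.0, 1.0)
    print(fs, gain_db_at(b, a, 15000, fs), gain_db_at(b, a, 20000, fs))
# Both filters hit exactly +6 dB at 15 kHz, but the 44.1 kHz version's
# upper skirt is cramped toward 0 dB at Nyquist, so its 20 kHz gain is
# several dB lower than the 96 kHz version's.
```

This is the same behaviour you see in plots of digital EQs with high-shelf or high peak bands: the design method guarantees the gain at the centre frequency, but the shape above it is squeezed into whatever room remains below Nyquist.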
Sample rate conversion always degrades the signal to a small degree. How much depends on the algorithm used, but in simple terms, upsampling involves adding samples that weren’t in the original recording, whose values can therefore only ever be approximations; similarly, downsampling involves removing samples, which by definition degrades the audio slightly. Sample rate conversion can’t always be avoided, but with a good algorithm the effect is probably no worse than bouncing a track on a studio-quality tape machine.