i feel like putting this to rest.
sampling is the same idea as DPI (dots per inch) when describing a computer mouse's resolution.
another example..
sampling is just like the number of pixels in a picture.
and quite frankly, this is the easiest and closest comparison.
what is in front of the lens before the picture is taken is the actual raw/original (comparable to the raw sound).
when the camera takes a photo, the sensor (and its analog to digital chips) chops the scene in front of the lens into a grid. each x,y RGB coordinate is called a pixel.
the more pixels taken from the raw scene, the sharper and more visually pleasant the photo is (color accuracy doesn't count when talking about the number of pixels).
the sampling rate is comparable to the camera taking a picture for the following reasons.
the raw/original sound is in front of the microphone.
what happens when you record audio is that the waveform goes into an analog to digital converter chip.
analog to digital and/or digital to analog chips are the pieces of hardware that have sample rates.
what the analog to digital chip does is just like the camera when it takes the raw/original and chops it up into pixels, except for audio, those pixels are called 'samples'.
and this is where you need to have advanced knowledge of how things work to speak about why things are the way they are.
so your analog to digital chip sampled the waveform at a rate of 44.1khz (44,100 samples per second).
(this is like saying the analog to digital chip in the camera took a photo at a resolution of 3 megapixels)
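the chopping-into-samples idea can be sketched in a few lines of python. this is a toy illustration, not real adc code.. the 440 hz sine wave and the function names are made up for the example:

```python
import math

def sample_waveform(signal, sample_rate, duration):
    """Chop a continuous waveform into discrete samples, like an ADC does:
    one reading every 1/sample_rate seconds."""
    n = int(sample_rate * duration)
    return [signal(i / sample_rate) for i in range(n)]

# a 440 hz sine wave standing in for the "raw/original" sound in front of the mic
raw = lambda t: math.sin(2 * math.pi * 440 * t)

samples = sample_waveform(raw, 44100, 1.0)
print(len(samples))  # 44100 samples for one second of audio
```

the grid analogy maps directly: the sample rate decides how many points get pulled out of the continuous signal, the same way megapixels decide how many points get pulled out of the scene.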
what companies don't tell you is that those analog to digital chips can distort.. and what distort means here is that an x,y coordinate isn't perfectly aligned.
most x,y graphs start at 0,0 and work their way up.
in the example, the grid goes up to 4x by 3y.
a perfect analog to digital chip will accurately create a sample at each intersecting coordinate (0,0 1,1 2,1 3,1 ~ 1,2 1,3 1,4)
when an analog to digital chip distorts, it creates a distorted coordinate.. such as 1.30,2.60.
the chip tried to create coordinate 1,3 but made 1.30,2.60
as seen in the example below
when an analog to digital converter distorts.. the coordinate 1,3 remains empty.
so when you play the waveform on your pc, you won't hear anything at all for coordinate 1,3 if your computer is playing that waveform at a sampling rate of 44.1khz.
that is a loss of detail.
when you increase the sampling rate on your computer, the digital to analog chip has more coordinates available.. and with more coordinates available, the chances of your digital to analog chip reading data from coordinate 1.30,2.60 become substantially higher.
normally the digital to analog chip on your computer will match each coordinate (1,3 with 1,3) when the sampling rate is the same 44.1khz the waveform was originally recorded at.
but when you turn the sampling rate up on your computer's sound card, the digital to analog chip starts to look for data 'in-between' the lines. and a good example of data in-between the lines is the 1.30,2.60 error.
the data that was read at coordinate 1.30,2.60 is totally free from harm, the analog to digital converter simply missed its mark.
rather than hearing neither coordinate 1,3 nor 1.30,2.60 you can increase the sampling rate to a higher resolution than what the raw/original waveform was recorded with.
this is exactly why when you turn up the sample rate there is more detail in the music. coordinates like 1.30,2.60 are now being converted into an analog signal and can be heard.
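for what it's worth, when a playback chain runs at a higher rate than the recording, the dac side has to invent values in-between the recorded samples somehow, and linear interpolation is the simplest scheme. this is a toy sketch of that idea, not the method any particular chip uses:

```python
def upsample_linear(samples, factor):
    """Insert (factor - 1) interpolated values between each pair of
    original samples, so playback has data 'in-between the lines'."""
    out = []
    for a, b in zip(samples, samples[1:]):
        for k in range(factor):
            out.append(a + (b - a) * k / factor)
    out.append(samples[-1])
    return out

recorded = [0.0, 1.0, 0.0, -1.0]   # toy 4-sample waveform
doubled = upsample_linear(recorded, 2)
print(doubled)  # [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0]
```

note the new values (0.5, -0.5) sit exactly between the recorded ones.. the upsampler is guessing the in-between data from its neighbors.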
if your analog to digital chip made zero mistakes and all the coordinates have round numbers with no decimal places, then increasing the sampling rate on your digital to analog chip is not going to find any hidden coordinates.
but analog to digital chips vary in quality and therefore they do make mistakes when creating coordinates.
digital to analog chips vary in quality also and therefore they can and do make mistakes when reading coordinates.
this is why the resolution (sample rate) is so important when recording/playing-back audio.
ideally you want both chips looking at a resolution as high as they can go.
just like when taking a picture, you want the picture to look its best.. therefore you purchase a camera with the highest megapixel count available.
although you don't want to record audio at a higher sampling rate than what you can play back.
for example.. you recorded yourself on a microphone at a 96khz sample rate, but your computer can only play back that recording at a 48khz sample rate.
this means at 48khz there are coordinates that exist but are not being picked up by your digital to analog converter, thus you hear a loss of quality/detail.
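the 96khz-recording-on-a-48khz-playback situation can be sketched as plain decimation.. keep every second sample, drop the rest. this is a simplification (real resamplers also filter before dropping samples), with made-up toy data:

```python
def downsample(samples, in_rate, out_rate):
    """Keep every (in_rate // out_rate)-th sample; the dropped
    coordinates are simply never handed to the DAC."""
    step = in_rate // out_rate
    return samples[::step]

recorded_96k = list(range(8))   # stand-in for 8 consecutive samples at 96khz
played_48k = downsample(recorded_96k, 96000, 48000)
print(played_48k)  # [0, 2, 4, 6] .. half the coordinates are gone
```

half of the recorded coordinates never reach the converter at all, which is the loss of detail described above.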
as a re-cap.. higher sample rates read coordinates with decimals such as 1.30,2.60
the maximum error fluctuation of 44.1khz is derived from a net-catch of 64khz (meaning 64khz will find absolutely all of the errors created by an analog to digital converter with a recording rate of 44.1khz)
that is a government mandated quality-control law.
88.2khz drains the amplifier's voltage rail, resulting in the waveform seeking amperage from an outside power source, thus increasing (well, maximizing really) the harmonic distortion and signal to noise ratio.
what happens at 88.2khz is really just the digital to analog chip drawing so much current that the amplifier's voltage rail stops hitting a 'buffer over-run', thus maximizing the amplifier's ability to amplify sound.
an example..
it is like doing aerobics with a sweatshirt on, and then taking that sweatshirt off to feel free and have unobstructed movement.
(it maximizes the signal to noise ratio.. just like taking the sweatshirt off maximizes arm and torso movement)
i'm quite confident that this write-up has informed those of you who had no clue how DACs and ADCs work (and why their resolution/sample-rate is so important).
mp3 encoding simply removes x,y coordinates from the grid.. and that makes the file size smaller.
the differences between mp3 encoders come down to which coordinates are removed.
some codecs are programmed to remove certain frequencies, for example.
others are programmed to remove harmonic distortion and transients first, before taking the loud coordinates out of the grid (lame mp3 to be specific).
the lame mp3 codec is also programmed to remove coordinates that have little or no audio data (digital silence).
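the remove-coordinates-to-shrink-the-file idea can be sketched with a toy transform coder: convert the samples to the frequency domain, throw away the quiet bins, and keep only what's left to store. this is a deliberately crude illustration, nothing like lame's real psychoacoustic model, and all the numbers are made up:

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2), fine for a toy example)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]

def toy_lossy_encode(samples, threshold):
    """Keep only frequency bins whose magnitude clears the threshold;
    a real codec would store just these surviving (index, value) pairs."""
    spectrum = dft(samples)
    return [(f, c) for f, c in enumerate(spectrum) if abs(c) >= threshold]

# a loud tone plus a much fainter one, 8 samples long
samples = [math.sin(2 * math.pi * 1 * t / 8) +
           0.01 * math.sin(2 * math.pi * 3 * t / 8) for t in range(8)]

kept = toy_lossy_encode(samples, threshold=1.0)
print(len(kept))  # only the loud tone's two bins survive the threshold
```

the faint tone's bins fall under the threshold and get dropped.. fewer stored values means a smaller file, at the cost of that detail.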
if you can't hear the difference between 44.1khz and 64khz you only have your ears and/or your speakers to blame.
most often there is a lack of quality in the speakers that keeps the listener from hearing the difference.
anything bose/klipsch or below is considered consumer-audio.
and consumer-audio is junk! (although bose is supposed to have the best detail and klipsch is supposed to be the loudest.. but they make products for average people with average needs)
and for the record.. bose is supposed to be 'bang-for-your-buck' performance, until you realize that you can buy raw speakers and build your own cabinets for about $250 for two bookshelf speakers that will take the quality of your listening experience higher than you thought attainable in your own home.