You cannot have talked about audio and computers any time in the last 15 years and not have heard of an MP3 file. MP3 audio files and websites, like the original Napster, started a shift in where, how and when people acquired music. If you are on the older end of the spectrum, like many of us in the mobile electronics industry, then you bought your CDs, cassettes and maybe even your vinyl at a record store. Computers and the Internet changed that. You could go online after dinner and download an illegal copy of a song in a few minutes. It was wrong, but people acquired tens of millions of songs this way.
In the 1990s and early 2000s, accessing the Internet was slow. We started connecting to the Internet using phone lines and modems. Each byte of information took time to transfer to your computer, so anything that would speed up the process was a treat. Downloading (stealing) music over the Internet is where the MP3 audio file found its calling.
A Primer on Digital Audio
We could write 10 articles about digital audio – and we just might. For now, we are going to look at the basics and use the compact disc (CD) as our reference. CDs store digital audio sampled at 44.1 kHz with a resolution of 16 bits. These numbers mean each sample can have an amplitude that is a single value within a range of 65,536 different levels (2 to the power of 16). The information is sampled 44,100 times a second. Sampling at what is known as 44.1/16 allows capturing the audible range of audio (considered 20 Hz to 20 kHz) with good detail and accuracy.
To store 1 second of audio at this resolution, we need to store 1,411,200 bits of information. Anyone who has played with audio transcoding software may recognize 1,411 kbps as a standard data rate. This number is calculated by multiplying the number of bits per sample (16) by the number of samples per second (44,100) and then by 2. The times-2 factor is because we record in stereo, which is two channels. So, a 3-minute-long song is 254,016,000 bits or 31,752,000 bytes.
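For readers who like to check the arithmetic, here is a minimal Python sketch of the calculation above. The function and variable names are our own, chosen for illustration only.

```python
# CD-quality PCM parameters, as described in the text.
BITS_PER_SAMPLE = 16
SAMPLE_RATE_HZ = 44_100
CHANNELS = 2  # stereo


def pcm_bitrate(bits_per_sample, sample_rate_hz, channels):
    """Return the uncompressed data rate in bits per second."""
    return bits_per_sample * sample_rate_hz * channels


# Number of distinct amplitude levels a 16-bit sample can take.
levels = 2 ** BITS_PER_SAMPLE  # 65,536

# Data rate for one second of stereo audio.
cd_bitrate = pcm_bitrate(BITS_PER_SAMPLE, SAMPLE_RATE_HZ, CHANNELS)  # 1,411,200

# Size of a 3-minute (180-second) song.
song_bits = cd_bitrate * 180   # 254,016,000
song_bytes = song_bits // 8    # 31,752,000

print(levels, cd_bitrate, song_bits, song_bytes)
```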
Let’s round it off to about 32 megabytes of information. Can you imagine how long it takes to download that with a dial-up modem running at 14,400 bits per second? Even on a perfect line, the answer is 17,640 seconds, or about 4.9 hours, and error checking, line noise and other factors make the real download slower still.
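The best-case transfer time is simply the file size in bits divided by the line rate in bits per second. The sketch below assumes an ideal connection with no protocol overhead or retransmissions, which a real dial-up line never offered. The names are ours, for illustration.

```python
def ideal_download_seconds(size_bits, line_rate_bps):
    """Best-case transfer time: no overhead, no errors, no retries."""
    return size_bits / line_rate_bps


SONG_BITS = 254_016_000   # 3-minute CD-quality stereo track
MODEM_BPS = 14_400        # dial-up modem line rate in bits per second

seconds = ideal_download_seconds(SONG_BITS, MODEM_BPS)  # 17640.0
hours = seconds / 3600                                  # about 4.9

print(f"{seconds:.0f} s, or about {hours:.1f} hours")
```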
Data Compression
What if someone found a way to shrink the size of the audio file to speed up download time and reduce bandwidth usage? The catch is that the audio must still sound essentially the same on most basic audio systems, such as a TV, computer speakers or a 1990s factory car radio. In 1991, a group of companies, including the Fraunhofer Institute, France Telecom, Philips, TDF and IRT, started working on a way to reduce file size while maintaining relevant information. Deciding which information is relevant is the key to how MP3 compression reduces file size.
The MP3 file format uses a “lossy compression” algorithm. Lossy compression means that information is thrown away to reduce file size. The development team worked on a compression method called perceptual encoding to decide what information to remove. Perceptual encoding is based on how we hear sounds relative to other information, and the limits of our hearing.
What MP3 Files Throw Out
We are going to analyze the information that MP3 files remove to reduce file size. One of the easiest ways to cut back on information storage is to reduce the highest frequency that will be reproduced. If we analyze a 128 kbps MP3 file, we see that the highest reproduced frequency is just below 16 kHz. A sampled signal needs a sample rate of at least twice its highest frequency, so a 44.1 kHz sample rate covers content up to 22.05 kHz. If the reduced bandwidth were the only change, and we sampled at just twice the new top frequency (about 31.4 kHz), our new bitrate with 16-bit samples in stereo would be about 1,004,800 bps instead of the 1,411,200 bps needed for 22.05 kHz.
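Here is a small sketch of that bandwidth arithmetic. It follows this article's simplified model, in which the sample rate is set to exactly twice the highest frequency kept. A real MP3 encoder does not work by resampling this way. The 15.7 kHz cutoff is our assumption for "just below 16 kHz," and the function name is ours.

```python
def bandwidth_limited_bitrate(max_freq_hz, bits_per_sample=16, channels=2):
    """Raw PCM bitrate if we sample at exactly twice the top frequency."""
    sample_rate_hz = 2 * max_freq_hz  # minimum rate for that bandwidth
    return bits_per_sample * sample_rate_hz * channels


full_band = bandwidth_limited_bitrate(22_050)  # 1,411,200 bps (CD)
reduced = bandwidth_limited_bitrate(15_700)    # 1,004,800 bps (assumed cutoff)

saving = 1 - reduced / full_band
print(full_band, reduced, f"{saving:.1%} smaller")
```

Even in this idealized view, trimming the top end saves well under a third of the data, which is why the encoder needs the additional steps described next.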
The next part of the compression process analyzes content that is common to both channels. It is common for some parts of a recording to be virtually in mono. The encoding process removes duplicated information from the file and adds code to copy the opposite channel. If the audio track were purely mono, the file size would be divided in two. Few tracks are completely mono, but we can see more space saving from this process.
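One common way to exploit content shared between channels is a mid/side transform, and MP3's joint stereo mode uses a transform of this general kind. The sketch below is a simplified illustration with our own function names and scaling, not the encoder's actual code. It stores the average of the two channels (mid) and half their difference (side). When a passage is mono, the side signal is all zeros and costs almost nothing to store.

```python
def encode_mid_side(left, right):
    """Convert left/right samples into mid (shared) and side (difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def decode_mid_side(mid, side):
    """Rebuild the left and right channels from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# A purely mono passage: both channels carry the same samples.
mono = [0.5, -0.25, 0.125]
mid, side = encode_mid_side(mono, mono)
print(side)  # every value is 0.0, so only the mid signal carries information

# A stereo passage survives the round trip unchanged.
left_in, right_in = [0.5, 0.25], [0.25, -0.5]
left_out, right_out = decode_mid_side(*encode_mid_side(left_in, right_in))
print(left_out == left_in and right_out == right_in)
```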
Subsequent processing looks at low-level information during high-amplitude passages. Let’s use the example of a song with a lot of bass in it and some very quiet harmonic midrange information. Perceptual encoding processes like MP3 will remove this low-level information from the audio track. This process relies on an effect called auditory masking: there is enough louder audio information at other frequencies to keep you from hearing what is removed.
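To make the idea concrete, here is a deliberately crude toy model. It drops any component that sits more than a fixed number of decibels below the loudest one. Real perceptual encoders use far more detailed, frequency-dependent models of hearing. The 40 dB threshold, the example levels and the function name are all hypothetical values chosen for illustration.

```python
def drop_masked(components, threshold_db=40.0):
    """Toy rule: keep components within threshold_db of the loudest one.

    components is a list of (frequency_hz, level_db) pairs.
    The threshold is an arbitrary illustrative value, not an MP3 constant.
    """
    loudest = max(level for _, level in components)
    return [(f, lvl) for f, lvl in components if loudest - lvl <= threshold_db]


# Hypothetical snapshot of a bass-heavy passage.
passage = [
    (60, -6.0),     # loud bass note
    (1000, -20.0),  # moderately loud midrange
    (2000, -60.0),  # very quiet midrange harmonic
]

kept = drop_masked(passage)
print(kept)  # the -60 dB harmonic is 54 dB below the bass, so it is dropped
```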
Can You Hear the Difference?
Dozens – nay, hundreds – of tests have compared MP3 files to full CD-quality audio tracks. Are there differences? There most certainly are. One thing became apparent during our research: How an MP3 file is created is crucial to its subjective sound quality. Different encoders work in different ways with different results.
Perhaps the best way to describe how a CD-quality recording differs from an MP3 file is to look at the difference between the two. We wish we could share some samples here for you to listen to, but that would break copyright laws. What we can do is visually show you the difference.
We took a 3-second sample from Daft Punk’s “Give Life Back to Music.” We chose this track because of Daft Punk’s clear and conscious effort to make a high-resolution version of the album commercially available. We want to thank them for that! The sample is from 31.5 seconds to 34.5 seconds into the song.
This spectrogram shows the frequency content of the sample. The horizontal scale is time. The vertical scale is frequency. Finally, the color intensity shows the amplitude.
You can see that there is frequency content up to 30 kHz, clearly demonstrating the high-resolution nature of this track. Each vertical color band represents a drum machine beat – more or less.
128 kbps MP3 File Analysis
It is clear that audio information above 16 kHz has been removed. Infrasonic frequency content is clearly different as well. There is more information in the MP3 file below 30 Hz compared to the original. This increase in information will, however, present itself as reduced dynamic range.
MP3 vs. Original File
We inverted the MP3 file and added it to the original sample to make the image you see here. The net result is the difference between the two tracks. You can see the high-frequency content that was removed above 16 kHz. In fact, information was removed at all frequencies, and that information follows the intensity pattern of the audio file.
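This invert-and-add comparison is often called a null test. Here is a minimal sketch, assuming both clips are already decoded to lists of sample values at the same sample rate and lined up sample for sample. Alignment matters because MP3 encoding and decoding typically add a short delay. The function name and the sample values are ours, for illustration only.

```python
def null_test(original, decoded):
    """Invert the decoded clip and add it to the original.

    Whatever remains is the information the encoder changed or removed.
    Both inputs must be time-aligned and the same length.
    """
    inverted = [-s for s in decoded]
    return [o + i for o, i in zip(original, inverted)]


# Hypothetical sample values, not taken from the track discussed here.
original = [0.5, -0.25, 0.75]
decoded = [0.5, -0.125, 0.5]

residual = null_test(original, decoded)
print(residual)  # [0.0, -0.125, 0.25]
```

If the two clips were identical, every value in the result would be zero and the output would be silence.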
The original file has a peak amplitude of -0.1 dB for both channels and an average amplitude of about -14.2 dB. The removed information has a peak level of -10.9 dB and an average amplitude of -37.01 dB. The removed information is buried deep below the peak amplitude information.
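Level readings like these can be computed directly from the sample values. The sketch below assumes samples are scaled so that full scale is 1.0, and it treats "average amplitude" as the RMS level, which is how many audio editors report it. Both are our assumptions, as are the function names.

```python
import math


def peak_db(samples):
    """Peak level in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def rms_db(samples):
    """RMS (average) level in dB relative to full scale (1.0)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


# A half-scale test signal reads about -6 dB on both meters.
test_signal = [0.5, -0.5, 0.5, -0.5]
print(round(peak_db(test_signal), 2), round(rms_db(test_signal), 2))

# Gap between the original and the removed material, from the figures above.
peak_gap_db = -0.1 - (-10.9)        # about 10.8 dB
average_gap_db = -14.2 - (-37.01)   # about 22.8 dB
print(round(peak_gap_db, 1), round(average_gap_db, 1))
```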
What does the removed audio sound like? We would describe the clip as the sound of a distant marching band. The audio is mostly high-frequency information. The track has a decidedly warbled texture to it as well: The drum machine beats are clear and present, but they sound like distorted cymbal hits.
Even with a high-end headphone preamp and studio grade headphones, the difference is hard to perceive when switching between the original track and the MP3 file. In a listening environment with a larger soundstage, it may be more apparent.
Conclusions about MP3 Files
Purists will tell you that you should have the highest-quality recordings available. There is no fault to this logic. Why skimp when you can have it all? High bitrate MP3 files, like those at 320 kbps, for example, are excellent in quality. Repeated testing has shown that when created with quality compression algorithms, the sound difference between a CD-quality recording and a 320 kbps MP3 file is almost impossible to detect. Lower bitrate MP3 files start to dispose of more information, and the differences become bigger.
The latest source units on the market are capable of playing WAV and FLAC audio files of great resolution and bit depth. Shortly, we will see units that will play MQA files over digital connections. Almost every source will handle MP3 and WMA files.
Drop into your local mobile electronics specialist retailer today, and bring along some music to enjoy. We think you will be impressed – no matter what format you choose.