5 Tips for Optimizing Audio File Size
At some point or another, everyone should learn how audio files work. This knowledge may seem trivial or unimportant, but it can come in handy when recording music, creating a podcast, or optimizing your music library.
This article will explore the various factors affecting audio quality and audio file size. Finding the perfect balance between the two isn’t easy, but you should know enough to feel comfortable experimenting on your own by the end.
Note: To put this knowledge into practice, you’ll want to get a free audio editor like Audacity or any alternative. Learning about these tools is beyond the scope of this article.
1. Sampling rate
In real life, sound is a wave. When someone speaks or claps their hands, what you actually hear is a change in pressure that travels through the air and eventually hits your eardrums.
But how do you capture this sound and convert it into digital data? We cannot record the complete sound wave as it is; instead, we take periodic “snapshots” of the sound over time. When you play them back in sequence, you get a close recreation of the original sound.
Each snapshot is called a sample, and the number of samples taken per second by an analog-to-digital converter is called the sampling rate. The sampling rate is measured in hertz (Hz), so it is expressed as a frequency.
The shorter the interval between samples, the higher the frequency. Higher frequencies produce more accurate recordings but also require more data to store every second of recorded sound.
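To make the snapshot idea concrete, here is a minimal Python sketch that samples a pure tone at two different rates. The function name `sample_sine` and the 440 Hz test tone are illustrative assumptions, not part of any audio tool; a real recording comes from an analog-to-digital converter rather than a formula.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s):
    """Take evenly spaced snapshots of a sine wave (hypothetical helper)."""
    total = int(sample_rate_hz * duration_s)
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(total)
    ]

# One second of a 440 Hz tone at two different sampling rates.
low = sample_sine(440, 8_000, 1.0)
high = sample_sine(440, 44_100, 1.0)

print(len(low), len(high))  # 8000 44100
```

The higher rate describes the same second of sound with more than five times as many values, which is where the extra storage goes.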
For example, CD-quality audio uses a sample rate of 44.1 kHz (or 44,100 samples per second), while TV and DVD-quality audio uses a sample rate of 48 kHz. For a 10-minute uncompressed mono recording at 16 bits per sample, the former comes to about 52.9 MB while the latter comes to about 57.6 MB.
You can go down to 32kHz for voice-only recordings and not experience much loss in quality, but stick with 44.1kHz if music is involved or you need the highest quality. A drop to 22.05 kHz will sound more like AM radio.
2. Bit rate
Bitrate is not the same as sample rate. Many people tend to confuse the two, but it’s important that you don’t. To understand bitrate, you first need bit depth: if the sample rate is how often snapshots of the sound are taken, the bit depth is the amount of data recorded in each snapshot.
To illustrate this, imagine a sound wave as a stream of water that you try to capture (i.e. record) with a bucket. Sample rate is how often you dip your bucket into the stream, while bit depth is how big your bucket is. Bit depth is measured in bits. For each bit you add, the recording precision doubles.
The higher the bit depth, the more data captured per sample. This leads to more accurate recording at the expense of more space required to store this data.
But if you lower the bit depth too much, sound data is lost. Audio CDs use 16 bits per sample, while DVDs and Blu-ray discs can use 24 bits per sample.
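The doubling of precision with each extra bit can be sketched in a few lines of Python. The helpers `quantization_levels` and `quantize` are hypothetical names used for illustration, and the rounding scheme is a simplifying assumption; real converters may use dithering and other refinements.

```python
def quantization_levels(bit_depth):
    """Number of distinct values a sample can take at this bit depth."""
    return 2 ** bit_depth

def quantize(sample, bit_depth):
    """Round a sample in [-1.0, 1.0] to the nearest signed integer step."""
    max_value = 2 ** (bit_depth - 1) - 1
    return round(sample * max_value)

for bits in (8, 16, 24):
    print(bits, quantization_levels(bits))
# 8 256
# 16 65536
# 24 16777216

print(quantize(1.0, 16))  # 32767
```

An 8-bit sample has only 256 possible values to choose from, while a 16-bit sample has 65,536, so the rounding error per snapshot is far smaller at the higher depth.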
Bitrate is the amount of actual sound data processed per second (expressed in kilobits per second). To get the bitrate of a single channel, you multiply the sample rate by the bit depth. A mono CD-quality audio file with a sample rate of 44.1 kHz and a depth of 16 bits would have an uncompressed bitrate of 44100*16, i.e. 705.6 kbps.
To give you an idea of the difference in file size, consider a five-minute uncompressed song recorded in two-channel stereo sound.
- 44.1kHz/16-bit: 44100*16*2 = 1411200 bits per second (1.4 Mbps)
- 192kHz/24-bit: 192000*24*2 = 9216000 bits per second (9.2 Mbps)
Take the calculated bitrate and multiply it by the duration of the recording in seconds, then divide by 8 to convert megabits (Mb) to megabytes (MB):
- 1.4*300 = 420 Mb, or 52.5 MB
- 9.2*300 = 2760 Mb, or 345 MB
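The same arithmetic can be written as a short Python sketch. The function names are illustrative assumptions, and sizes use decimal megabytes (1 MB = 1,000,000 bytes).

```python
def bitrate_bps(sample_rate_hz, bit_depth, channels):
    """Uncompressed bitrate in bits per second."""
    return sample_rate_hz * bit_depth * channels

def size_megabytes(bitrate, duration_s):
    """Uncompressed size in decimal megabytes for a given duration."""
    return bitrate * duration_s / 8 / 1_000_000

# The two five-minute (300 s) stereo examples from the list above.
for rate, depth in ((44_100, 16), (192_000, 24)):
    bps = bitrate_bps(rate, depth, 2)
    print(f"{rate} Hz/{depth}-bit: {bps} bps, {size_megabytes(bps, 300):.1f} MB")
# 44100 Hz/16-bit: 1411200 bps, 52.9 MB
# 192000 Hz/24-bit: 9216000 bps, 345.6 MB
```

The results are slightly higher than the hand calculation because the code uses the exact bitrates instead of the rounded 1.4 and 9.2 Mbps figures.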
So audio recorded at 192kHz/24-bit will take up about six and a half times as much space, but it all comes down to what you want to do with the audio recording. Sometimes the full bitrate is not needed at a given moment, such as when there is silence.
In this case you can use variable bitrate (VBR), which is supported by MP3, OGG, AAC and WMA. In the past, VBR wasn’t widely supported, but today that’s not much of an issue.
3. Stereo vs Mono
This point is pretty straightforward, so I’ll keep it brief. Mono means one channel, while stereo means two channels. The two channels of a stereo audio file are called the “left” and “right” channels.
With a pair of headphones, you will be able to hear one of the stereo channels in one ear and the other stereo channel in the other ear. When listening to a mono audio file, you will hear exactly the same channel in both ears.
In a sense, stereo audio files are essentially two mono audio files in one, which means that an uncompressed stereo audio file is twice as large as a mono audio file, assuming the sample rate, bit depth, source sound, etc. are the same between the two. So, the easiest way to instantly cut the size of an audio file in half is to convert it from stereo to mono.
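As a rough illustration of what a stereo-to-mono conversion does, the sketch below averages each left/right pair of samples. It assumes the samples are integers stored in interleaved order (left, right, left, right, ...), and `downmix_to_mono` is a hypothetical helper; audio editors may offer other downmix options, such as keeping only one channel.

```python
def downmix_to_mono(interleaved):
    """Average each left/right pair of an interleaved stereo sample list."""
    left = interleaved[0::2]
    right = interleaved[1::2]
    return [(l + r) // 2 for l, r in zip(left, right)]

stereo = [100, 300, -200, 200, 50, 50]  # L, R, L, R, L, R
mono = downmix_to_mono(stereo)

print(mono)                      # [200, 0, 50]
print(len(mono) / len(stereo))   # 0.5
```

The mono list holds half as many samples as the stereo one, which is why the uncompressed file shrinks by half.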
For vocal-only recordings, mono is almost always preferred because it makes the sound loud, clear, and direct. But if you want to record two or more singers in a room with unique acoustics, the vocals should be stereo.
Similarly, podcast recordings can also be mono. In musical recordings, however, stereo is what makes a lot of music sound more three-dimensional, as if the music is playing around you rather than at you (i.e. mono sound is flatter).
4. Compression
If you are working with uncompressed WAV files, the only way to reduce the file size is to change one of the above settings (sample rate, bit depth or number of channels). For everything else, compression is the most important factor in audio file size. There are two types of compression:
- Lossy compression removes “unnecessary” data from audio, such as sounds that are beyond most people’s hearing range. Once compressed, such deleted data cannot be recovered.
- Lossless compression takes an audio file and compresses it as much as possible using mathematical algorithms. However, it must be decompressed at play time, which requires more processing power. No actual data is lost.
The compression mode you want to use depends on the intended use of the audio file. As a general rule, you should opt for lossless compression when you want to store an exact copy of the source material, and lossy compression when an imperfect copy is good enough for everyday use.
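The defining property of lossless compression, that decompression gives back every original bit, can be checked with a small Python sketch. It uses the general-purpose `zlib` module as a stand-in, which is an assumption made for illustration: dedicated audio codecs such as FLAC use different, audio-specific algorithms. The sample data is synthetic, not a real recording.

```python
import struct
import zlib

# Hypothetical raw audio: one second of 16-bit silence, then a repeating ramp.
samples = [0] * 44_100 + [n % 1000 for n in range(44_100)]
raw = struct.pack(f"<{len(samples)}h", *samples)

compressed = zlib.compress(raw, level=9)
restored = zlib.decompress(compressed)

print(len(raw), len(compressed))  # compressed is much smaller for this data
print(restored == raw)            # True
```

A lossy encoder would fail that final equality check by design, since it discards data that cannot be recovered.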
For example, you might want to keep your collection of ripped CDs in FLAC format (if storage space is not an issue) and use MP3 copies on your phone. If you don’t know much about compression, here’s our comprehensive guide to how file compression works and a list of tools to effectively compress large audio files.
5. File format
Once you’ve decided to use lossy compression, you need to decide which file format is best for you. As of this writing, the three most popular options are MP3, OGG, and AAC. To learn more, read our guide on how to compare different audio file formats.
MP3 is by far the most popular, mainly because it was the first of the three to hit the scene. AAC is technically better than MP3 but is not as widely used. OGG is good too, but not many devices support it, so stick with MP3 or AAC.
Whichever you use, you will end up compressing to a target bitrate. Assuming you’re going to be using the MP3 format, here are the five most common bitrates currently in use:
- 64 kbps is AM radio quality. Perfect for conversation-only podcasts, as vocals aren’t as complex as music.
- 96 kbps is FM radio quality. The music will sound good, but you can tell it’s not full-bodied, mainly because some audible frequencies have been removed.
- 128 kbps is often described as near-CD audio quality. It’s as standard as it gets. The music sounds “good enough” for most people at this rate.
- 256 kbps is high audio quality. You may notice sounds and instruments that were not detectable at lower bitrates.
- 320 kbps is the best audio quality. You can go higher, but you probably won’t be able to tell the difference, even if you consider yourself an audiophile.
In terms of file size reduction relative to uncompressed CD-quality stereo (about 1,411 kbps), an MP3 compressed at 128 kbps discards about 90% of the original data, while an MP3 compressed at 320 kbps discards about 77%.
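To see how these target bitrates translate into file sizes, here is a small Python sketch for a five-minute track. It assumes constant bitrate encoding, ignores metadata and container overhead, and compares against uncompressed CD-quality stereo at 1,411.2 kbps; the function names are hypothetical.

```python
CD_STEREO_KBPS = 44_100 * 16 * 2 / 1000  # 1411.2 kbps, uncompressed

def reduction_percent(target_kbps, source_kbps=CD_STEREO_KBPS):
    """Share of the source data that is discarded at the target bitrate."""
    return (1 - target_kbps / source_kbps) * 100

def mp3_size_mb(kbps, duration_s):
    """Approximate file size in decimal megabytes at a constant bitrate."""
    return kbps * 1000 * duration_s / 8 / 1_000_000

for kbps in (64, 96, 128, 256, 320):
    size = mp3_size_mb(kbps, 300)
    saved = reduction_percent(kbps)
    print(f"{kbps} kbps: {size:.1f} MB, about {saved:.0f}% smaller")
```

Under these assumptions, a five-minute track comes to roughly 4.8 MB at 128 kbps and 12 MB at 320 kbps, compared with about 52.9 MB uncompressed.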
Also, if you have MP3 and AAC compressed to the same bitrate, AAC will often sound better because it uses a more advanced compression algorithm. This means that you can get more “quality per megabyte” with AAC than with MP3.
Optimize the size of your audio files
Understanding these five factors will help you decide the best way to save and compress the music and/or podcasts you’ve created, and help you decide what kind of music formats to buy or which streaming services to use.