Digitization of audio is the transformation of audio signals (live or recorded on analog magnetic media) into digital form (binary-coded files for use in computers and digital information technology).
There are many reasons to digitize audio:
Audio signals are the representation of sound waves in electrical (or magnetic) form. Plotted graphically, the x axis is time and the y axis is the continuously changing voltage.
Digital technology handles continuous data as a series of digits. To digitize such a signal we need to take samples of it at a predefined frequency. This is done with analog-to-digital (A/D) converters.
From these samples a digital-to-analog (D/A) converter can later follow the opposite process, i.e. reconstruct the continuous signal and produce sound waves.
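The sampling and reconstruction steps can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the "analog" input is a pure sine wave of amplitude 1, the function names are made up for this example, and a real D/A converter also smooths the output with a reconstruction filter, which is not modelled here.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, bits, n_samples):
    """A/D sketch: sample a unit-amplitude sine wave and round each sample to an integer code."""
    max_code = 2 ** (bits - 1) - 1  # largest positive code, e.g. 32767 for 16 bits
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                         # time of the n-th sample in seconds
        voltage = math.sin(2 * math.pi * freq_hz * t)  # "analog" value in [-1, 1]
        codes.append(round(voltage * max_code))        # store it as a whole number
    return codes

def reconstruct(codes, bits):
    """D/A sketch: map the integer codes back to values in [-1, 1]."""
    max_code = 2 ** (bits - 1) - 1
    return [code / max_code for code in codes]

# A 1 kHz tone sampled at 8 kHz gives eight samples per period.
codes = sample_and_quantize(freq_hz=1000, sample_rate_hz=8000, bits=16, n_samples=8)
print(codes)
print(reconstruct(codes, 16))
```

The printed integer codes are what an uncompressed audio file would store; the second line shows the values a converter would turn back into voltages.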
The quality of digitization depends on the following aspects:
While the first two points are a matter of hardware quality, the last two (3 and 4) are mainly a matter of choice.
· Sample Frequency (SF): To capture one full period of a wave we need at least two samples (see figure). This means that an audio signal of frequency x Hz requires a sample frequency of at least 2x Hz (Nyquist theorem). Accordingly, capturing the full audio range (20-20,000 Hz) requires an SF of at least 40,000 Hz (CD quality uses an SF of 44.1 kHz). Still, this frequency is not sufficient for high-quality digitization, because high frequencies (above 10 kHz) are captured with only 2-4 samples per period. That produces "harder" high frequencies and aliasing effects. To resolve this problem a higher SF (96 kHz) is needed. Naturally, for narrower ranges (speech) 44.1 kHz is sufficient.
· Resolution: The resolution of a sample corresponds to the dynamic range (in dB) we capture. The more resolution we use, the more dynamic range we get (finer gradation between quiet and loud). The resolution is measured in bits (how many bits we use to describe the size of a sample). The CD quality of 16 bits gives a gradation of 65,536 values, or a dynamic range of about 96 dB (14 bits would give only 16,384 values, or about 84 dB). As the human ear responds over a range of up to 120 dB, a high-fidelity capture needs a higher bit depth (e.g. 24 bits).
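The two rules of thumb above can be checked numerically. The sketch below assumes an ideal sampler and the common approximation that each bit adds about 6.02 dB of dynamic range (20·log10 of the number of levels); the function names are chosen for this example only.

```python
import math

def min_sample_rate(max_freq_hz):
    """Nyquist: the sample frequency must be at least twice the highest frequency."""
    return 2 * max_freq_hz

def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a tone appears after sampling (folded into 0..sample_rate/2)."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

def quantization_levels(bits):
    """Number of distinct values a sample of the given bit depth can take."""
    return 2 ** bits

def dynamic_range_db(bits):
    """Theoretical dynamic range of linear quantization, in dB."""
    return 20 * math.log10(quantization_levels(bits))

print(min_sample_rate(20_000))           # sample frequency needed for the 20 kHz audio range
print(alias_frequency(30_000, 44_100))   # a 30 kHz tone sampled at 44.1 kHz folds down
for bits in (14, 16, 24):
    print(bits, quantization_levels(bits), round(dynamic_range_db(bits), 1))
```

A tone below half the sample frequency is returned unchanged by `alias_frequency`, while one above it folds back to a lower, audible frequency. This is why converters filter the input before sampling.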
The sample frequency and the bit depth we use affect the size of the digital audio files. The table below shows how much space in megabytes we need to store one hour of uncompressed digitized audio at different SFs and bit depths (the figures correspond to two-channel audio).
| Bits / Sample frequency | 32 kHz | 44.1 kHz | 48 kHz | 88.2 kHz | 96 kHz |
| 16 | 460.8 | 635.04 | 691.2 | 1270.08 | 1382.4 |
| 24 | 691.2 | 952.56 | 1036.8 | 1905.12 | 2073.6 |
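The storage figures follow from a simple product: samples per second × bytes per sample × channels × seconds. The sketch below assumes two channels and decimal megabytes (1 MB = 1,000,000 bytes), which is what the table's figures appear to be based on; `pcm_size_mb` is a name chosen for this example.

```python
def pcm_size_mb(sample_rate_hz, bits, channels=2, seconds=3600):
    """Size of uncompressed PCM audio in decimal megabytes."""
    bytes_per_sample = bits // 8
    total_bytes = sample_rate_hz * bytes_per_sample * channels * seconds
    return total_bytes / 1_000_000

# One hour of stereo audio at CD quality and at 24-bit / 96 kHz.
print(pcm_size_mb(44_100, 16))
print(pcm_size_mb(96_000, 24))
```

Changing `channels` to 1 gives the figures for mono recordings such as speech, which are half the stereo values.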
Digitized audio is stored in audio files. These files can be RAW (just numbers) or can use codecs to encode and decode the raw data (a codec describes the way the digital information is coded in the file).
Because uncompressed audio files are quite large and impractical to transfer over the internet, several methods of audio file compression have been developed to facilitate audio distribution. These methods trade off file size against quality.
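The difference between RAW data and a file with a descriptive header can be seen with Python's standard `wave` module. This is a minimal sketch assuming a mono, 16-bit, 44.1 kHz test tone generated in memory; uncompressed WAV is used because it is the simplest container around raw samples.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44100

# RAW data: one second of a 440 Hz tone as 16-bit signed little-endian numbers.
raw_pcm = b"".join(
    struct.pack("<h", round(32767 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)))
    for n in range(SAMPLE_RATE)
)

# The same bytes wrapped in a WAV file; the header records how to interpret them.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)             # mono
    wav.setsampwidth(2)             # 2 bytes = 16 bits per sample
    wav.setframerate(SAMPLE_RATE)
    wav.writeframes(raw_pcm)
wav_bytes = buffer.getvalue()

# Reading the header back recovers the parameters a player needs.
with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
    print(wav.getnchannels(), wav.getsampwidth() * 8, wav.getframerate())
print(len(raw_pcm), len(wav_bytes))
```

A RAW file carries no such header, so the sample frequency, bit depth and channel count must be known in advance to play it correctly.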
There are three major groups of audio file formats:
Of the three groups above, the first (uncompressed formats) is used for master files and CDs; the second (lossless compression) is less common and helps in transferring master files over the internet; the third (lossy compression) is very widely used for distributing music on the internet and, in some cases, for audio on DVDs.
In compressed audio formats, besides the SF and bit depth (bits per sample), there are other significant parameters, such as:
· Bit rate: It indicates the connection speed needed and determines the amount of compression. E.g. a 196 kbit/s file consumes about 196 kbit/s of the connection bandwidth during playback from the internet.
· Variable Bit Rate (VBR): An option that lets the encoder use several bit rates according to the complexity of the signal, so that no bandwidth is used when it is not needed. E.g. VBR 64 to 128 kbit/s uses 64 kbit/s when the audio signal is simple and rises up to 128 kbit/s when the signal is complex.
· DRM: Some compression formats include Digital Rights Management systems to protect intellectual and commercial rights, allowing the end user to play the file only under specific conditions.
· Multitrack support: Some codecs support more than two tracks, in order to play 5.1 surround sound.
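To see what a variable bit rate means for bandwidth, one can compute the average bit rate of a stream whose segments were encoded at different rates. The segment durations below are invented for illustration, and the function name is hypothetical.

```python
def average_bitrate_kbps(segments):
    """Average bit rate of a VBR stream.

    segments: list of (duration_seconds, bitrate_kbps) pairs.
    """
    total_kbits = sum(duration * rate for duration, rate in segments)
    total_seconds = sum(duration for duration, _ in segments)
    return total_kbits / total_seconds

# Assumed example: 40 s of simple signal at 64 kbit/s, 20 s of complex signal at 128 kbit/s.
segments = [(40, 64), (20, 128)]
print(round(average_bitrate_kbps(segments), 1))
```

The average lies between the two limits and is weighted by how long each rate is in use, so a mostly simple signal stays close to the lower bound.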
The file size depends on the format/codec we use, as shown in the table below.
| Format | Resolution/bitrate | Size for 1 hour stereo music | Compression ratio | Hours of music on 1 DVD |
| WAV, AIFF | 24 bit, 96 kHz | 2025 MB | 1 | 2.2 |
| WAV, AIFF | 16 bit, 44.1 kHz | 620 MB | 1 | 7 |
| MP3 | 256 kbps | 112 MB | 5.5 | 37.5 |
| WMA | 128 kbps | 62 MB | 10 | 75 |
| AC3 | 128 kbps | 56 MB | 11 | 78.5 |
| MP3 | 128 kbps | 56 MB | 11 | 78.5 |
| MP3 | 64 kbps | 28 MB | 22 | 157 |
| OGG | 48 kbps | 20 MB | 31 | 220 |
| RM | 56 kbps | 16.3 MB | 38 | 270 |
| MP3 | 28 kbps | 10.6 MB | 58 | 415 |
| RM | 28 kbps | 9.6 MB | 64 | 460 |
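The relation between bit rate, file size, compression ratio and playing time can be approximated as follows. The sketch assumes a constant bit rate, decimal megabytes, a 4,700 MB single-layer DVD and 16-bit/44.1 kHz stereo PCM as the reference for the ratio; these are assumptions of this example, so the results only roughly match the rounded figures in the table.

```python
DVD_CAPACITY_MB = 4700           # assumed single-layer DVD capacity in decimal MB
CD_PCM_KBPS = 44.1 * 16 * 2      # bit rate of 16-bit, 44.1 kHz stereo PCM (1411.2 kbit/s)

def size_mb_per_hour(bitrate_kbps):
    """File size of one hour of audio at a constant bit rate."""
    return bitrate_kbps * 1000 * 3600 / 8 / 1_000_000

def compression_ratio(bitrate_kbps, reference_kbps=CD_PCM_KBPS):
    """How many times smaller the stream is than the uncompressed reference."""
    return reference_kbps / bitrate_kbps

def hours_per_dvd(bitrate_kbps, capacity_mb=DVD_CAPACITY_MB):
    """Hours of audio that fit on one disc."""
    return capacity_mb / size_mb_per_hour(bitrate_kbps)

for rate in (256, 128, 64):
    print(rate,
          round(size_mb_per_hour(rate), 1),
          round(compression_ratio(rate), 1),
          round(hours_per_dvd(rate), 1))
```

Note that container overhead and metadata are ignored here, and the actual size of VBR files depends on the content.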
In the table below the file formats/codecs are shown with detailed information for each one.
| Format | Creator | Release date | Algorithm | Sample Rate | Bit rate | Bits per sample | Common implementations | Patented | DRM |
| AAC | ISO/IEC MPEG Audio Committee | 1997 | Lossy, Hybrid | 8 kHz to 192 kHz | 8 to 529 kbit/s (stereo) | Any (typically uses fp internally) | OSI: FAAC; Proprietary: iTunes, Nero Digital Audio | Yes | FairPlay |
| AC3 | Dolby Laboratories | 1992 | Lossy | up to 48 kHz | max 640 kbit/s | 16 or 24 | FFmpeg, DVD audio tracks | Yes | ? |
| ALAC | Apple Computer | 28/4/2004 | Lossless | 44.1, 48 kHz | variable | ? | Proprietary: QuickTime, iTunes, Real Player | Yes | ? |
| ATRAC | Sony Corp. | 1991 | Lossless, Lossy | 44.1, 48 kHz | 48-352 kbits/s | 16 | MiniDisc, Walkman, VAIO, Clie, Playstation3 | Yes | Yes |
| FLAC | Xiph.Org Foundation | 20/7/2001 | Lossless | 1 Hz to 1048.57 kHz | variable | 4, 8, 16, 24 (32) | OSI: reference | No | No |
| Monkey's Audio | Matthew T. Ashland | 2000+ | Lossless | 8, 11.025, 12, 16, 22.05, 24, 32, 44.1, 48 kHz | variable | | Proprietary | ? | No |
| MP3 | ISO/IEC MPEG Audio Committee | 1993 | Lossy | 8, 11.025, 12, 16, 22.05, 24, 32, 44.1, 48 kHz | 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320 kbit/s | Any (typically uses fp internally) | OSI: LAME; Proprietary: Nero | Yes | No |
| RealAudio | RealNetworks | 1995 | Lossless, Lossy | 4, 8, 11.025, 16, 22.05, 24, 32, 44.1, 48, 96 kHz | 16-352 kbit/s | Varies | RealPlayer | Yes | |
| Speex | Xiph.Org Foundation | 24/3/2003 | Speech | 8, 16, 32 (, 48) kHz | NB: 2.15 to 24.6 kbit/s WB: 4 to 44.2 kbit/s | | OSI: reference | No | ? |
| Vorbis (Ogg) | Xiph.Org Foundation | 20/7/2002 | Lossy | 1 Hz to 200 kHz | variable | Any (typically uses fp internally) | OSI: reference, aoTuV | No | No |
| WavPack | Conifer Software | 1998 | Lossless, Lossy, Hybrid | 1 Hz to 16777.216 kHz | variable in lossless mode; 196 kbit/s and up in lossy mode (for CD audio) | varies in lossless mode; 2.2 minimum in lossy mode | OSI: reference | No | ? |
| Windows Media Audio | Microsoft Corp. | 1999 | Lossless, Lossy | 8, 11.025, 16, 22.05, 32, 44.1, 48, 88.2, 96 kHz | 4 to 768kbit/s, variable in lossless encoding | 16, 24 for lossless mode, any in lossy mode (typically uses fp internally) | Windows Media Player | Yes | Yes |
| Musepack (MPC) | Frank Klemm/MDT | 1997 | Lossy | Varies | up to 320 | Varies | OSI: reference | Not clear | ? |
Each compression method/codec produces different quality, even with the same parameters. A public listening test by Roberto Amorim (http://www.rjamorim.com/test/multiformat128/results.html) gave the following results:
(1. LAME is the usual MP3 codec. 2. iTunes uses the AAC encoder.)
It is striking that, according to these results, common formats and codecs (WMA, LAME, AAC) got lower ratings, while less well-known (and in fact free, open source) codecs were rated as producing better audio quality.
For a standard audio digitization process, any software capable of audio recording and editing is suitable, provided that it supports:
· our A/D hardware converter,
· the resolutions we want to use, and
· the file formats we want to export.
Good-quality software is distinguished by:
· the quality and number of audio processes it offers (e.g. fades, gain change, time compression, noise reduction (NR), restoration)
· multitrack or surround (5.1) support
· third-party plug-ins (e.g. VST, DirectX, Audio Units)
· number of audio formats it supports (compressed and/or uncompressed)
· ID3 tags
· support for MIDI
· batch processing of multiple files
· support for additional control hardware
Below is a list of common digital audio recorders/editors:
| Name | Creator | Platform | Price in $ | Note | Features/Compressed Formats | URL |
| Acoustica/Pro | Acon Digital Media | Windows | 40/119.90 | | Pro supports Multitrack, 5.1-7.1, VST | http://www.acondigital.com/ |
| Amadeus Pro | HairerSoft | Mac OS X | 40 | | Multitrack, MP3, Ogg Vorbis, MP4, AAC, batch, VST | http://www.hairersoft.com/AmadeusPro/ |
| Ardour | Paul Davis | Mac OS X / Linux / Unix | Free | | Multitrack, MIDI | http://www.ardour.org/ |
| Audacity | Dominic Mazzoni | Linux / Mac OS X / Unix / Windows | Free | | MP3, OGG, FLAC | http://audacity.sourceforge.net/ |
| Audio Dementia | Holladay Audio | Windows | 39.95 | | 5.1, VST | http://www.audiodementia.com/ |
| Audition | Adobe Systems | Windows | 349 | Formerly Cool Edit | Multitrack, 5.1, VST, MP3, batch | http://www.adobe.com/products/audition/ |
| Cubase | Steinberg | Windows / Mac OS X | 879 | | Multitrack, 5.1, MIDI, VST, MP3 | http://www.steinberg.net/983_1.html |
| Fast Edit | Minnetonka Audio | Windows | 199 | | DirectX | http://www.minnetonkaaudio.com/fastedit/fastEdit.html |
| Fission | Rogue Amoeba | Mac OS X | 32 | | MP3, AAC, Apple Lossless, ID tags, ringtones | http://rogueamoeba.com/fission/ |
| FlexiMusic | FlexiMusic | Windows | 20 | | MP3, WMA, AU, RAW, SND, ID tags, batch | http://www.fleximusic.com/waveditor/overview.htm |
| Goldwave | Goldwave Inc. | Windows | 45 | | mp3, ogg, aiff, au, vox, mat, snd, voc, FLAC, batch, NR | http://www.goldwave.com/ |
| Jokosher | Jokosher community | Linux | Free | | Multitrack, MP3, Ogg Vorbis, FLAC | http://www.jokosher.org/ |
| Peak | BIAS | Mac OS X | 129/599 | | mp3, VST, batch, NR | http://www.bias-inc.com/products/peakPro5/ |
| Protools | DigiDesign | Windows / Mac OS X | from 399 | Bundled with hardware | Multitrack, TDM plugins, NR | http://www.digidesign.com/ |
| Pyramix | Merging Technologies | Windows | from 800 | DSP and Native version | Multitrack, NR, VST, Cedar plugins | http://www.merging.com/pyramix/ |
| Sequoia | Sequoia | Windows | 3500 | | Multitrack, NR, BWF | http://www.sequoiadigital.com/ |
| MP3 Stream Editor | 3delite | Windows | 15 | Non-destructive audio editor | mp3, OGG, TTA, ALAC, AC3, Speex, AACH, ID tags | http://www.3delite.hu/MP3SE/ |
| n-Track Studio | FAsoft | Windows | 64 | | Multitrack, MP3, VST, DirectX | http://ntrack.com/ |
| Nuendo | Steinberg | Windows / Mac OS X | 1799 | | Multitrack, 5.1, VST, MP3 | http://www.steinberg.net/89_1.html |
| NU-Tech | Leaff Engineering | Windows | 250 | | Free SDK available | http://www.nu-tech-dsp.com/ |
| QuickAudio | Sion Software | Windows | Free | | MP3, OGG, VST, NR | http://www.sionsoft.com/qaudio.html |
| ReZound | Davy Durham | Linux | Free | Graphical audio file editor | MIDI, MP3, FLAC, OGG | http://rezound.sourceforge.net/ |
| SndBite | Bill Poser | Multiple platforms | Free | Feature for cutting a recording into many small pieces | MP3, CSL, SMP | http://billposer.org/Software/SndBite.html |
| Sound Forge | Sony | Windows | 69.95/262.95 | Formerly from Sonic Foundry | Multitrack, BWF, WMV, MP3, ID3 tag | http://www.sonycreativesoftware.com/products/soundforgefamily.asp |
| Soundbooth | Adobe | Mac OS X / Windows | 199 | Mac Intel only | MP3, WMA, NR | http://www.adobe.com/products/soundbooth/ |
| Sweep | Conrad Parker | Linux / Unix | Free | | Multitrack, OGG, Speex | http://www.metadecks.org/software/sweep/ |
| Total Recorder | HighCriteria | Windows | 18/36 | | wma, mp3, Ogg or FLAC, NR | http://www.highcriteria.com/ |
| WaveLab | Steinberg | Windows | 550 | | BWF, mp3, mp2, ogg, au, snd, ivc, osq, SDII, QT, MPEG, WMA, avi, VST | http://www.steinberg.net/128_1.html |
| Wavosaur | Wavosaur Team | Windows | Free | | mp3, VST, batch, NR | http://www.wavosaur.com/ |
| WavePad/Pro | NCH Software | Windows / Mac OS X | Free/38 | | mp3, vox, gsm, real audio, au, aif, flac, ogg | http://www.nch.com.au/wavepad/ |
| WaveSurfer | Centre for Speech Technology at KTH | Multiple platforms | Free | | MP3, CSL, SD, Ogg, and NIST/Sphere | http://www.speech.kth.se/wavesurfer/ |
| XO Wave | XO Audio | Linux / Mac OS X | Free/95 | | MP3 | http://www.xowave.com/ |
If we have a large number of files to handle, an important feature is batch processing.
For the digitization process there is no need to purchase expensive software unless we want to re-master or clean/restore an old recording. Otherwise, there is plenty of free (like Audacity and WavePad) or very cheap (like MP3 Stream Editor or Amadeus Pro) software that can handle the digitization and processing of audio.
Once the audio is digitized, it can easily be converted to several compressed formats. Depending on our needs, we may also want to cut or split the files. A number of small utilities can facilitate these processes:
· Audio Converter Pro, supports all the popular audio file formats (Windows)
· dbPowerAmp, converts from and to many formats (Windows)
· MediaCoder, an audio/video batch transcoder which uses open source audio and video codecs (Windows)
· Mp3DirectCut, edits MP3 files without decoding or re-encoding them (Windows)
· Mp3splt, splits MP3 and Ogg Vorbis files without decoding (see mp3splt-gtk and libmp3splt) (Linux, Macintosh, Windows)
· iTunes, Apple's free management and conversion software (Windows, Mac OS X)
· Switch, converts almost any audio file format (Windows, Mac OS X)
· winLAME, a tool to convert between various audio formats (Windows)
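For uncompressed WAV files, splitting can also be scripted with nothing more than Python's standard `wave` module. The sketch below is not how the utilities above work internally (tools such as Mp3splt operate directly on compressed frames); it only illustrates cutting a recording into fixed-length pieces. The function names and the silent test input are made up for this example.

```python
import io
import wave

def split_wav(source, segment_seconds):
    """Yield WAV-encoded byte strings, each holding up to segment_seconds of audio.

    source may be a file name or a binary file-like object containing WAV data.
    """
    with wave.open(source, "rb") as src:
        params = src.getparams()
        frames_per_segment = int(segment_seconds * params.framerate)
        while True:
            frames = src.readframes(frames_per_segment)
            if not frames:
                break  # end of the recording
            out = io.BytesIO()
            with wave.open(out, "wb") as dst:
                # Copy the format of the source so each piece plays identically.
                dst.setnchannels(params.nchannels)
                dst.setsampwidth(params.sampwidth)
                dst.setframerate(params.framerate)
                dst.writeframes(frames)
            yield out.getvalue()

def make_silent_wav(seconds, sample_rate=8000):
    """Build a mono 16-bit WAV of silence in memory, used here only as demo input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    buf.seek(0)
    return buf

# Cut a 2.5-second recording into 1-second pieces; the last piece is shorter.
pieces = list(split_wav(make_silent_wav(2.5), segment_seconds=1))
print(len(pieces))
```

Each element of `pieces` is a complete WAV file that could be written to disk as it is; running the same function over every file in a folder is a simple form of batch processing.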