

Survey on existing software solutions for digitisation of audio music files

What is digitization of audio?

Digitization of audio is the conversion of audio signals (live or recorded on analog magnetic media) into digital form (binary-coded files for use in computers and digital information technology).

Why is audio digitization necessary?

There are many reasons for audio digitization:

  1. Prevent further damage over time
    Magnetic tapes gradually demagnetize and lose their audio information.
  2. Prevent damage from use
    Magnetic tapes lose their coating through use.
  3. Making safety copies of high quality
    Making safety or service copies of analogue material is expensive and produces lower-quality second-generation copies. In contrast, once a document is digitized, it is very easy to produce copies of the same quality as the original digital copy.
  4. Easy management of material
    Digitized documents can easily be linked to databases where they can be organized together with their metadata. Furthermore, digitized material is much easier to access.
  5. Copyright protection
    Digitized material can be protected through several protection systems: digital watermarking, DRM (Digital Rights Management), etc.
  6. Restoration and optimization
    Digitized material can be restored with suitable software, so that noise and scratches can be reduced and sound quality improved.
  7. Exploitation and promotion
    Digital audio files can easily be converted to smaller preview or distribution files that can be made available through digital editions and the internet.


How is audio digitized?

Audio signals are the representation of sound waves in electrical (or magnetic) form. Plotted graphically, the x axis is time and the y axis is the continuously changing voltage.

Figure: Audio signal

Digital technology handles continuous data as a series of digits. To digitize the signal above, we take samples of it at a predefined frequency. This is done with analog-to-digital (A/D) converters.

Figure: Samples of audio signal

From these samples, a digital-to-analog (D/A) converter can later perform the opposite process, i.e. reconstruct the continuous signal and produce sound waves.
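The sampling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an ideal sine wave stands in for the analog voltage, and the function names `sample_and_quantize` and `reconstruct` are hypothetical. A real D/A converter also applies a reconstruction filter, which is omitted here.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, bits, duration_s):
    """Sample a sine wave of amplitude 1.0 and round each sample to a signed integer code."""
    max_code = 2 ** (bits - 1) - 1            # largest positive code, e.g. 32767 for 16 bits
    n_samples = round(sample_rate_hz * duration_s)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                # time of the n-th sample in seconds
        voltage = math.sin(2 * math.pi * freq_hz * t)   # idealised analog value in [-1, 1]
        codes.append(round(voltage * max_code))
    return codes

def reconstruct(codes, bits):
    """Map integer codes back to values in [-1, 1] (the D/A direction, without filtering)."""
    max_code = 2 ** (bits - 1) - 1
    return [code / max_code for code in codes]

# One period of a 1 kHz tone sampled at 8 kHz with 16-bit resolution gives 8 samples.
codes = sample_and_quantize(freq_hz=1000, sample_rate_hz=8000, bits=16, duration_s=0.001)
values = reconstruct(codes, bits=16)
print(codes)
```

The rounding in `sample_and_quantize` is where information is lost: each reconstructed value can differ from the original voltage by up to half a quantization step.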

Critical aspects of digitization

The quality of digitization depends on the following aspects:

  1. Good quality of the analog signal
    Good-quality microphones or analogue playback machines.
  2. Good quality of converters
    Poor-quality converters have lower fidelity and introduce unwanted side effects (digital noise, signal aliases, etc.).
  3. High resolution of samples
    The higher the resolution of each sample, the more detail is stored.
  4. High frequency of sampling
    From the figure above we can see that a higher sampling frequency brings us closer to the original signal.

Sample Frequency and Resolution

While the first two points depend on hardware quality, the last two (3 and 4) are mainly a matter of choice.

· Sample Frequency (SF): Since a wave has a full period, we need at least two samples per period to capture it (see figure). That means that for an audio signal of frequency x Hz we need a sample frequency of at least 2x Hz (Nyquist theorem). Accordingly, to capture the full audio range (20-20,000 Hz) we need an SF of at least 40,000 Hz (CD quality: SF 44.1 kHz). Still, this frequency is not sufficient for high-quality digitization, because high frequencies (above 10 kHz) are captured with only 2-4 samples per period. This produces "harder" high frequencies and aliasing effects. To resolve this problem, a higher SF (96 kHz) is needed. Naturally, for narrower ranges (speech) 44.1 kHz is sufficient.

· Resolution: The resolution of a sample corresponds to the dynamic range (in dB) we capture. The higher the resolution, the greater the dynamic range (the gradation between quiet and loud). Resolution is measured in bits (how many bits are used to describe the size of each sample). The CD quality of 16 bits gives a gradation of 65,536 values, or a dynamic range of about 96 dB (14 bits would give only 16,384 values, about 84 dB). As the human ear responds to ranges up to 120 dB, a high-fidelity capture needs a higher bit depth (e.g. 24 bits).
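The arithmetic behind both points can be checked with a short Python sketch. The function names are hypothetical, and the dynamic range uses the common approximation of 20·log10(2^bits), roughly 6 dB per bit, which ignores dither and converter imperfections.

```python
import math

def samples_per_period(signal_hz, sample_rate_hz):
    """How many samples fall within one period of a tone."""
    return sample_rate_hz / signal_hz

def alias_frequency(signal_hz, sample_rate_hz):
    """Frequency at which a tone appears if it is sampled without an anti-alias filter."""
    folded = signal_hz % sample_rate_hz
    nyquist = sample_rate_hz / 2
    return sample_rate_hz - folded if folded > nyquist else folded

def quantization_levels(bits):
    """Number of distinct values a sample of the given bit depth can take."""
    return 2 ** bits

def dynamic_range_db(bits):
    """Approximate dynamic range in dB for a given bit depth."""
    return 20 * math.log10(2 ** bits)

for rate in (44100, 96000):
    print(rate, "Hz:", round(samples_per_period(20000, rate), 2), "samples per period at 20 kHz")
for bits in (14, 16, 24):
    print(bits, "bits:", quantization_levels(bits), "levels,", round(dynamic_range_db(bits), 1), "dB")
```

For instance, `alias_frequency(30000, 44100)` shows where a 30 kHz tone would fold back into the audible band if it reached a 44.1 kHz converter unfiltered.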

The sampling frequency and the bit depth we use affect the size of the digital audio files. The table below shows how much space, in megabytes, is needed to store one hour of uncompressed stereo audio at different sampling frequencies and bit depths.

| Bits | 32 kHz | 44.1 kHz | 48 kHz | 88.2 kHz | 96 kHz |
| 16 | 460.8 | 635.04 | 691.2 | 1270.08 | 1382.4 |
| 24 | 691.2 | 952.56 | 1036.8 | 1905.12 | 2073.6 |
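The table values follow from multiplying sample rate, bytes per sample, channels and duration. A minimal sketch, assuming two channels and 1 MB = 1,000,000 bytes (assumptions that reproduce the 635.04 MB figure for 16 bits at 44.1 kHz); the function name `pcm_size_mb` is hypothetical.

```python
def pcm_size_mb(sample_rate_hz, bits, channels=2, seconds=3600):
    """Size of uncompressed PCM audio in megabytes (1 MB = 1,000,000 bytes)."""
    bytes_per_sample = bits // 8
    total_bytes = sample_rate_hz * bytes_per_sample * channels * seconds
    return total_bytes / 1_000_000

for bits in (16, 24):
    sizes = [pcm_size_mb(rate, bits) for rate in (32000, 44100, 48000, 96000)]
    print(bits, "bits:", sizes)
```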

Audio file types

Digitized audio is stored in audio files. These files can be RAW (just the sample values) or can use a codec, which describes how the digital information is encoded in, and decoded from, the file.
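The difference between raw sample data and a file with a descriptive header can be illustrated with Python's standard `wave` module. This is a sketch, not a recommended workflow: the helper names `write_wav` and `read_header` are hypothetical, and the 44-byte header length applies to the simple PCM files this module writes, not to every WAV file.

```python
import io
import struct
import wave

def write_wav(samples, sample_rate_hz, channels=1):
    """Pack 16-bit integer samples into an in-memory WAV file and return its bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)               # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate_hz)
        wav_file.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buffer.getvalue()

def read_header(wav_bytes):
    """Return the parameters stored in the WAV header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        return {
            "channels": wav_file.getnchannels(),
            "bits": wav_file.getsampwidth() * 8,
            "sample_rate_hz": wav_file.getframerate(),
            "frames": wav_file.getnframes(),
        }

samples = [0, 1000, -1000, 32767, -32768]
wav_bytes = write_wav(samples, sample_rate_hz=44100)
header = read_header(wav_bytes)
raw_payload = wav_bytes[44:]                   # the RAW sample data that follows the header
print(header, len(raw_payload), "bytes of raw data")
```

Without the header, a player would have to be told the sample rate, bit depth and channel count separately in order to interpret the same bytes.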

Because uncompressed audio files are large and impractical to transfer over the internet, several methods of audio file compression have been developed to facilitate distribution. These methods trade file size against quality.

There are three major groups of audio file formats:

  • Uncompressed audio formats, such as WAV, BWF, AIFF, SDII and AU:
    These formats are very similar. They use PCM (Pulse Code Modulation) to store the audio data. They differ mainly in the header, where the sampling rate, bit depth and other information are stored.
  • Lossless compression audio formats, such as FLAC, Monkey's Audio (.APE), WavPack, SHN, TTA, Apple Lossless (ALAC) and lossless Windows Media Audio (WMA):
    Uncompressed file formats use the same space for silence as for loud audio. Lossless compression formats use algorithms that identify the parts of the file carrying less information, and in this way they reduce the file size (to about 1/2) without any loss of information.
  • Lossy compression audio formats, such as MP3, Ogg Vorbis, Windows Media Audio (WMA), Advanced Audio Coding (AAC), etc.:
    These file formats use codecs that apply psychoacoustic processing to remove less perceptible audio information. They slightly reduce the audio quality (smaller dynamic and frequency range) in order to produce much smaller files (less than 1/10 of the original size).
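The idea that lossless compression saves space on low-information passages while restoring every bit can be demonstrated with a general-purpose compressor. The sketch below uses Python's `zlib` purely as a stand-in: it is not an audio codec, and dedicated formats such as FLAC work differently and do considerably better on real music. The helper names are hypothetical.

```python
import random
import struct
import zlib

def pcm_bytes(samples):
    """Pack 16-bit integer samples as little-endian PCM bytes."""
    return struct.pack("<%dh" % len(samples), *samples)

def compressed_size(raw):
    """Compress losslessly, verify the round trip, and return the compressed size in bytes."""
    packed = zlib.compress(raw, 9)
    assert zlib.decompress(packed) == raw      # lossless: the original bytes come back exactly
    return len(packed)

rng = random.Random(0)                         # fixed seed so the run is repeatable
silence = pcm_bytes([0] * 44100)               # one second of mono silence
noise = pcm_bytes([rng.randint(-32768, 32767) for _ in range(44100)])

for name, raw in (("silence", silence), ("noise", noise)):
    print(name, len(raw), "bytes ->", compressed_size(raw), "bytes")
```

Silence shrinks to a tiny fraction of its original size, while random noise, which carries maximal information, barely compresses at all.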

Of the three groups above, the first is used for master files and CDs; the second is less common and helps to transfer master files over the internet; and the third is very widely used for distributing music on the internet and, in some cases, for audio on DVDs.

In compressed audio formats, besides the SF and bit depth (bits per sample), there are other significant parameters:

· Bit rate: Determines the connection speed needed for streaming and reflects the amount of compression. E.g. a 196 kbit/s stream consumes about 196 kbit/s of connection bandwidth during playback from the internet.

· Variable Bit Rate (VBR): An option that lets the encoder use several bit rates according to the complexity of the signal, so that less bandwidth is used when it is not needed. E.g. VBR 64 to 128 kbit/s uses 64 kbit/s when the audio signal is simple and rises up to 128 kbit/s when the signal is complex.

· DRM: Some compression formats include Digital Rights Management systems to protect intellectual and commercial rights, allowing the end user to play the file only under specific conditions.

· Multitrack support: Some codecs support more than two tracks in order to play 5.1 surround sound.

Storage size

The file size depends on the format/codec we use, as shown in the table below.

| Format | Resolution/bitrate | Size for 1 hour stereo music | Compression ratio | Hours of music in 1 DVD |
| WAV, AIFF | 24 bit, 96 kHz | 2.025 GB | 1 | 2.2 |
| WAV, AIFF | 16 bit, 44.1 kHz | 620 MB | 1 | 7 |
| MP3 | 256 kbps | 112 MB | 5.5 | 37.5 |
| WMA | 128 kbps | 62 MB | 10 | 75 |
| AC3 | 128 kbps | 56 MB | 11 | 78.5 |
| MP3 | 128 kbps | 56 MB | 11 | 78.5 |
| MP3 | 64 kbps | 28 MB | 22 | 157 |
| OGG | 48 kbps | 20 MB | 31 | 220 |
| RM | 56 kbps | 16.3 MB | 38 | 270 |
| MP3 | 28 kbps | 10.6 MB | 58 | 415 |
| RM | 28 kbps | 9.6 MB | 64 | 460 |
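For constant-bit-rate files, the size and compression ratio can be estimated directly from the bit rate. The sketch below assumes 1 MB = 1,000,000 bytes and takes 16-bit, 44.1 kHz stereo PCM as the reference; the table above appears to use a slightly different unit convention, so its sizes come out a few percent smaller. Function names are hypothetical.

```python
CD_PCM_KBPS = 44100 * 16 * 2 / 1000    # bit rate of 16-bit, 44.1 kHz stereo PCM: 1411.2 kbit/s

def stream_size_mb(bitrate_kbps, seconds=3600):
    """Size in MB (1 MB = 1,000,000 bytes) of a constant-bit-rate stream."""
    return bitrate_kbps * 1000 * seconds / 8 / 1_000_000

def compression_ratio(bitrate_kbps, reference_kbps=CD_PCM_KBPS):
    """How many times smaller the stream is than the uncompressed reference."""
    return reference_kbps / bitrate_kbps

for kbps in (256, 128, 64):
    print(kbps, "kbit/s:", stream_size_mb(kbps), "MB per hour, ratio",
          round(compression_ratio(kbps), 1))
```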

Common lossy compression formats

In the table below the file formats/codecs are shown with detailed information for each one.

| Format | Creator | Release date | Algorithm | Sample rate | Bit rate | Bits per sample | Common implementations | Patented | DRM |
| AAC | ISO/IEC MPEG Audio Committee | 1997 | Lossy, Hybrid | 8 kHz to 192 kHz | 8 to 529 kbit/s (stereo) | Any (typically uses fp internally) | OSI: FAAC; Proprietary: iTunes, Nero Digital Audio | Yes | FairPlay |
| AC3 | Dolby Laboratories | 1992 | Lossy | up to 48 kHz | max 640 kbit/s | 16 or 24 | FFMpeg, DVD Audio tracks | Yes | ? |
| ALAC | Apple Computer | 28/4/2004 | Lossless | 44.1, 48 kHz | variable | ? | Proprietary: QuickTime, iTunes, Real Player | Yes | ? |
| ATRAC | Sony Corp. | 1991 | Lossless, Lossy | 44.1, 48 kHz | 48-352 kbit/s | 16 | MiniDisc, Walkman, VAIO, Clie, Playstation3 | Yes | Yes |
| FLAC | Xiph.Org Foundation | 20/7/2001 | Lossless | 1 Hz to 1048.57 kHz | variable | 4, 8, 16, 24 (32) | OSI: reference | No | No |
| Monkey's Audio | Matthew T. Ashland | 2000+ | Lossless | 8, 11.025, 12, 16, 22.05, 24, 32, 44.1, 48 kHz | variable | | Proprietary | ? | No |
| MP3 | ISO/IEC MPEG Audio Committee | 1993 | Lossy | 8, 11.025, 12, 16, 22.05, 24, 32, 44.1, 48 kHz | 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320 kbit/s | Any (typically uses fp internally) | OSI: LAME; Proprietary: Nero | Yes | No |
| RealAudio | RealNetworks | 2005 | Lossless, Lossy | 4, 8, 11.025, 16, 22.05, 24, 32, 44.1, 48, 96 kHz | 16-352 kbit/s | Varies | Realplayer | Yes | |
| Speex | Xiph.Org Foundation | 24/3/2003 | Speech | 8, 16, 32 (, 48) kHz | NB: 2.15 to 24.6 kbit/s; WB: 4 to 44.2 kbit/s | | OSI: reference | No | ? |
| Vorbis (Ogg) | Xiph.Org Foundation | 20/7/2002 | Lossy | 1 Hz to 200 kHz | variable | Any (typically uses fp internally) | OSI: reference, aoTuV | No | No |
| WavPack | Conifer Software | 1998 | Lossless, Lossy, Hybrid | 1 Hz to 16777.216 kHz | variable in lossless mode; 196 kbit/s and up in lossy mode (for CD audio) | varies in lossless mode; 2.2 minimum in lossy mode | OSI: reference | No | ? |
| Windows Media Audio | Microsoft Corp. | 1999 | Lossless, Lossy | 8, 11.025, 16, 22.05, 32, 44.1, 48, 88.2, 96 kHz | 4 to 768 kbit/s, variable in lossless encoding | 16, 24 for lossless mode, any in lossy mode (typically uses fp internally) | Windows Media Player | Yes | Yes |
| Musepack (MPC) | Frank Klemm/MDT | 1997 | Lossy | Varies | up to 320 | Varies | OSI: reference | Not clear | ? |


Quality comparison of compression formats

Each compression method or codec produces different quality, even with the same parameters. The results of a public test run by Roberto Amorim (http://www.rjamorim.com/test/multiformat128/results.html) were as follows:

[Figure: results of the listening test]

(1. LAME is the usual MP3 codec. 2. iTunes uses the AAC encoder.)

Notably, according to these results, the common formats and codecs (WMA, LAME, AAC) received lower ratings, while less known codecs, which are in fact free and open source, were rated as producing better audio quality.


Audio Digitization Software

For a standard audio digitization process, any software capable of audio recording and editing is suitable, provided that it supports:

· our A/D hardware converter,

· the resolutions we want to use, and

· the file formats we want to export.

Good-quality software is distinguished by:

· the quality and number of audio processes it offers (e.g. fades, gain change, time compression, noise reduction (NR), restoration)

· multitrack or surround (5.1) support

· third-party plug-ins (e.g. VST, DirectX, Audio Units)

· the number of audio formats it supports (compressed and/or uncompressed)

· ID3 tags

· support of MIDI

· batch processing of multiple files

· additional control hardware support

Below is a list of common digital audio recorders/editors:


| Name | Creator | Platform | Price in $ | Note | Features/Compressed Formats | URL |
| Acoustica/ Pro | Acon Digital Media | Windows | 40/119.90 | | Pro supports Multitrack, 5.1-7.1, VST | http://www.acondigital.com/ |
| Amadeus Pro | HairerSoft | Mac OS X | 40 | | Multitrack, Mp3, Ogg Vorbis, Mp4, AAC, Batch, VST | http://www.hairersoft.com/AmadeusPro/ |
| Ardour | Paul Davis | Mac OS X / Linux / Unix | Free | | Multitrack, MIDI | http://www.ardour.org/ |
| Audacity | Dominic Mazzoni | Linux / Mac OS X / Unix / Windows | Free | | MP3, OGG, FLAC | http://audacity.sourceforge.net/ |
| Audio Dementia | Holladay Audio | Windows | 39.95 | | 5.1, VST | http://www.audiodementia.com/ |
| Audition | Adobe Systems | Windows | 349 | Ex CoolEdit | Multitrack, 5.1, VST, MP3, batch | http://www.adobe.com/products/audition/ |
| Cubase | Steinberg | Windows / Mac OS X | 879 | | Multitrack, 5.1, MIDI, VST, MP3 | http://www.steinberg.net/983_1.html |
| Fast Edit | Minnetonka audio | Windows | 199 | | DirectX | http://www.minnetonkaaudio.com/fastedit/fastEdit.html |
| Fission | Rogue Amoeba | Mac OS X | 32 | | MP3, AAC, Apple Lossless, ID tags, ringtones | http://rogueamoeba.com/fission/ |
| FlexiMusic | FlexiMusic | Windows | 20 | | MP3, WMA, AU, RAW, SND, ID tags, batch | http://www.fleximusic.com/waveditor/overview.htm |
| Goldwave | Goldwave Inc. | Windows | 45 | | mp3, ogg, aiff, au, vox, mat, snd, voc, FLAC, batch, NR | http://www.goldwave.com/ |
| Jokosher | Jokosher community | Linux | Free | | multitrack, MP3, Ogg Vorbis, FLAC | http://www.jokosher.org/ |
| Peak | BIAS | Mac OS X | 129/599 | | mp3, VST, batch, NR | http://www.bias-inc.com/products/peakPro5/ |
| Protools | DigiDesign | Windows / Mac OS X | from 399 | bundled with hardware | multitrack, TDM plugins, NR | http://www.digidesign.com/ |
| Pyramix | Merging Technologies | Windows | from 800 | DSP and Native version | multitrack, NR, VST, cedar plugins | http://www.merging.com/pyramix/ |
| Sequoia | Sequoia | Windows | 3500 | | multitrack, NR, BWF | http://www.sequoiadigital.com/ |
| MP3 Stream Editor | 3delite | Windows | 15 | Non-destructive audio editor | mp3, OGG, TTA, ALAC, AC3, Speex, AACH, ID tags | http://www.3delite.hu/MP3SE/ |
| n-Track Studio | FAsoft | Windows | 64 | | multitrack, MP3, VST, DirectX | http://ntrack.com/ |
| Nuendo | Steinberg | Windows / Mac OS X | 1799 | | Multitrack, 5.1, VST, MP3 | http://www.steinberg.net/89_1.html |
| NU-Tech | Leaff Engineering | Windows | 250 | Available free SDK | | http://www.nu-tech-dsp.com/ |
| QuickAudio | Sion Software | Windows | Free | | MP3, OGG, VST, NR | http://www.sionsoft.com/qaudio.html |
| ReZound | Davy Durham | Linux | Free | Graphical audio file editor | MIDI, MP3, FLAC, OGG | http://rezound.sourceforge.net/ |
| SndBite | Bill Poser | Multiple platforms | Free | Feature for cutting a recording into many small pieces | MP3, CSL, SMP | http://billposer.org/Software/SndBite.html |
| Sound Forge | Sony | Windows | 69.95/262.95 | Formerly from Sonic Foundry | Multitrack, BWF, WMV, MP3, ID3 tag | http://www.sonycreativesoftware.com/products/soundforgefamily.asp |
| Soundbooth | Adobe | Mac OS X / Windows | 199 | Mac Intel only | MP3, WMA, NR | http://www.adobe.com/products/soundbooth/ |
| Sweep | Conrad Parker | Linux / Unix | Free | | Multitrack, OGG, SPEEX | http://www.metadecks.org/software/sweep/ |
| Total Recorder | HighCriteria | Windows | 18/36 | | wma, mp3, Ogg or FLAC, NR | http://www.highcriteria.com/ |
| WaveLab | Steinberg | Windows | 550 | | BWF, mp3, mp2, ogg, au, snd, ivc, osq, SDII, QT, MPEG, WMA, avi, VST | http://www.steinberg.net/128_1.html |
| Wavosaur | Wavosaur Team | Windows | Free | | mp3, VST, batch, NR | http://www.wavosaur.com/ |
| WavePad/Pro | NCH Software | Windows / Mac OS X | Free/38 | | mp3, vox, gsm, real audio, au, aif, flac, ogg | http://www.nch.com.au/wavepad/ |
| WaveSurfer | Centre for Speech Technology at KTH | Multiple platforms | Yes | | MP3, CSL, SD, Ogg, and NIST/Sphere | http://www.speech.kth.se/wavesurfer/ |
| XO Wave | XO Audio | Linux / Mac OS X | Free/95 | | MP3 | http://www.xowave.com/ |


If we have a large number of files to handle, an important parameter is the batch processing feature.
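In essence, batch processing means applying the same operation to every file in a folder. A minimal Python sketch of the idea, assuming uncompressed WAV files and the standard `wave` module: the function names `describe_wav` and `batch_describe` are hypothetical and not taken from any product listed above, and the demo creates two silent test files in a temporary folder so that it runs on its own.

```python
import tempfile
import wave
from pathlib import Path

def describe_wav(path):
    """Read the header of one WAV file and return its basic parameters."""
    with wave.open(str(path), "rb") as wav_file:
        rate = wav_file.getframerate()
        return {
            "name": path.name,
            "channels": wav_file.getnchannels(),
            "bits": wav_file.getsampwidth() * 8,
            "sample_rate_hz": rate,
            "seconds": wav_file.getnframes() / rate,
        }

def batch_describe(folder):
    """Apply describe_wav to every .wav file in a folder, in name order."""
    return [describe_wav(path) for path in sorted(Path(folder).glob("*.wav"))]

# Demo: write two one-second stereo files of silence, then process them as a batch.
with tempfile.TemporaryDirectory() as folder:
    for name, rate in (("a.wav", 44100), ("b.wav", 48000)):
        with wave.open(str(Path(folder) / name), "wb") as wav_file:
            wav_file.setnchannels(2)
            wav_file.setsampwidth(2)                        # 16 bits per sample
            wav_file.setframerate(rate)
            wav_file.writeframes(b"\x00" * (rate * 2 * 2))  # rate frames x 2 channels x 2 bytes
    report = batch_describe(folder)

for entry in report:
    print(entry)
```

The same loop structure would serve for conversion or normalisation by replacing `describe_wav` with another per-file function.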

For the digitization process itself there is no need to purchase expensive software, unless we want to re-master or clean/restore an old recording. Otherwise there is plenty of free (like Audacity and WavePad) or very cheap (like MP3 Stream Editor or Amadeus Pro) software that can handle digitization and audio management.

Audio Compression and Management Software

Once the audio is digitized, it can easily be converted to several compressed formats. Depending on our needs, we may also want to cut or split the files. A number of small utilities can facilitate these processes:

· Audio Converter Pro, supports all the popular audio file formats (Windows)

· dbPowerAmp, converter from and to many formats (Windows)

· MediaCoder, an audio/video batch transcoder which uses open source audio and video codecs (Windows)

· Mp3DirectCut, edits MP3 files without decoding or re-encoding them (Windows)

· Mp3splt, splits MP3 and Ogg Vorbis files without decoding (see mp3splt-gtk and libmp3splt) (Linux, Macintosh, Windows)

· iTunes, Apple's free management and conversion software (Windows, Mac OS X)

· Switch, converts almost any audio file format (Windows, Mac OS X)

· winLAME, a tool to convert between various audio formats (Windows)


References

· Comparative test April, 2004

· Several comparative audio tests

· EBU subjective listening tests on low-bitrate audio codecs

· Interactive blind listening tests of audio codecs over the internet

· Hydrogenaudio comparison of lossless formats

· A Survey of Audio Coders for Electronic-Art Music (eContact!, May 2007)

· http://compression-links.info/Audio


· Martin Dietz and Stefan Meltzer, "CT-aacPlus: a state-of-the-art audio coding system," EBU Technical Review No. 291, July 2002 (http://www.ebu.ch/trev_291-dietz.pdf)

· Sinha, D. and Johnston, J. D., “Audio compression at low bit rates using a signal adaptive switched filterbank,” IEEE ASSP, 1996, pp. 1053-1057.

· Johnston, J. D., Sinha, D., Dorward, S. and Quackenbush, S., “AT&T perceptual audio coder (PAC)” in Collected Papers on Digital Audio Bit-Rate Reduction, Gilchrist, N. and Grewin, C. (Ed.), Audio Engineering Society, 1996.

· Joshua Haberman, Lossless Editing of Lossy-compressed Audio Data, 2004 (http://www.reverberate.org/computers/thesis/LosslessEditing.pdf)

· Audio editors for Linux, 2005 (http://www.linuxformat.co.uk/pdfs/LXF69.round.pdf )

 
wp7.txt · Last modified: 2008/03/31 15:00 by 195.251.97.20