Linux Sound and Audio

last revision
08 August 2015, 12:06pm

This book collects succinct recipes for playing, recording and editing audio with the Linux operating system. The main emphasis is on using the command line (bash shell). Topics may include: speech recognition, speech synthesis, sound file splitting and joining, sound effects, music synthesis ...

To select a device for recording, use set followed by the name of the

Players And Jukeboxes ‹↑›

some tools
waon - converts waves to notes
timidity - converts notes to waves

Configuring The Audio System ‹↑›

test speaker channels

 speaker-test -D plug:surround51 -c 6 -l 1 -t wav

Alias for quick command-line volume set (works also remotely via

 alias setvol='aumix -v'

try to fix a broken sound server on Ubuntu

 sudo killall -9 pulseaudio; pulseaudio >/dev/null 2>&1 &

Disable annoying sound emanations from the PC speaker

 sudo rmmod pcspkr

reload the alsa sound system when it stops playing

 sudo alsa force-reload

Streaming Audio Files ‹↑›

'Streaming' an audio file means making it available to users one small chunk at a time. The advantage is that the user can start listening to the sound file before the entire file has downloaded to her computer.

stream an mp3 file 'song.mp3' using the netcat tool

 nc -l -p 2000 < song.mp3

save the Tuesday 14:00 SBS language program to a file

 mplayer -dumpstream -dumpfile sbs.dump

recode the raw audio obtained above as a wav file 'audiodump.wav'

 mplayer -ao pcm sbs.dump

Playing Sound Files ‹↑›

Before you begin playing sound, make sure you've set the master and PCM volume levels with the mixer.

Playing One Sound File ‹↑›

play the audio from an mp4 file without showing the video

 mplayer -vo null video.mp4
 mplayer -novideo video.mp4  # the same

play the file 'pentastar.aiff'

 play pentastar.aiff

play the sound file 'new.wav' starting at 20 seconds in the file

 play new.wav trim 20

play a random '.wav' file somewhere on the computer

 play "$(locate '*.wav' | shuf | head -1)"

play a file by writing to the OSS device /dev/dsp (requires a system that still provides OSS)

 sox "$(locate '*.wav' | shuf | head -1)" -t ossdsp /dev/dsp

select a 'wav' sound file in the current folder to play

 PS3="Enter a number: "; select f in *.wav;do play "$f"; break; done

play the MP3 file 'september-wind.mp3'

 mpg321 september-wind.mp3

play 'in.wav' removing all silence at the beginning

 play in.wav silence 1 0 2%

music players
amarok -
rhythmbox -

Playing Remote Files ‹↑›

play the MP3 stream at the url


play the MP3 stream from a url with a 2MB audio buffer

 mpg321 -b 2048

Playlists Or Multiple Sound Files ‹↑›

play a random audio file whilst applying a bass boosting effect

 play "$(locate '*.ogg' | shuf | head -1)" bass +3

play the first three wav files found

 play $(locate '*.wav' | head -3)

play the first three mp3 files in the 'eu' folder tree

 mpg321 $(locate '*/eu/*.mp3' | head -3)

play each 'wav' sound and then sleep for 1 second

 locate '*.wav' | shuf | xargs -I{} echo "play {};sleep 1" | bash

Store a playlist of mp3s on variable and play with mpg123

 PLAYLIST=$(ls -1) ; mpg123 -C $PLAYLIST

Random Order Playing ‹↑›

play in random order '.wav' sound files

 locate '*.wav' | shuf | xargs -I{} play "{}"

play mp3s in the current folder in a random order

 mpg321 -z *.mp3

play an infinite random loop of mp3s in the current folder

 mpg321 -Z *.mp3

play all mp3s in a random order and pause for a user key press

 for i in $(locate -i "*.mp3" |shuf); do mpg321 $i; read r; done

play a random list, pause and play again if 'a' is pressed

command line audio file players
play - the sox player
mpg321 - for mp3s
mpg123 - same
afplay - the osx player ?
playsound - can play mp3s as well
lame -

play each sound in random order and then sleep for 2 seconds

 locate '*.wav' | shuf | xargs -I{} bash -c "play {};sleep 2"
 locate '*.wav' | shuf | xargs -I{} bash -c "play \"{}\";sleep 2"

another way to achieve the same thing

 locate '*.wav' | shuf | xargs -I{} echo "play {};sleep 1" | bash

In order to kill the above process one needs to press <control c> several times.

play in random order '.wav' and '.mp3' sound files

 locate '*.wav' '*.mp3' | shuf | xargs -I{} playsound "{}"

play in a loop random '.wav' sound files from the 'ja' folder tree

 locate '*/ja/*.wav' | shuf | xargs -I{} play "{}"

play in a loop random '.mp3' sound files from the 'eu' folder tree

 locate '*/eu/*.mp3' | xargs -I{} mpg321 -z "{}"
 locate '*/eu/*.mp3' | shuf | xargs -I{} mpg321 "{}"

create and use a function to play a random loop of sound in a folder

 so(){ locate "*/$1/*.mp3" | shuf | xargs -I{} mpg321 "{}"; }; so eu

randomly play a mp3 file

 mpg123 "`locate -r '\.mp3$'| awk '{a[NR]=$0}END{print a['"$RANDOM"' % NR]}'`"

present a menu of mp3 files with 'indo' in the name and play one

 IFS=$'\n';select f in $(locate '*indo*mp3'); do mpg321 $f; done

the same as above but slower (because of -r regular expressions)

 IFS=$'\n';select f in $(locate -r '.*indo.*\.mp3$'); do mpg321 $f; done

The 'IFS=' trick above allows the command to handle sound files with spaces in their names.
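
A sketch of another way to cope with spaces in file names: read the list one line at a time instead of changing IFS. The function name 'play_each' is made up for this example, and the player is passed as an argument (it defaults to the sox 'play' command).

```shell
# play_each PLAYER: run PLAYER once for every file name read from stdin.
# 'play_each' is an illustrative name, not a standard tool.
play_each() {
  local player="${1:-play}" f
  while IFS= read -r f; do
    [ -n "$f" ] || continue          # skip empty lines
    "$player" "$f" < /dev/null       # stop the player from reading the list itself
  done
}

# example: locate '*indo*mp3' | play_each mpg321
```

Because each line is kept whole in "$f", names containing spaces reach the player as a single argument.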

a bash function to find and play sound with a particular name/path

 so(){ select f in $(locate "*$1*mp3"); do mpg321 $f; done;}; so eu

a bash function to choose and play a sound file

    function indo {
      echo 'searching for sound files ...'; local IFS=$'\n';
      select f in $(locate -r '.*indo.*\.mp3$'); do
        mpg321 "$f";
      done
    }

Buffering audio may be useful for when the system is running many processes or otherwise has a lot of activity,

play the MIDI file 'copa-cabana.mid'

 playmidi copa-cabana.mid

play the MIDI file 'copa-cabana.mid' on a non-MIDI sound card

 playmidi -f copa-cabana.mid

while [ "$r" != "q" ] ;do read -p'>' r; done; unset r

Internet Radio ‹↑›

 for i in $(locate -i "*.mp3" |shuf);{ r=a; while [ "$r" == "a" ]; do mpg321 $i; read -p'>' r; done; unset r; }

play random music from station ''

 mpg123 `curl -s | sed -e 's#"#\n#g' | grep mp3$ | xargs`

Listen to BBC Radio from the command line.

 bbcradio() { local s;echo "Select a station:";select s in 1 1x 2 3 4 5 6 7 "Asian Network an" "Nations & Local lcl";do break;done;s=($s);mplayer -playlist ""${s[@]: -1}".asx";}

listen to an internet radio station with 'mplayer'

 mplayer mms://server:port/path

plays music from somafm

 read -p "Which station? "; mplayer --reallyquiet -vo none -ao sdl${REPLY}.pls

Play music from youtube without download

 wget -q -O - `youtube-dl -b -g $url`| ffmpeg -i - -f mp3 -vn -acodec libmp3lame -| mpg123 -

Play Lists ‹↑›

internet radio tools
get-iplayer - bbc iplayer radio stations
mimms - download mms:// streams

add 10 random unrated songs to xmms2 playlist

 xmms2 mlib search NOT +rating | grep -r '^[0-9]' | sed -r 's/^([0-9]+).*/\1/' | sort -R | head | xargs -L 1 xmms2 addid

Generate a playlist of all the files in the directory, newest first

 find . -type f -print0 | xargs -r0 stat -c %Y\ %n | sort -rn | gawk '{sub(/.\//,"",$2); print $2}' > /tmp/playlist.m3u
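
The command above prints only the second field of each line, so it cuts off file names that contain spaces. A sketch of one alternative, assuming GNU find (for -printf); the function name 'playlist_newest_first' is made up for this example.

```shell
# playlist_newest_first DIR: print the files under DIR, newest first,
# one path per line, relative to DIR. Needs GNU find for -printf.
playlist_newest_first() {
  # %T@ is the modification time in seconds, %P the path without the DIR prefix
  find "${1:-.}" -type f -printf '%T@\t%P\n' | sort -rn | cut -f2-
}

# example: playlist_newest_first . > /tmp/playlist.m3u
```

The time stamp and the name are separated by a tab, so spaces in names survive; names containing newlines are still not handled.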

Editing Sound Files ‹↑›

playlist formats

graphical sound editing tools
audacity - the most popular graphical sound editor
sweep -
rosegarden - another graphical editor
jokosher - another sound editor

open the sound file 'mixdown.wav' in the program 'snd'

 snd mixdown.wav

Sox Pipe Lines ‹↑›

pipe the result of one sox command to another sox command

 play "|sox -n -p synth 2" "|sox -n -p synth 2 tremolo 10" stat

Mixing ‹↑›

Mixing is the process of overlaying 2 or more sounds to create a new one.

mix together two audio files and save a 'mixed.flac'

 sox -m music.mp3 voice.wav mixed.flac

Splitting Sound Files ‹↑›

Splitting sound files refers to dividing a single sound file up into several smaller files each containing a time interval of the original file.

command line sound editing
sox - perhaps the best tool
cutmp3 - a command line interactive editing tool for mp3 files only
quelcom - command line editing of wav and mp3 audio

copy the first 40 seconds of 'old.wav' to the sound file 'new.wav'

 sox old.wav new.wav trim 0 40

Split lossless audio (ape, flac, wav, wv) by cue file

 cuebreakpoints <cue file> | shnsplit -o <lossless audio type> <audio file>

split 's.wav' into multiple files, each 5 seconds long

 sox s.wav out.wav trim 0 5 : newfile : restart

This produces a series of files called 'out001.wav out002.wav ...'

split a sound file on silence of 0.6 of a second or more

 sox in.wav out.wav silence 1 0.1 1% 1 0.6 1% : newfile : restart

A series of files out001.wav, out002.wav, out003.wav will be created containing the chunks of the file which has been split on pauses.

Here '1%' is the noise threshold: any part of the file whose level is below 1% is considered to be silence.

split a sound file on silence of 0.9 of a second or more

 sox in.wav out.wav silence 1 0.1 0.1% 1 0.9 0.1% : newfile : restart

Here the noise threshold is much lower, but that doesn't make much difference with digital recordings. A value of 0.9 seconds may be adequate for splitting sentences spoken by people.

play the sound file with 3 seconds of silence at the beginning

 play b.wav pad 3

split an mp3 audio file on intervals of silence and remove the silence

 mp3splt -s -p rm songs.mp3

Joining ‹↑›

Joining here means adding a sound file on to the end of another.

concatenate 'a.wav' and 'b.wav' to make 'ab.wav'

 sox a.wav b.wav ab.wav

join together all ".wav" files in this folder and save as "big.wav"

 sox *.wav big.wav

join all wavs together in a random order and save as 'big.wav'

 sox $(shuf -e *.wav) big.wav

add 1.5s silence at end of each wav and join them in a random order

 for f in *.wav; { sox "$f" "yyzz.$f" pad 0 1.5; }; sox $(shuf -e yyzz.*) big.wav

add all mp3 files in this folder to 'ff_MP3WRAP.mp3'

 mp3wrap -a ff_MP3WRAP.mp3  *.mp3

create a new mp3 containing all mp3s starting with 'b'

 mp3wrap b*.mp3

There seems to be a limit of about 200 files which can be placed in an mp3wrap file.

Splicing ‹↑›

Splicing refers to inserting a sound within another (without mixing).

Silence ‹↑›

Silence is golden and is to sound what the number zero is to mathematics.

Removing Silence ‹↑›

play in.wav without any silence at the beginning

 play in.wav silence 1 0 2%

In this case 2% is the threshold: the volume level must be under 2% for that part of the sound file to be considered silence.

remove all silence from the beginning of in.wav

 sox in.wav out.wav silence 1 0 2%

play in.wav, skipping the beginning until at least 0.5 seconds of sound above 0.1% volume is detected

 play in.wav silence 1 0.50 0.1%

trim silence (anything < 1% volume) until sound lasting more than 0.1s

 sox in.wav out.wav silence 1 0.1 1%

trim silence until we detect at least 0.3 seconds of noise, and then trim everything after we detect at least 0.3 seconds of silence.

 sox in.wav out.wav silence 1 0.3 1% 1 0.3 1%

trim silence from the beginning and middle of in.wav.

 sox in.wav out.wav silence 1 0.1 1% -1 0.1 1%

The key trick here is the -1 parameter.

play in.wav without any gaps or silence

 play in.wav silence 1 0.1 1% -1 0.2 1%

The '0.2' here means that sox trims silence once it encounters at least 0.2 seconds of it (that is, sound below the 1% noise threshold).

trim out all silence except pauses of 0.5 seconds or less

 sox in.wav out.wav silence 1 0.1 1% -1 0.5 1%

reduce all pauses of more than 1 second to 1 second

 sox in.wav out.wav silence -l 1 0.1 1% -1 1.0 1%

play with all pauses of more than 1/2 secs reduced to 1/2 second

 play in.wav silence -l 1 0.1 1% -1 0.5 1%

Adding Silence ‹↑›

pad the sound file 'b.wav' with 3 seconds of silence at the beginning

 sox b.wav output.wav pad 3

pad 'b.wav' with 1.5 secs of silence at start and end, save as 'out.wav'

 sox b.wav out.wav pad 1.5 1.5

add 1.5 secs of silence at the end of in.wav and save as 'out.wav'

 sox in.wav out.wav pad 0 1.5

add 1, 1.5 and 2 secs of silence at the positions 5, 10 and 15 seconds

 sox in.wav out.wav pad 1@0:05 1.5@0:10 2@0:15; play out.wav

place 4000 samples of silence at position 3 minutes in the file

 sox in.wav out.wav pad 4000s@3:00;

place 2 seconds of silence at position 1 minute 10 seconds

 sox in.wav out.wav pad 2@1:10;

Pitch ‹↑›

play in.wav with the pitch raised by 2 semitones

 play in.wav pitch 200

play in.wav with the pitch lowered by 4 semitones

 play in.wav pitch -400

raise the pitch of in.wav by 1 semitone and save as out.wav

 sox in.wav out.wav pitch 100
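
The 'pitch' effect takes its shift in cents (hundredths of a semitone), which is why 2 semitones is written as 200 above. A small sketch of the conversion; the helper name 'semitones_to_cents' is made up for this example.

```shell
# semitones_to_cents N: convert whole semitones to the cents that
# the sox 'pitch' effect expects (100 cents = 1 semitone).
semitones_to_cents() {
  echo $(( $1 * 100 ))
}

# example: play in.wav pitch "$(semitones_to_cents -4)"   # same as: pitch -400
```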

Sampling Quality ‹↑›

change the sample rate to very low bandwidth (and quality)

 sox in.wav out.wav rate -q; ls -thor in.wav out.wav

Visualizing ‹↑›

create and view a spectrogram of the sound file

 sox in.wav -n spectrogram; feh spectrogram.png

create and view a spectrogram of seconds 20 to 25

 sox in.wav -n trim 20 5 spectrogram; feh spectrogram.png

Looping ‹↑›

create a new sound file 'out.wav' with in.wav repeated once (so it plays twice).

 sox in.wav out.wav repeat 1

a simple loop to play a random file and then wait for a key press

 while true; do play "$(locate '*.wav'|shuf|head -1)"; read -n1 x; done

Reverb ‹↑›

play in.wav with a reverberation (like in a concert hall)

 play in.wav reverb

add a reverberation to in.wav and save in out.wav

 sox in.wav out.wav reverb

Reversing ‹↑›

play in.wav backwards

 play in.wav reverse

Adjusting Audio Speed ‹↑›

increase the speed of the file 'slow.aiff' a little, save as new.aiff

 sox slow.aiff new.aiff speed 1.027

Effects ‹↑›

discard the right channel of a 2 channel file

 sox file1.wav out.wav remix 1

discard right channel, apply lowpass filter, clone left channel and save

 sox file1.wav file2.wav remix 1 lowpass 8000 channels 2

play a tremolo synthesised note

 play "|sox -n -p synth 2 tremolo 10 fade 0 2 1" stat

perform a format translation and also apply four effects (down-mix to one channel, sample rate change, fade-in, normalise), storing the result at a bit-depth of 16

 sox recital.au -b 16 recital.wav channels 1 rate 16k fade 3 norm

listen to 'test.wav' with 'reverb' added

 sox test.wav -d reverb

add basic reverb to 'old.wav' and write the output to file 'new.wav'

 sox old.wav new.wav reverb

add a spacey, echoing reverb to file 'test.wav' and save to 'new.wav'

 sox test.wav new.wav reverb 100 50 100  # reverberance, HF damping and room scale, all in percent

add a 100 millisecond echo to 'old.wav' and write output to 'new.wav'

 sox old.wav new.wav echo .5 .5 100 .5

add a one-second echo to 'old.wav' and save changes in 'new.wav'

 sox old.wav new.wav echo .5 .5 1000 .5

add a deep, "alien-sounding" chorus to 'old.wav' and save to 'new.wav'

 sox old.wav new.wav chorus 1 .5 100 1 5 9 -t

add a subtle "vibro-champ" effect to 'old.wav' and save in 'new.wav'

 sox old.wav new.wav vibro 1

add an effect of a "maxed-out vibro-champ" to the file 'old.wav'

 sox old.wav new.wav vibro 30 1

add an "underwater" flange to the file 'old.wav' and save to 'new.wav'

 sox old.wav new.wav flanger .5 .5 4 .5 1 -t

add a phased "breathing" effect to 'old.wav' and save to 'new.wav'

 sox old.wav new.wav phaser .5 .5 .5 .9 .5 -t

Using a decay greater than .5 may result in feedback

add a 100 millisecond chorus to 'old.wav' and save to 'new.wav'

 sox old.wav new.wav chorus 1 .5 100 1 1 1 -t

add a "tin-can" echo effect to 'old.wav' and save in 'new.wav'

 sox old.wav new.wav echo 1 .5 5 .5  # see also 'echos'

add "wah-wah"-like flange to 'old.wav' and write the output to `new.wav'

 sox old.wav new.wav flanger .5 .5 .5 1 2 -t

add a heavy phase to 'old.wav' and write the output to 'new.wav'

 sox old.wav new.wav phaser 1 .5 4 .5 1 -s

produce a 3 second, 48kHz audio file containing a sine wave swept from 300 to 3300 Hz

 sox -n output.wav synth 3 sine 300-3300

the same as above, but play the sound instead of saving it to a file

 play -n synth 3 sine 300-3300

mp3splt - splits mp3/ogg files on silence
oggsplt - the same as 'mp3splt'
sox - sox can split on silence
mp3wrap - join lots of mp3 files into one
mp3splt - split into separate files those joined with mp3wrap
www: reverb
www: vibro-champ
www: flange

Finding Sound Files ‹↑›

The 2 command line tools are 'find' and 'locate'. Locate is fast and simple, but can't find very recent files, because it searches a database that is only updated periodically. Find is slow and complicated but can do everything. Both tools only search by file name. To find files that really contain mp3 audio, you'll have to use the 'file' tool.

find files with a '.mp3' extension

 locate -r '\.mp3$'

find '.wav' and '.mp3' named files

 locate -r '\.mp3$' -r '\.wav$' | less

find all mp3 named files in the users home folder

 locate -r "$HOME.*\.mp3$" | less

locate real mp3 files (not just those with the file extension)

 locate -r '\.mp3$' | xargs -I{} file "{}" | grep -i audio | less

see how long it takes to examine almost all the mp3 files

 time { locate -r '\.mp3$' | xargs -I{} file "{}" | grep -i audio; }

find 'mp3' and 'wav' files in the users folder tree

 find ~ -iname '*.mp3' -o -iname '*.wav' | less

find all audio files in the users home folder tree

 find ~ | file -f - | grep audio | less

Recording Sound ‹↑›

effects jargon

record a wav file from the microphone and save it to a file 'hello.wav'

 rec hello.wav

This begins an 8,000 Hz, monaural, 8-bit wav recording. That is not particularly good quality (fidelity); it is more or less telephone-voice fidelity.

make a high-fidelity stereo recording from the mic saving to 'new.wav'

 rec -s -c 2 -r 44100 new.wav

Record microphone input and output to date stamped mp3 file

 arecord -q -f cd -r 44100 -c2 -t raw | lame -S -x -h -b 128 - `date +%Y%m%d%H%M`.mp3

select the LINE IN jack as the recording source

 amixer set Line capture

select the microphone jack as the recording source

 amixer set MIC capture

record the input of your sound card into ogg file

 rec -c 2 -r 44100 -s -t wav - | oggenc -q 5 --raw --raw-chan=2 --raw-rate=44100 --raw-bits=16 - > MyLiveRecording.ogg

make a recording in CD audio format, and save to 'goodbye.cdr'

 rec goodbye.cdr

Record MP3 audio via ALSA using ffmpeg

 ffmpeg -f alsa -ac 2 -i hw:1,0 -acodec libmp3lame -ab 96k output.mp3

Stereo Recording ‹↑›

record half an hour of stereo audio

 rec -c 2 radio.aiff trim 0 30:00

Multi Track Recording ‹↑›

record a new track in a multi-track recording.

 rec -M take1.aiff take1-dub.aiff

Recording From Cassette ‹↑›

record a stream of audio such as an LP or cassette and split it into multiple audio files at points with 2 seconds of silence. Recording does not start until audio is detected, and it stops after 10 minutes of silence.

 rec -r 44100 -b 16 -s -p silence 1 0.50 0.1% 1 10:00 0.1% | \
 sox -p song.ogg silence 1 0.50 0.1% 1 2.0 0.1% : newfile : restart

Audio Cds ‹↑›

www: debian
package: cdtool

show a list of tracks on a compact disc

 cdir   # might print 'unknown cd - 43:14 in 8 tracks...'

eject a CD


This command will also eject an unmounted CD-ROM (data CD)

convert the file 'new.wav' to an audio CD format file

 sox new.wav new.cdr

use cdinfo to display information about an audio CD

Playing Audio Compact Disks ‹↑›

play an audio cd with mplayer

 mplayer -cdrom-device /dev/sr0 cdda://

play an audio CD


play an audio CD, beginning with the third track

 cdplay 3

play an audio CD, from the 1st track to the 4th track

 cdplay 1 4

play only the third track of an audio CD

 cdplay 3 3

pause the current CD playback


restart a paused CD


restart a paused CD from the beginning

 cdplay x

stop the current CD playback


use cdplay with the 'shuffle' argument to play the CD tracks in random order

 cdplay shuffle

Copying Cd Audio ‹↑›

copy track 7 of audio cd to a cd-quality wav file in current directory

 cdda2wav -t7 -d0 -x -D /dev/cdrom

copy all tracks on an audio CD to CD-quality CD audio-format files

 cdda2wav -D /dev/cdrom -x -O cdr -d0 -B

sample the third track from a scratched audio cd in the default cd-rom drive using "paranoid" data verification, and write the output to a wav format file in the current directory

 cdparanoia -w 3-3

sample the entire audio CD using "paranoid" data verification

 cdparanoia -w -B

Burning Compact Discs ‹↑›

The issues involved in 'burning' or writing compact discs are... audio or digital cd, input format of files, spaces in names of files etc.

recording tools

cdrecord seems to have been replaced by wodim on some Debian-style distributions.

install some good tools for command line audio burning

 apt-get install cdrecord wodim ffmpeg normalize-audio libavcodec52

make the file names a bit less problematic (no spaces, commas etc)

 rename 'y/_,-/ /;s/ +/./g' *

BUT, check what that will do first with

 rename -n 'y/_,-/ /;s/ +/./g' *

convert all music files to .wav format for burning

 for i in *; do ffmpeg -i "$i" "$i.wav"; done

burn audio cd with wodim

 wodim dev=/dev/cdrw -v -audio *.wav

burn an incomplete audio cd with wodim (more tracks addable later)

 wodim dev=/dev/cdrw -v -nofix -audio *.wav

But the cd above may not be playable in many cd players until the audio cd is 'fixed'

burn an audio cd with explicit device name

 wodim dev=ATAPI:/dev/hda -audio -v -eject *.wav

blank a cd/rw in preparation for writing new data

 wodim -vv dev=/dev/cdrw blank=all

burn 'cello.cdr' to the disc in the CD-R drive whose target ID is 2 on the primary SCSI bus

 cdrecord dev=0,2,0 -audio cello.cdr

burn all '.cdr' files in this folder at double speed to the CD-R drive whose target ID is 2 on the primary SCSI bus, and give verbose output

 cdrecord dev=0,2,0 speed=2 -v -audio *.cdr

run a test burn of 'symphony.cdr' to the disc in the CD-R drive, target ID is 6 (LUN 1) on the primary SCSI bus

 cdrecord dev=0,6,1 -dummy -audio symphony.cdr

burn the data track 'band-info' and all the audio tracks in the current directory with a '.cdda' extension to the CD-R drive whose target ID is 2 on the primary SCSI bus

 cdrecord dev=0,2,0 band-info -audio *.cdda

Analysing Audio Cds ‹↑›

Commands to examine what is on an audio compact disc.

Converting Sound Formats ‹↑›

burning tools
brasero - a graphical application for the gnome linux desktop
wodim - a modern version of cdrecord
mp3cd - Burns normalized audio CDs from MP3s/WAVs/Oggs/FLACs
mp3roaster - A script for burning CDs out of MP3/OGG/FLAC/WAV files

sound formats
flac - compressed without information loss (lossless)
mp3 - proprietary, compressed with information loss
ogg - non-proprietary, compressed with information loss
wav - usually uncompressed (PCM)

An important distinction in sound file formats is between the file format (wav, etc) and the encoding format (mp3, pcm, etc). The encoding format determines how the samples in the sound file are stored and compressed.

translate an audio file in Sun AU format to a Microsoft WAV file

 sox recital.au recital.wav

convert wav files to flac

 flac --best *.wav

converts a single flac file with associated cue file into multiple flac files (here $1 is the flac file and $2 is the cue file)

 cuebreakpoints "$2" | shnsplit -o flac "$1"

converts 'raw' (or headerless) audio to a self-describing file format

 sox -r 16k -e signed -b 8 -c 1 voice-memo.raw voice-memo.wav

batch convert from OGG to WAV

 for f in *.ogg ; do mplayer -quiet -vo null -vc dummy -ao pcm:waveheader:file="$f.wav" "$f" ; done

Convert a bunch of oggs into mp3s

 for x in *.ogg; do ffmpeg -i "$x" "$(basename "$x" .ogg).mp3"; done

Convert .wma files to .ogg with ffmpeg

 find -name '*wma' -exec ffmpeg -i {} -acodec vorbis -ab 128k {}.ogg \;

create a file 's.flac', a lossless compression of s.wav

 flac s.wav

convert a sound file from 'wav' format to 'ogg' format

 oggenc trac9.wav

convert from wav to ogg with highest quality

 oggenc -q10 trac9.wav

convert from flac to ogg

 oggenc trac9.flac

convert a wav file 'track9.wav' to mp3 format with variable bitrate

 lame -h -V 6 track9.wav track9.mp3

encode from wav to mono channel mp3 sampled at 22050 Hz, bitrate 64k

 ffmpeg -i test.wav -acodec libmp3lame -ac 1 -ar 22050 -ab 64k test.mp3

It is probably better and simpler to use lame to encode to mp3

show what audio file formats are supported by ffmpeg

 ffmpeg -formats


Since the 'mp3' file format (encoding) is so common, converting an audio file to or from this format is a very usual task. Since mp3 is a proprietary encoding, many Linux programs do not support it by default, but this is a minor obstacle which is easily overcome.

mpg321 and mpg123 are very similar, except that mpg321 is optimized for computers which don't have a floating point processor. Also, mpg321 does not have all the functionality of mpg123.

convert a sound file from 'wav' format to 'mp3'

 lame test.wav test.mp3  # this is slow

convert to an mp3 with variable bitrate

 lame -h -V 6 track9.wav track9.mp3

convert wav into mp3 using lame

 lame -V2 rec01.wav rec01.mp3

convert 'old.mp3' into a wav file 'new.wav' (a new file is created)

 mpg321 -w new.wav old.mp3   # the file 'old.mp3' is unchanged
 mpg123 -w new.wav old.mp3   # the same

convert mp3 into m4b (audiobook format)

 mpg123 -s input.mp3 | faac -b 80 -P -X -w -o output.m4b -

an old method of converting to 'wav' format, but it doesn't work!

 mpg321 -b 10000 -s remix.mp3 | sox -t raw -r 44100 -s -w -c 2 - remix.wav

if sox was compiled with mp3 support, you can convert with

 sox test.mp3 test.wav   # see the steps for compiling sox with mp3 support
 sox -h | grep AUDIO     # to see if sox has mp3 support

convert mp3 audio file to 'ogg vorbis' format

 mp32ogg file.mp3

convert all '.aif' files to mp3 changing the file extension to '.mp3'

 for f in *.aif; do lame "$f" "${f%.*}.mp3"; done

convert all mp3s in folder to wavs, save in 'so' folder with '.wav' name

 for f in *.mp3; { mpg321 -w "so/${f%.*}.wav" "$f"; }

convert all wavs in the folder to mp3

 for f in *.wav; { lame "$f"; }

convert all wavs in the folder to mp3 changing the file extension to 'mp3'

 for f in *.wav; { lame "$f" "${f%.*}.mp3"; }
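
The loops above change the file extension with the bash expansion ${f%.*}, which removes the shortest trailing part that starts with a dot. A minimal sketch of the same idea as a function; the name 'swap_ext' is made up for this example.

```shell
# swap_ext FILE EXT: print FILE with its last extension replaced by EXT.
swap_ext() {
  printf '%s\n' "${1%.*}.$2"
}

# example: for f in *.wav; do lame "$f" "$(swap_ext "$f" mp3)"; done
```

Only the last extension is removed, so a name such as 'a.b.aif' becomes 'a.b.mp3'.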

Using The Sox Tool ‹↑›

conversion software
glame -
mame -
lame -
wav2cdr - Converts wav files into CD-ROM audio file format

Installing Sox ‹↑›

install sox on a debian-type linux system

 sudo apt-get install sox

install sox on a redhat-type linux system

 sudo yum install sox

sox options or switches
-c - the number of channels
-n - use a null file as input or output
-p -

see if sox has support for mp3 files

 sox -h | grep AUDIO

add read (play, convert) support for mp3 files to sox

 sudo apt-get install libsox-fmt-mp3

But sox will still not be able to create mp3 files. This is probably a good thing: use the 'ogg' format instead, which is not proprietary. To check for mp3 support, look through the list of file types printed by 'sox -h' for 'mp3'.

if sox has no mp3 support, just convert to 'wav' format with

 mpg321 -w new.wav sound.mp3

play the sound file 's.wav'

 play s.wav

make the volume level in sound files more or less the same


reverse the sound in 'old.wav' and write the output to 'new.wav'

 sox old.wav new.wav reverse

Fidelity Or Sound Quality ‹↑›

The quality or 'fidelity' of a sound file is affected by the number of samples per second, the number of channels (mono, stereo etc), the encoding (mp3, pcm, etc) and the bit size of each sample. Generally, telephone sound data is of low quality and CD audio is of high quality.
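
For uncompressed (PCM) audio these factors multiply directly into the file size: samples per second, times channels, times bytes per sample, times seconds. A sketch of that arithmetic; the function name 'pcm_bytes' is made up for this example, and file headers are ignored.

```shell
# pcm_bytes RATE CHANNELS BITS SECONDS: size in bytes of uncompressed PCM audio
pcm_bytes() {
  echo $(( $1 * $2 * $3 / 8 * $4 ))
}

# pcm_bytes 44100 2 16 60   # one minute of CD-quality stereo sound
# pcm_bytes 8000 1 8 60     # one minute of telephone-quality mono sound
```

One minute at CD quality comes to 10584000 bytes, against 480000 bytes at telephone quality.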

change the sampling rate of 'old.wav' to 7,000 Hz, and write to 'new.wav'

 sox old.wav -r 7000 new.wav

Volume ‹↑›

normalize-audio - a package for 'normalising' (evening out) the sound volume in an audio file or files

In the world of sound, volume is often referred to as 'gain'. Some tools, such as mp3gain, record a gain adjustment inside the file without re-encoding the audio, so the change can be undone. Others, such as 'sox -v', rewrite the amplitude of the samples themselves.

steps to install sox from source code with mp3 support
install libmad (decoding) and lame (encoding) for mp3
cd /usr/local/src
tar zxvf sox-n.n.n.tar.gz
cd sox-n.n.n
./configure
make
sudo make install

peruse the current mixer settings

 amixer | less

output the microphone settings

 amixer get MIC

output the second PCM settings

 amixer get PCM,1

Set the master volume to 90%

 aumix -v 90

To change a mixer setting, give the amixer 'set' command as an option

set the master volume to 75 percent

 amixer set Master 75%

set the PCM volume to 30

 amixer set PCM 30

The special 'mute' and `unmute' arguments are used for muting the

unmute the microphone and turn it on for recording

 amixer set MIC unmute capture

mute the microphone

 amixer set MIC mute

unmute the master volume and set it to 80 percent volume

 amixer set Master 80% unmute

Increase the mplayer maximum volume

 mplayer dvd:// -softvol -softvol-max 500

Normalising Audio ‹↑›

sound volume tools
aumix -
amixer -
alsamixer -
mplayer -

automatically adjust (normalise) the volume for a set of mp3 files

 mp3gain -a -k *mp3

automatically adjust the volume of each mp3 sound file individually

 mp3gain -r -k *mp3

undo the changes made by mp3gain on the sound file 's.mp3'

 mp3gain -u s.mp3

normalise 'quiet.mp3' to 2 decibels above the default target volume (89 dB)

 mp3gain -r -d 2.0 quiet.mp3

apply a gain of 3 steps (1.5 dB each) to the sound file 'quiet.mp3'

 mp3gain -g 3 quiet.mp3

adjust the volume in sound files


raise volume and unmute if necessary

 amixer -c 0 set Master 1+ unmute

raise the volume of 'old.wav' twofold and write the output to 'new.wav'

 sox -v 2 old.wav new.wav

lower volume of file 'old.wav' by half and write the output to 'new.wav'

 sox -v.5 old.wav new.wav

raise the volume of 'quiet.cdr' as high as possible without distortion

 sox quiet.cdr -n stat -v           # might print '3.125'
 sox -v 3.125 quiet.cdr loud.cdr    # apply the factor that was printed

Extracting Audio From Video ‹↑›

A video may have more than one audio track (for example, for different languages) and for this reason you need to be careful that you extract the correct track.

normalize-audio - normalises the volume of WAV MP3 and OGG files
mp3gain - Lossless mp3 normalizer with statistical analysis

extract one audio track from 'movie.VOB' saving as 'f.wav'

 mplayer movie.VOB -ao pcm:file=f.wav

But if there are several audio tracks (for different languages) you may not get the one you want. You can use 'avidemux' for this or try the following...

extract an audio track from a multilingual video file

 mencoder -aid 2 -oac copy file.avi -o english.mp3

extract audio stream from an AVI file

 mencoder "${file}" -of rawaudio -oac mp3lame -ovc copy -o audio/"${file/%avi/mp3}"

Extract an audio track from a video file

 mencoder -of rawaudio -ovc copy -oac mp3lame -o output.mp3 input.avi

Extract audio from Mythtv recording to Rockbox iPod using ffmpeg

 ffmpeg -ss 0:58:15 -i DavidLettermanBlackCrowes.mpg -acodec copy DavidLettermanBlackCrowes.ac3

Dump an audio stream from flv (using mplayer)

 mplayer -dumpaudio -dumpfile test.mp3 test.flv

extract audio from a start position for a given duration in a video

 mplayer -vc null -vo null -ao pcm <input video file> -ss <start> -endpos <duration>

create file audiodump.wav with audio from second 195 to second 246 (the opening theme).

 mplayer -vc null -vo null -ao pcm Fireflyep10.avi -ss 195 -endpos 51

Synth Music ‹↑›

tools to extract audio
avidemux - a graphical editor which allows easy audio extraction
mplayer -
mencoder -
ffmpeg -

play the note e once

 play -n synth 4 pluck E2

pluck the note 'd3' twice, slowly

 play -n synth 4 pluck D3 repeat 1

a bash function to pluck a note

 pluck() { play -n synth 4 pluck $1 repeat 1; }

pluck the note 'd3' twice quickly

 play -n synth 2 pluck D3 repeat 1

play a highish A note (440 HZ)

 play -n synth 4 pluck A4 repeat 2

pluck the note D# (in the 3rd octave) 2 times quickly

 play -n synth 1 pluck D#3 repeat 1

pluck a short run of notes in the 4th octave

 for n in E4 F4 G4;do play -n synth 1 pluck $n;done

play all the guitar string notes 3 times each

 for n in E2 A2 D3 G3 B3 E4;do play -n synth 4 pluck $n repeat 2;done

extract some notes from the beep man page (???)

 man beep | sed -e '1,/Note/d; /BUGS/,$d' | awk '{print $2}' | xargs -IX

play a nice A-minor seventh chord with a pipe-organ sound, 2 secs duration

 play -n -c1 synth sin %-12 sin %-9 sin %-5 sin %-2 fade h 0.1 2 0.1

sox synthesizer options
synth 2 - generate 2 seconds of sound (a shorter note than 'synth 4', a longer one than 'synth 1')
repeat 2 - repeat twice (so 3 times in total)
pluck D#3 - a D# guitar note in the 3rd octave (higher than the 2nd)
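
The chord recipes in this section write their notes as '%-12', '%-9' and so on. In the sox manual, '%n' stands for a note n semitones away from the A at 440 Hz. The sketch below, which assumes that awk is installed, prints the frequency for such an offset; the name 'semitone_freq' is made up for this example.

```shell
# semitone_freq: frequency in Hz of the note n semitones from A (440 Hz)
# (hypothetical helper; each semitone multiplies the frequency by 2^(1/12))
semitone_freq() {
  awk -v n="$1" 'BEGIN { printf "%.2f\n", 440 * 2 ^ (n / 12) }'
}
```

So 'semitone_freq -12' gives 220.00, the A one octave below 440 Hz, which is the lowest note of the chord above.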

play the same chord for 2 seconds, fading out slowly (over 1 second)

 play -n -c1 synth sin %-12 sin %-9 sin %-5 sin %-2 fade h 0.2 2 1

play gradually descending organy sounds

 for x in {1..12}; do play -n -c1 synth sin %-$x sin %-9 sin %-5 sin %-2 fade h 0.2 1 .5; done

Create an audio test CD of sine waves from 1 to 99 Hz

 (echo CD_DA; for f in {01..99}; do echo "$f Hz">&2; sox -nt cdda -r44100 -c2 $f.cdda synth 30 sine $f; echo TRACK AUDIO; echo FILE \"$f.cdda\" 0; done) > cdrdao.toc && cdrdao write cdrdao.toc && rm ??.cdda cdrdao.toc

options and effects used
-n - use a 'null' file as input (only useful with 'synth')
-c1 - one channel (mono)

Information About Audio Files ‹↑›

This section details how to find out information about sound files (this information may be referred to as 'meta-data'). It may include such things as the duration of a sound file in hours, minutes and seconds, when it was recorded, what sound file format the file is in, etc.

Duration Of The Audio ‹↑›

find the duration of the sound file 's.wav'

 soxi -d s.wav
 soxi s.wav | grep -i duration   # the same
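
With the '-D' option soxi prints the duration as a number of seconds, which is easier to sort and add up but harder to read. A minimal sketch of turning such a number back into hours, minutes and seconds; the function name 'fmt_duration' is an assumption for this example and any fraction of a second is simply dropped.

```shell
# fmt_duration: format a number of seconds as hh:mm:ss
# (hypothetical helper; fractions of a second are discarded)
fmt_duration() (
  s=${1%.*}          # drop any fraction of a second
  s=${s:-0}
  printf '%02d:%02d:%02d\n' $(( s / 3600 )) $(( s % 3600 / 60 )) $(( s % 60 ))
)
```

It might be used like this

 fmt_duration "$(soxi -D s.wav)"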

display all available information about all '.wav' files

 locate '*.wav' | xargs -d'\n' -I{} soxi "{}" | less

display the file name and duration for all '.wav' files

 locate '*.wav' | xargs -d'\n' -I{} bash -c 'basename "$1"; soxi -d "$1"' _ {} | less

sort all '.wav' files by duration in seconds

 locate '*.wav' | xargs -d'\n' -I{} bash -c 'echo "$1"; soxi -D "$1"' _ {} | paste - - | sort -t$'\t' -nk2 | less

the same, without sorting

 locate '*.wav' | xargs -d'\n' -I{} bash -c 'echo "$1"; soxi -D "$1"' _ {} | less
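
Once the durations are available as plain seconds they can also be added up. This is only a sketch, assuming awk is installed and that each input line holds one number as printed by 'soxi -D'; the name 'total_seconds' is made up for this example.

```shell
# total_seconds: add up one number per line read from standard input
# (hypothetical helper; prints the total with two decimal places)
total_seconds() {
  awk '{ sum += $1 } END { printf "%.2f\n", sum }'
}
```

For example, to get the total playing time of all '.wav' files

 locate '*.wav' | xargs -d'\n' -I{} soxi -D {} | total_seconds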

display the duration of all '.wav' files

 locate '*.wav' | xargs -d'\n' -I{} soxi -d "{}" | less

The -d'\n' option is necessary if the file names contain quotes. Without it xargs tries to interpret the quotes, stops with an error at the first unmatched one, and the rest of the files are not processed.
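
Another way around the quoting problem is to avoid xargs and read the file names one line at a time in the shell itself. The following is a sketch of that idea, not a replacement for the recipes above; the function name 'for_each_line' is hypothetical.

```shell
# for_each_line: run the given command once for every line of standard input,
# passing the line (unchanged, quotes and spaces included) as the last argument
for_each_line() {
  while IFS= read -r line; do
    "$@" "$line" < /dev/null   # stop the command from eating the file list
  done
}
```

It could be used like this

 locate '*.wav' | for_each_line soxi -d | less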

display the real sound file type of all files with a '.wav' extension

 locate '*.wav' | xargs -d'\n' -I{} soxi -t "{}" | less

Samples ‹↑›

Metadata Tags ‹↑›

Some audio files may have textual information encoded into the file. This information is referred to as 'tags'. The latest version of sox can write these tags in mp3 files. This 'meta-data' is used to provide information about the audio file, such as the name of the musician, the title of the song, or the type of music or sound which is contained in the file.

Update Ogg Vorbis file comments

 for f in *.ogg; do vorbiscomment -l "$f" | sed 's/peter gabriel/Peter

ID3 ....

'id3' is the format for textual information which can be included in an mp3 file. There are several versions of id3 tags.
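
The oldest version, ID3v1, is the simplest: a fixed block of 128 bytes at the very end of the file, made up of the letters 'TAG', a 30 byte title, a 30 byte artist, a 30 byte album, a 4 byte year, a 30 byte comment and a 1 byte genre. The sketch below reads the title straight from that block with standard tools. The name 'id3v1_title' is invented for this example, it assumes plain ASCII text in the tag, and it knows nothing about the later id3v2 tags, which are stored differently.

```shell
# id3v1_title: print the title stored in an ID3v1 tag, if the file has one
# (hypothetical helper; returns a non-zero status when no ID3v1 tag is found)
id3v1_title() (
  # the tag is the last 128 bytes; padding NUL bytes are turned into spaces
  tag=$(tail -c 128 "$1" | tr '\0' ' ')
  case $tag in
    TAG*) printf '%s\n' "$tag" | cut -b 4-33 | sed 's/ *$//' ;;
    *)    exit 1 ;;
  esac
)
```

For real work one of the tools listed in this section, such as eyeD3, is a better choice.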

soxi - can display all kinds of information

set the album name for all mp3s in this folder to 'archie'

 eyeD3 -A archie *.mp3

set the album name to 'yolgnu phrases' for all files ending in 's.mp3'

 eyeD3 --album="yolgnu phrases" *s.mp3

id3 tools
eyeD3 - displays and edits mp3 id3 tags
id3 - doesn't seem to handle recent id3 versions
id3ren - id3 tagger and renamer
id3tool - command line editor for id3 tags
id3v2 - A command line id3v2 tag editor
exfalso - audio tag editor for GTK+
extract - displays meta-data from files of arbitrary type
mp3info - An MP3 technical info viewer and ID3 1.x tag editor
mp3info-gtk - MP3 info viewer and ID3 1.x tag editor -- GTK+ version
mp3rename - Rename mp3 files based on id3tags

Speech Synthesis ‹↑›


Espeak ‹↑›

say hello using the first male voice

 espeak -v m1 "hello, everybody"

count sheep using the default voice

 echo {1..3}" sheep" | espeak

say hello with an english accent using the 4th female voice

 espeak -v en+f4 "hello, everyone, "

say a spanish phrase with the correct accent

 espeak -v es+m4 "hola a todos "

speak each line of standard input aloud, using the portuguese voice

 awk '{print}' | espeak -v pt --stdin

Festival ‹↑›

say hello using festival

 echo "hello everyone " | festival --tts

say something in spanish (??? not working)

 echo "buenos dias" | festival --tts --language spanish

read aloud the contents of the file 'doc.txt'

 festival --tts doc.txt

start a festival shell

 festival

say hello from within the festival shell

 (SayText "hello everyone")

quit the festival shell

 (quit)

get help for some festival commands


read the contents of the file 'words.txt' within the festival shell

 (tts "words.txt" nil)

Free Speech ‹↑›

A speech synthesis program which uses recorded voice.

voice synthesis tools
espeak - small, supports multiple languages, sounds metallic
festival -
open mind speech - uses recording
praat - ???
recite - english only text reader

Screen Readers ‹↑›

A screen reader reads aloud the text which is displayed on the screen.

Speech Recognition ‹↑›

It appears that no simple speech recognition tools are currently available for Linux. The software which is available is designed for researchers or may be difficult to install and configure.

screader -

Internet Phone Calls ‹↑›

XMPP/Jingle and SIP are voice over IP (VoIP) protocols

one can use a 'sip' protocol client such as

 ekiga, twinkle, wengophone

or a proprietary application such as skype

output the microphone to a remote computers speaker (via ssh)

 dd if=/dev/dsp | ssh -c arcfour -C username@host dd of=/dev/dsp

Beeping And Synthesizing ‹↑›

an alias to ring the bell

 alias beep='echo -en "\007"'

Disable annoying sound emanations from the PC speaker

 sudo rmmod pcspkr

Alarms Etc ‹↑›

Set an alarm to wake up

 sleep 5h && rhythmbox path/to/song


reduce mp3 bit-rate (and size, as well)

 lame --mp3input -m m --resample 24 input.mp3

play the mp3 sound file 'test.mp3'

 mpg123 test.mp3
 mpg321 test.mp3   # more efficient where there is no floating-point unit

play the MP3 stream at the url


When not recording sound, keep the inputs muted

change the sampling rate of 'old.wav' to 7,000 Hz, and write to 'new.wav'

 sox old.wav -r 7000 new.wav

convert the file 'new.wav' to the audio CD format, save in 'cd-single'

 sox new.wav -t cdr cd-single

encode an MP3 file from a WAV file called 'september-wind.wav'

 lame september-wind.wav september-wind.mp3

It usually takes some time to encode an MP3 file.

convert the MP3 file 'remix.mp3' to a WAV file 'remix.wav'

 mpg321 -b 10000 -s remix.mp3 | sox -t raw -r 44100 -e signed -b 16 -c 2 - remix.wav

Use curl to save an MP3 stream

 curl -sS -o "$outfile" -m "$showlengthinseconds" "$streamurl"

generate white noise

 cat /dev/urandom > /dev/dsp

try to play a sound file to dsp, but this doesn't work

 cat /home/matth3wbishop/out003.wav > /dev/dsp

Curiosities ‹↑›

Play music from pure data

 sudo cat /usr/share/icons/*/*/* > /dev/dsp

Hear the mice moving

speech recognition tools
sphinx2 - developed at Carnegie Mellon
pocketsphinx - a 'light-weight' version of sphinx
julius -
simon - uses julius
gnome-voice-control - uses sphinx to control the 'gnome' desktop

Beep siren

 tempo=33; slope=10; maxfreq=888; function sinus { echo "s($1/$slope)*$maxfreq"|bc -l|tr -d '-'; }; for((i=1;;i++)); do beep -l$tempo -f`sinus $i`; done

Unencrypted voicechat using netcat to transfer the data

 [On PC1] nc -l -p 6666 > /dev/dsp
 [On PC2] cat /dev/dsp | nc <PC1's IP> 6666

Keep from having to adjust your volume constantly

 find . -iname \*.mp3 -print0 | xargs -0 mp3gain -krd 6 && vorbisgain -rfs .

generate noise from the computers memory

 sudo cat /dev/mem > /dev/dsp

Notes ‹↑›

this toggles mute on the Master channel of an alsa soundcard

 amixer sset Master toggle

Normalize volume in your mp3 library

 find . -type d -exec sh -c "normalize-audio -b \"{}\"/*.mp3" \;

a page about linux command line sound players
an ncurses sound player

Extracting From Cd Dvd ‹↑›

extract track 9 from an audio cd

 mplayer -fs cdda://9 -ao pcm:file=track9.wav

extract track 9 from a dvd

 mplayer -fs dvd://9 -ao pcm:file=track9.wav

These commands create uncompressed files.

Analysing ‹↑›

show information about the avi video file 'green.avi' (duration, encoding etc)

 ffmpeg -i green.avi

Video Sound ‹↑›

just listen to the sound in 'green.avi', dont watch the video

 mplayer -vo null green.avi

find out what audio format is contained in a video file 'vid.flv'

 ffmpeg -i vid.flv 2>&1 | grep -i stream

extract audio from 'eg.flv' video and encode it in the same format (mp3)

 ffmpeg -i eg.flv -vn -acodec copy eg.mp3

Convert a video's audio track to ogg vorbis.

 INPUT=<input_video> && ffmpeg -i "$INPUT" -vn -f wav - | oggenc -o "${INPUT%%.*}.ogg" -
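
The '${INPUT%%.*}' in the recipe above is a shell parameter expansion which removes everything from the first dot in the name onwards. This is fine for a name such as 'film.avi', but a name like 'my.home.video.avi' is cut down to 'my', and a path starting with './' is cut down to nothing. The single '%' form removes only the last extension. A small sketch of the difference; the function name 'swap_ext' is made up for this example.

```shell
# swap_ext: replace the last extension of a file name with a new one
# (hypothetical helper; "${1%.*}" strips only the shortest trailing ".something")
swap_ext() {
  printf '%s.%s\n' "${1%.*}" "$2"
}
```

So 'swap_ext my.home.video.avi ogg' prints 'my.home.video.ogg'.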

hear the mice moving (beep at a pitch which follows the movement of the mouse)

 while true; do beep -l66 -f`head -c2 /dev/input/mice|hexdump -d|awk 'NR==1{print $2%10000}'`; done

extract audio from video 'eg.avi' and stereo encode for burning to cd

 ffmpeg -i eg.avi -vn -acodec pcm_s16le -ar 44100 -ac 2 eg.wav

extract only the video from 'eg.flv' (without the audio)

 ffmpeg -i eg.flv -an -vcodec copy silent.flv

encode an mpeg video to the youtube flv format

 ffmpeg -i mov0001.mpg -ar 22050 -acodec libmp3lame -ab 32k -r 25 -s 320x240 -vcodec flv eg.flv

Script to rip the audio from the youtube video you have open

 video=$(ls /tmp | grep -e Flash\w*); ffmpeg -i /tmp/$video -f mp3 -ab 192k ~/ytaudio.mp3

Extract audio from a video

 ffmpeg -i video.avi -f mp3 audio.mp3

remove the audio track from a video file

 mencoder -ovc copy -nosound ./ -o ./

Video And Audio ‹↑›

recode 'eg.flv' with the same audio/video encoding but a different screen size

 ffmpeg -i eg.flv -acodec copy -s 320x240 -vcodec flv silent.flv

convert an mpeg video to 'flv', using only 128 seconds after 10 minutes

 ffmpeg -i eg.mpg -acodec copy -r 25 -s 320x240 -vcodec flv -ss 00:10:00 -t 128 eg.flv

encode from mpeg to flv format breaking when file size 10M is reached

 ffmpeg -i eg.mpg -acodec copy -vcodec flv -fs 10485760 eg.flv
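
The value given to '-fs' in the recipe above is a number of bytes, and 10485760 is simply 10 x 1024 x 1024. A tiny sketch for working out other limits; the name 'mib_to_bytes' is invented for this example.

```shell
# mib_to_bytes: convert a size in megabytes (of 1024 x 1024 bytes) to bytes
# (hypothetical helper; whole numbers only)
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}
```

It might be used like this

 ffmpeg -i eg.mpg -acodec copy -vcodec flv -fs $(mib_to_bytes 25) eg.flv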

encode 'eg.avi' movie to television mpeg suitable for pal format tv.

 ffmpeg -i eg.avi -target pal-vcd eg.mpg

extract audio and video of an flv video into separate files

 ffmpeg -i eg.flv -vcodec mpeg2video eg.m2v -acodec copy eg.mp3

create a video 'new.flv' combining audio and video, delaying the video by half a second

 ffmpeg -i eg.mp3 -itsoffset 00:00:00.5 -i eg.m2v new.flv

The -itsoffset option applies to the input file which follows it, so in this recipe the video (eg.m2v) is delayed by half a second.

create a video 'new.flv' combining audio and video, delaying the audio by half a second

 ffmpeg -i eg.m2v -itsoffset 00:00:00.5 -i eg.mp3 new.flv

In the command above the audio (eg.mp3) is delayed by half a second.

ffmpeg important options
-i file - read input from 'file' (with no output file, just show information about it)
-vn - don't use the video
-an - don't use the audio
-acodec copy - use the same audio encoding as decoding
-vcodec copy - use the same video encoding as decoding
-s 320x240 - encode video for screen size 320x240
-r 25 - encode video at 25 frames per second
-formats - show available formats

split an avi video into chunks


split mpg video into chunks


www: handbrake
a graphical tool for converting video formats

Formats ‹↑›

www: wav
www: flac
(free lossless audio codec) good lossless format for storage, more compressed than wav
www: ogg
good for streaming, lossy compression, no patent
www: mp3
a proprietary compression method and format. use ogg

Flac ‹↑›

The 'flac' sound format is considered a good format for storing music since it is a non-proprietary codec which does not lose any sound information (not 'lossy'). It is not as compressed as mp3 or ogg format.

Mostly silent FLAC checking (only errors are displayed)

 flac -ts *.flac

List your FLAC albums

 find -iname '*.flac' | sed 's:/[^/]*$::' | sort -u
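
The 'uniq' tool only removes repeated lines when they are next to each other, and find does not promise to list the files of one folder together, so sorting the folder names is the safer way to get each album once. A sketch of the text-processing part on its own, reading file paths from standard input; the name 'album_dirs' is made up for this example.

```shell
# album_dirs: turn a list of file paths into a sorted list of unique folders
# (hypothetical helper; sed removes the last path component of every line)
album_dirs() {
  sed 's:/[^/]*$::' | sort -u
}
```

It can be used like this

 find . -iname '*.flac' | album_dirs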

Jargon And Technical Terms ‹↑›

This section attempts to explain some of the technical terminology used when talking about sound, sound data, and sound information when it is stored and manipulated on a computer.

www: gain
This term is similar to 'volume' although there may be a subtle distinction
www: channels
whether sound is mono, stereo or surround sound
www: sample
A sample is a unit of data contained within the sound file. Each sample records the level of the sound at a particular moment in time. Since computers can only store discrete values (rather than continuous data) the sound has to be broken into 'samples'. These samples are analogous to 'frames' in a video, with each frame storing one still image. The more samples per second a sound file stores, the higher the 'fidelity' (quality) of the sound will be.
www: hertz
a unit of frequency (cycles per second). The sample rate of a sound file, that is the number of sound data samples per second, is given in hertz. This rate affects the quality or fidelity of the sound.
www: midi
synthesized music or sound
www: sound file type
The sound file type represents how the sound data is 'packaged' within the file. This packaging can contain 'meta-data' or information about the sound file, such as the name of the song or the singer or other more technical information.
www: codec, sound encoding
This refers to the way the actual data sampled within the sound file is compressed or otherwise encoded. There is an important distinction between the codec used in a sound file and the sound file type
www: meta-data
This is information about the sound which is contained in the sound file and may include things such as the name of the song, album, singer, or more technical information such as the way the sound file has been compressed or encoded
www: sound compression
Since sound files attempt to represent a continuous natural phenomenon which consists of complex compression waves moving through the air and falling upon human ears, the amount of data needed to represent sounds is very large. In order to combat this problem many different ways of compressing the data in sound files have been invented, the most well known of which is mp3 encoding. Other compression methods include ogg vorbis (lossy) and flac (lossless).
www: bitrate
www: proprietary codec
Since the sale of music is an extremely lucrative business and since modern music is distributed over networks with limited bandwidth, the inventors of ways to compress sound data find themselves with a very valuable asset which they attempt to protect by patenting the method of compressing or 'encoding' this sound data. Once the compression or encoding method has been patented it becomes a proprietary codec and cannot be used without paying a licencing fee to the owner. This is the case with mp3 sound files.
www: playlist
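
To give a feeling for the terms above, the size of a sound file can be estimated with a little arithmetic. Uncompressed sound needs (samples per second) x (channels) x (bits per sample) x (seconds) bits, while a compressed file with a fixed bitrate needs roughly (bits per second) x (seconds) bits. The sketch below does both sums; the function names are invented for this example and the results ignore file headers and meta-data.

```shell
# pcm_bytes: approximate size in bytes of uncompressed sound
# arguments: sample rate in Hz, number of channels, bits per sample, seconds
pcm_bytes() {
  echo $(( $1 * $2 * $3 * $4 / 8 ))
}

# bitrate_bytes: approximate size in bytes of sound at a fixed bitrate
# arguments: bitrate in kilobits per second, seconds
bitrate_bytes() {
  echo $(( $1 * 1000 * $2 / 8 ))
}
```

One minute of CD quality sound (44100 Hz, 2 channels, 16 bits) comes to 'pcm_bytes 44100 2 16 60', which is 10584000 bytes, while one minute at 128 kilobits per second comes to 'bitrate_bytes 128 60', which is 960000 bytes.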
some preset targets with ffmpeg
ntsc dvd