&& Sed, The Unix Stream Editor
       ------------------------------:
       
  quote: "Should be part of every gentleman's toolbox" (me)

  The gnu sed has some features not found in other versions of sed. But many
  of the examples should work for all versions of sed. 

  Sed is actually a small virtual machine with 2 registers, the 'pattern
  space' (the work space) and the 'hold space', and a set of "instructions"
  or commands which manipulate these two registers.
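
  As a rough illustration of the two registers at work, the sketch below
  reverses the order of its input lines by shuttling text between the
  pattern space and the hold space. The function name reverse_lines is
  made up for this example; the script itself is the 'tac' one-liner
  listed later in this document.

```shell
# reverse_lines is a hypothetical helper name, used only for this sketch.
#   1!G  on every line but the first, append the hold space to the pattern space
#   h    copy the pattern space into the hold space
#   $p   on the last line, print the accumulated pattern space
reverse_lines() {
  sed -n '1!G;h;$p'
}

printf 'one\ntwo\nthree\n' | reverse_lines
```

  The hold space acts as the only long-lived storage: everything the
  script wants to remember between lines has to be parked there.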

  This document contains an almost complete copy of the "sed oneliners"
  document available at sed.sf.net/sed1line.html since that document is so
  complete and well compiled that there is not much point reinventing the
  wheel. I have only reformatted the document slightly so that it can be
  compiled into pdf using the scripts at bumble.sf.net/books/format-book

  @@ http://www.grymoire.com/Unix/Sed.html 
     A concise introduction
  @@ http://sed.sourceforge.net/
     The "home page" for sed.
  @@ http://sed.sourceforge.net/sedfaq.html
     comprehensive recipes
  @@ http://sed.sourceforge.net/sed1line.txt 
     a long list of one line sed scripts covering all the advanced
     techniques

BOOKS ABOUT SED

 @@ sed and awk, 2nd Edition
    Dale Dougherty and Arnold Robbins (O'Reilly, 1997; http://www.ora.com)
 @@ UNIX Text Processing
    Dale Dougherty and Tim O'Reilly (Hayden Books, 1987)
 @@ the tutorials by Mike Arst distributed in U-SEDIT2.ZIP.
 @@ "Mastering Regular Expressions"
    Jeffrey Friedl (O'Reilly, 1997). 
    
HISTORY

  The sed stream editor has been around since the very early days of 
  unix. One of its original motivations was that the computer in
  use (pdp?) did not have the capability to load large text files 
  into memory in order to edit them "interactively".

GETTING HELP FOR SED 

  * view the man page for sed
  >> man sed

  * view the man page for the line editing program 'ed' a forerunner of 'sed'
  >> man ed

  * view information about regular expressions
  >> man regexp
  
SIMPLE USAGE

 Sed takes one or more editing commands and applies all of them, in
 sequence, to each line of input. After all the commands have been applied
 to the first input line, that line is output and a second input line is
 taken for processing, and the cycle repeats. The examples in this document
 assume that input comes from the standard input device (i.e., the console;
 normally this will be piped input). One or more filenames can be appended
 to the command line if the input does not come from stdin. Output is sent
 to stdout (the screen). Thus:

 >> cat filename | sed '10q'        ##( uses piped input )
 >> sed '10q' filename              ##( same effect, avoids a useless "cat" )
 >> sed '10q' filename > newfile    ##( redirects output to disk )
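
 A small sketch of the cycle described above: each input line passes
 through every command, in order, before the next line is read. The
 sample text and the two substitutions are invented for illustration.

```shell
# Both commands run on line 1, then both run on line 2, and so on.
out=$(printf 'cat\ndog\n' | sed -e 's/a/A/' -e 's/$/!/')
printf '%s\n' "$out"
```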

 * transliterate characters
 >> sed "y/abcd/ABCD/"      ##( 'y' takes no flags, a trailing 'g' is an error )

GOTCHAS

  These are tricky things that can cause the poor incautious sed user
  head-scratching befuddlement and a pervading existential 'ennui'. 

  * check for 'dos' carriage return characters '\r' which are not desirable
  >> sed -n l file.txt    ##('l' prints unprintable characters visibly)
  >> sed -n 'l' file.txt  ##(the same)

  * if you saw any '\r' characters in the previous, then get rid of them
  >> fromdos file.txt   ##(those '\r' can cause annoying silent bugs)
  >> dos2unix file.txt  ##(another way to say the same thing, I think)

  * duplicate words, using groupings and backreferences.
  >> s/\([a-z]\+\)/\1\1/g;  ##(without -r, gnu sed needs '\+' and '\( \)')
  >> s/([a-z]+)/\1\1/g;  ##(gnu sed with the -r flag, this is 'gotcha')

  * the gnu 'e' command takes the rest of the line as a shell command, so
    it has to be the last thing on the line
  >> echo ls | sed "e;p"    ##(doesnt work, tries to run ';p' in the shell)
  >> echo ls | sed "e"      ##(works)

  * the 'q' quit command can only have one address
  >> 1,4 q     ##(doesnt work)

  * if a one-line script contains a "!", it must use single quotes
  >> sed -n "/^ *#/!p"  ##(Incorrect: interactive bash expands "!" inside double quotes)
  >> sed -n '/^ *#/!p'  ##(Correct)

  * if an 'a' (append) or 'i' (insert) command has a space after the
    backslash (in any line) then the command doesnt work. This is a really 
    annoying one. use 'sed -i "s/\s\+$//" scriptname' to fix this problem
    
  * 'd' delete immediately starts a new cycle
  -------------------------------------------
    One might try not to print anything after a line containing
    "-end-" including the line itself, with the following command 
     /-end-/ { d;q }  
    BUT it doesnt work because the 'q' command will never be executed
    owing to the behavior of the 'd' delete command

  ,,,
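
  One way around this gotcha, sketched here with made-up sample input, is
  to suppress automatic printing with -n and quit as soon as the marker is
  seen, printing every earlier line explicitly.

```shell
# With -n, 'q' quits without printing the "-end-" line;
# 'p' prints each line that comes before it.
out=$(printf 'a\nb\n-end-\nc\n' | sed -n '/-end-/q;p')
printf '%s\n' "$out"
```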

  * when putting a ']' character in brackets, dont escape it
  >> s/\[[^]]*\]//g NOT s/\[[^\]]*\]//g  ##(at least in gnu sed )

SCRIPTS

  * run a sed script in a file
  >> sed -f script file
  >> sed --file=script file      ##(the same)
  >> cat file | sed -f script    ##(the same)

  * find out where your sed executable is
  >> which sed

  * create an interpreting script with extended regexes (-r)
  >> #!/bin/sed -rf   ##(this is the first script line, try 'which sed')
  >> ./script         ##(the script can be run with just its name)
  
SED COMMANDS

   == sed command summary
   .. d, delete the pattern space, start the next cycle
   .. p, print the pattern space
   .. q 4, quit the script with exit code '4'
   .. e 'ls', execute the shell 'ls' command and put in the pattern space
   .. e, execute the shell command in the pattern space
   .. a\, append text, each line ending in '\' except the last
   .. i\, insert text, each line ending in '\' except the last
   .. D, delete text in the pattern space up to the first newline.
   .. N, add a newline and the next line of input to the pattern space.
   .. P, print out the portion of the pattern space up to the first newline.
   .. h, replace the hold space with the pattern space.
   .. H, append a newline and the pattern space to the hold space.
   .. g, replace the pattern space with the hold space.
   .. G, append a newline and the hold space to pattern space.
   .. x, Exchange the contents of the hold and pattern spaces.
   ..,

   == modifiers
   .. \L, Turn the replacement to lowercase until a \U or \E is found,
   .. \l, Turn the next character to lowercase,
   .. \U, Turn the replacement to uppercase until a \L or \E is found,
   .. \u, Turn the next character to uppercase,
   .. \E, Stop case conversion started by \L or \U.
   ..,

 
  * add 2 lines of text at the end of the file
  --------------------------------------------
     $ a\
     added text\
     last line
  ,,, 
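
  The append block above can be tried straight from the shell by putting
  the whole script, newlines included, inside single quotes. This is only
  a sketch: the input line 'first' is invented for the example.

```shell
# Nothing may follow the backslash on the 'a\' line or on any
# continued text line; the last text line has no backslash.
out=$(printf 'first\n' | sed '$a\
added text\
last line')
printf '%s\n' "$out"
```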
    
MULTIPLE COMMANDS

  * either use the -e switch or semi-colons ';' for multiple commands
  >> sed -e 's/a/A/' -e 's/b/B/' <textfile 
  >> sed 's/a/A/; s/b/B/' <textfile   ##(the same)

MAKING CHANGES TO FILES
 
  * change 'frog' for 'toad' and save the changes to the file 'newfile' 
  >> sed 's/frog/toad/g' < oldfile > newfile

  * This does NOT work, the file 'textfile' gets truncated 
  >> sed 's/frog/toad/g' < textfile > textfile ##(use the -i switch instead)

  * change yes to no in all files ending in '.txt' and back up to .bak
  >> sed -i.bak "s/yes/no/g" *.txt    ##(gnused 4)
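
  A hedged sketch of the in-place switch with a backup suffix, run in a
  throwaway directory so that nothing real is modified. The file name
  note.txt is hypothetical, and the attached-suffix form '-i.bak' is
  assumed to be accepted by the local sed (gnu sed 4 and the bsd seds
  accept it).

```shell
dir=$(mktemp -d)
printf 'yes please\n' > "$dir/note.txt"

# Edit the file in place; the original is kept as note.txt.bak
sed -i.bak 's/yes/no/g' "$dir/note.txt"

edited=$(cat "$dir/note.txt")
backup=$(cat "$dir/note.txt.bak")
printf '%s\n%s\n' "$edited" "$backup"
rm -r "$dir"
```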

COMMAND LINE SWITCHES

  * print only the lines in a file which contain the word 'sky' 
  >> sed -n 's/sky/&/p' <file

  * delete the word 'tree' in the files f1, f2, and f3
  >> sed 's/tree//g' f1 f2 f3 

  * run the sed script 'script'
  >> sed -f script <textfile

  * create an interpreting script with extended regexes (-r)
  >> #!/bin/sed -rf   ##(this is the first line of the script, try 'which sed')
  >> ./script         ##(the script can be run with just its name)
  
  * create an interpreting script which doesnt print the input lines
  >> #!/bin/sed -nf   ##(this may not work in some versions of sed)

SUBSTITUTION SWITCHES

  * delete only the second match of the word 'big' on each line 
  >> sed 's/big//2' <file 

  * delete all occurrences of the word 'big' in a file
  >> sed 's/big//g' <file 

  * delete the 2nd and following instances of the word 'cat' on each line
  >> sed 's/cat//2g' <file   ##(doesnt delete the first 'cat' on each line)

  * delete the 80th character on each line
  >> sed 's/.//80' <file   ##(the number must be less than 512)
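
  A quick sketch of how the numeric and 'g' switches differ, using an
  invented test line and the replacement 'X' so that the effect is
  visible.

```shell
line='big big big'
second=$(printf '%s\n' "$line" | sed 's/big/X/2')   # only the 2nd match
all=$(printf '%s\n' "$line" | sed 's/big/X/g')      # every match
printf '%s\n%s\n' "$second" "$all"
```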

SED COMMANDS

  * print the first 100 lines of a file
  >> sed '100q' test

INSERTING TEXT ....

  * insert 2 lines of text before the first line of the file
  ----------------------------------------------------------
    1i\
    <html>\
    <head>

  ,,,

WRITING TO A FILE

  * write all lines which contain the word 'sky' to the file 'words'
  >> sed -n 's/sky/&/w words' < file  ##(only one space between w and 'words')

READING FROM A FILE

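  The 'r' and 'e' commands can be used within a { } brace block, but the
  file name (or shell command) runs to the end of the line, so the closing
  brace must go on a new line or in a separate -e expression.

  * insert the contents of 'file.txt' after each line containing 'aaa'
  >> cat index.html |  sed '/aaa/r file.txt' | less

  * replace each line containing 'aaa' with the contents of 'file.txt'
  >> sed -e '/aaa/r file.txt' -e '/aaa/d' index.html | less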

  * replace a block of text in a file with another file
  >> cat index.html | sed '/<!-- *menu/p;/<!-- *menu/,/<!-- \/*menu/d' | sed '/<!-- *menu/r menu.html' | less
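
  The 'r' command only queues a file for output after the current line; it
  never removes that line. A sketch of true replacement pairs 'r' with
  'd'. The two files page.txt and insert.txt are hypothetical and are
  created in a temporary directory. Each command goes in its own -e
  because the file name given to 'r' runs to the end of the line.

```shell
dir=$(mktemp -d)
printf 'top\naaa\nbottom\n' > "$dir/page.txt"
printf 'NEW 1\nNEW 2\n' > "$dir/insert.txt"

# 'r' queues insert.txt, 'd' deletes the matching line; the queued
# file is still written out when the cycle ends.
out=$(sed -e "/aaa/r $dir/insert.txt" -e '/aaa/d' "$dir/page.txt")
printf '%s\n' "$out"
rm -r "$dir"
```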
  
CASE INSENSITIVE MATCHES

  * replace 'this' or 'ThIS' (etc) with 'that' 
  >> /this/I s/this/that/i  ##(gnu sed and some others)

GNU SED

  @@ http://www.gnu.org/software/sed/manual/html_node/ gnused documentation

  * detailed information about gnu regular expressions (used in sed)
  >> man 7 regex

  * the -r switch in gnused removes the necessity for some backslashes '\' 
  >> sed -rn '/b{3,}/p'    ##(prints lines with 3 or more consecutive 'b's)
  >> sed -n '/b\{3,\}/p'   ##(the same, without the '-r' switch)

  * read in the result of a shell command (with the 'e' command)
  >> echo ls | sed "e"   ##(prints the contents of the folder)

  * execute shell commands in 'com.txt' on lines beginning with '!'
  >> sed -e '/^!/{s/^!//;e' -e '}' com.txt ##('e' must end its line, so '}' gets its own -e)
  
REGULAR EXPRESSION EXAMPLES

  == examples without the '-r' switch
  .. b\{4\}, matches 4 'b's
  .. a\{3,\}, matches 3 or more 'a's
  .. q\+, matches one or more 'q's
  .. \(big\|small\), matches 'big' or 'small'
  .. 
  ..,

EXTENDED PATTERNS

BACK REFERENCES ....
  
  'back references' are a way of 'remembering' and using what was matched.

  == refs
  .. & - everything that was matched
  .. \1 - the text matched by the first \( \) group
  .. \2 - the text matched by the second \( \) group
  .. // - an empty pattern, reuses the last regular expression used
  ..

  * duplicate lowercase letters
  >> sed 's/[a-z]/&&/g'

  * use a backreference
  >> sed 's/\([a-z]*\).*/\1/'

  * use several backreferences
  >> sed 's/\([a-z]*\) \([a-z]*\).*/\2\1/'

  * reuse the address pattern in a substitution with an empty pattern
  >> echo abcd  | sed '/abc/s//x/;'  ##(prints 'xd')
  This is a trick from the days of the unix 'ed' text editor and is mentioned
  in one of Kernighan's books (The Unix Programming Environment?)

  * reuse the previous pattern as the end of a line range 
  >> echo -e "22ab\n22cd"  | sed -n '/^[0-9][0-9]/,//p'  
  The empty second address reuses the last regular expression that was
  used (two leading digits), not the text that it matched

  * use a back reference to double the first 2 characters of each line
  >> sed -r 's/^(..)/\1\1/' file

  * duplicate words, using groupings and backreferences.
  >> s/\([a-z]\+\)/\1\1/g;  ##(without -r, gnu sed needs '\+' and '\( \)')
  >> s/([a-z]+)/\1\1/g;  ##(gnu sed with the -r flag, this is 'gotcha')
 
  * use an empty pattern to print lines containing 'big', with 'big' deleted 
  >> sed -nr '/big/s///p' file
  This is the same as 
  >> sed -nr '/big/s/big//p' file
  since sed 'remembers' the last regular expression used

  * use the & ampersand backreference to double each line
  >> sed 's/^.*$/&&/'
  If the line is 'a tree' it will become 'a treea tree'

  * make all words 'capital case' (that is 'bumBle' becomes 'Bumble')
  >> echo MiX CAse | sed -r "s/(\w)(\w*)/\u\1\L\2/g"
  >> echo MiX CAse | sed "s/\(\w\)\(\w*\)/\u\1\L\2/g"   ##(the same)
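
  The three kinds of reference can be compared side by side. This is a
  sketch with invented input strings; only basic (non -r) syntax is used,
  so it should not depend on gnu sed.

```shell
# \1 and \2: swap the first two lowercase words
swapped=$(echo 'hello world' | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/')

# &: the whole match, written twice
doubled=$(echo 'a tree' | sed 's/.*/&&/')

# empty pattern: 's//x/' reuses the address pattern /abc/
reused=$(echo 'abcd' | sed '/abc/s//x/')

printf '%s\n%s\n%s\n' "$swapped" "$doubled" "$reused"
```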

MATCHING WORDS ....

  * convert words beginning in 'b' or 'c' to upper case
  >> sed -r "s/\<(b|c)[a-z]+/\U&/g"      

  * convert upper case words to lower case
  >> sed -r "s/\<[A-Z]+\>/\L&/g"  ##(gnused, not very international)

  * match word boundaries, words beginning in 'a' or 'b'
  >> sed -r "s/\<(a|b)[a-z]+/&/g"    

ALTERNATION ....

  * delete the words red, green, blue 
  >> sed 's/red\|green\|blue//g'        ##(gnu sed)
  >> sed -r 's/red|green|blue//g'       ##(the -r switch changes the syntax)
  >> sed 's/red//g;s/green//g;s/blue//g' ##(other versions of sed)

INTERNATIONAL CHARACTER CLASSES ....

  * delete uppercase letters or the letter 'a', 'b' 
  >> sed -r 's/[[:upper:]ab]//g'

  == gnused character classes (for international text)
  .. [[:alnum:]], [A-Za-z0-9]     Alphanumeric characters
  .. [[:alpha:]], [A-Za-z]        Alphabetic characters
  .. [[:blank:]], [ \x09]         Space or tab characters only
  .. [[:cntrl:]], [\x00-\x1F\x7F] Control characters
  .. [[:digit:]], [0-9]           Numeric characters
  .. [[:graph:]], [!-~]           Printable and visible characters
  .. [[:lower:]], [a-z]           Lower-case alphabetic characters
  .. [[:print:]], [ -~]           Printable (non-Control) characters
  .. [[:punct:]], [!-/:-@[-`{-~]  Punctuation characters
  .. [[:space:]], [ \t\n\r\v\f]   All whitespace chars
  .. [[:upper:]], [A-Z]           Upper-case alphabetic characters
  .. [[:xdigit:]], [0-9a-fA-F]     Hexadecimal digit characters
  ..
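
  A short sketch of the bracketed classes in use, with an invented test
  string. Note the doubled brackets: the class name, brackets included,
  sits inside an ordinary bracket expression.

```shell
text='Hello World 42'
no_upper=$(printf '%s\n' "$text" | sed 's/[[:upper:]]//g')
no_digit=$(printf '%s\n' "$text" | sed 's/[[:digit:]]//g')
printf '[%s]\n[%s]\n' "$no_upper" "$no_digit"
```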


  == the special character classes in short form
  .. \s - space characters
  .. \w - a word character
  .. and others ...
  ..
 
UPPERCASE AND LOWERCASE

  The following may only apply to gnu sed.

  * make the text 'MiXeD' all lower case
  >> echo "MiXeD" | sed 's/.*/\L&/g'

  * turn to lower case all lines beginning with a '+' plus sign
  >> echo -e " +MiX\n+CaSe" | sed '/^ *+/s/.*/\L&/g'
  >> sed '/^\s*+/s/.*/\L&/g'       ##(the same, with some 'seds' like gnu)
  >> sed '/^[[:space:]]*+/s/.*/\L&/g'   ##(the same, again for gnused)

  * make only the 1st word in the line lower case
  >> echo ONE TWO | sed "s/\w\+/\L&/"   

  * make all words in the line lower case  (so 'TrEE' becomes 'tree')
  >> sed "s/\w\+/\L&/g"
  >> sed -r "s/\w+/\L&/g" ##(the same)
  >> sed -r "s/\w/\l&/g"  ##(the same)

  * make all words 'capital case' (that is 'bumBle' becomes 'Bumble')
  >> echo MiX CAse | sed -r "s/(\w)(\w*)/\u\1\L\2/g"
  >> echo MiX CAse | sed "s/\(\w\)\(\w*\)/\u\1\L\2/g"   ##(the same)
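
  Where the gnu case modifiers are not available, the 'y' command offers
  a portable, if clumsy, fallback for whole-line case changes. This sketch
  handles only the 26 unaccented ASCII letters, which is an assumption
  about the input; the helper name to_lower is invented.

```shell
# to_lower is a hypothetical helper name; 'y' maps each character in the
# first list to the character at the same position in the second list.
to_lower() {
  sed 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/'
}

echo 'MiXeD CaSe' | to_lower
```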

SED ONE LINE SCRIPTS

 The following is a slight reformatting of the one line sed script document
 by Eric Pement - pemente[at]northpark[dot]edu (version 5.5 Dec. 29, 2005)

 Latest version of the original document (in English) is usually at:
   http://sed.sourceforge.net/sed1line.txt
   http://www.pement.org/sed/sed1line.txt

WHITE SPACE

  * 'squeeze' multiple spaces into one on every line in the file
  >> sed "s/ \+/ /g" doc.txt     ##(only spaces, not tab characters ...)
  >> sed -r "s/ +/ /g" doc.txt   ##(exactly the same)

  * squeeze all space characters (including tabs) into one space character
  >> sed "s/\s\+/ /g" doc.txt    ##(not all seds)
  >> sed -r "s/\s+/ /g" doc.txt  ##(the same, gnu sed)
  >> sed -r "s/[[:space:]]+/ /g" doc.txt  ##(the same, gnu sed)
  >> sed -r "s/[[:blank:]]+/ /g" doc.txt  ##(almost the same, gnu sed)

FILE SPACING

 * double space a file
 >> sed G

 * double space a file which already has blank lines in it. Output file
 * should contain no more than one blank line between lines of text.
 >> sed '/^$/d;G'

 * triple space a file
 >> sed 'G;G'

 * undo double-spacing (assumes even-numbered lines are always blank)
 >> sed 'n;d'

 * insert a blank line above every line which matches "regex"
 >> sed '/regex/{x;p;x;}'

 * insert a blank line below every line which matches "regex"
 >> sed '/regex/G'

 * insert a blank line above and below every line which matches "regex"
 >> sed '/regex/{x;p;x;G;}'
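
 These spacing scripts lean on the hold space being empty when sed
 starts: 'G' appends a newline plus the (empty) hold space to each line,
 which shows up as a blank line. A quick sketch with invented input:

```shell
# Each input line comes out followed by one empty line.
out=$(printf 'a\nb\n' | sed G)
printf '%s\n' "$out"
```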

NUMBERING

 * number each line of a file (simple left alignment). Using a tab (see
 * note on '\t' at end of file) instead of space will preserve margins.
 >> sed = filename | sed 'N;s/\n/\t/'

 * number each line of a file (number on left, right-aligned)
 >> sed = filename | sed 'N; s/^/     /; s/ *\(.\{6,\}\)\n/\1  /'

 * number each line of file, but only print numbers if line is not blank
 >> sed '/./=' filename | sed '/./N; s/\n/ /'

 * count lines (emulates "wc -l")
 >> sed -n '$='
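
 The numbering scripts work in two passes: the first sed prints each line
 number on a line of its own, and the second joins every number to the
 line that follows it. A sketch with made-up input, using a space rather
 than a tab as the separator:

```shell
# Pass 1 ('=') emits "1", "alpha", "2", "beta"; pass 2 joins the pairs.
numbered=$(printf 'alpha\nbeta\n' | sed = | sed 'N;s/\n/ /')
count=$(printf 'alpha\nbeta\n' | sed -n '$=')
printf '%s\n%s\n' "$numbered" "$count"
```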

TEXT CONVERSION AND SUBSTITUTION

 * IN UNIX ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format.
 >> sed 's/.$//'               ##( assumes that all lines end with CR/LF )
 >> sed 's/^M$//'              ##( in bash/tcsh, press Ctrl-V then Ctrl-M )
 >> sed 's/\x0D$//'            ##( works on ssed, gsed 3.02.80 or higher )

 * IN UNIX ENVIRONMENT: convert Unix newlines (LF) to DOS format.
 >> sed "s/$/`echo -e \\\r`/"            ##( command line under ksh )
 >> sed 's/$'"/`echo \\\r`/"             ##( command line under bash )
 >> sed "s/$/`echo \\\r`/"               ##( command line under zsh )
 >> sed 's/$/\r/'                        ##( gsed 3.02.80 or higher )

 * IN DOS ENVIRONMENT: convert Unix newlines (LF) to DOS format.
 >> sed "s/$//"                          ##( method 1 )
 >> sed -n p                             ##( method 2 )

 * IN DOS ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format.
 * Can only be done with UnxUtils sed, version 4.0.7 or higher. The
 * UnxUtils version can be identified by the custom "--text" switch
 * which appears when you use the "--help" switch. Otherwise, changing
 * DOS newlines to Unix newlines cannot be done with sed in a DOS
 * environment. Use "tr" instead.
 >> sed "s/\r//" infile >outfile         ##( UnxUtils sed v4.0.7 or higher)
 >> tr -d \r <infile >outfile            ##( GNU tr version 1.22 or higher )

 * delete leading whitespace (spaces, tabs) from front of each line
 * aligns all text flush left
 >> sed 's/^[ \t]*//'                    ##( see note on '\t' at end of file )

 * delete trailing whitespace (spaces, tabs) from end of each line
 >> sed 's/[ \t]*$//'                    ##( see note on '\t' at end of file )

 * delete BOTH leading and trailing whitespace from each line
 >> sed 's/^[ \t]*//;s/[ \t]*$//'

 * insert 5 blank spaces at beginning of each line (make page offset)
 >> sed 's/^/     /'

 * align all text flush right on a 79-column width
 >> sed -e :a -e 's/^.\{1,78\}$/ &/;ta'  ##( set at 78 plus 1 space )

 * center all text in the middle of 79-column width. In method 1,
 * spaces at the beginning of the line are significant, and trailing
 * spaces are appended at the end of the line. In method 2, spaces at
 * the beginning of the line are discarded in centering the line, and
 * no trailing spaces appear at the end of lines.
 >> sed  -e :a -e 's/^.\{1,77\}$/ & /;ta'                     ##( method 1 )
 >> sed  -e :a -e 's/^.\{1,77\}$/ &/;ta' -e 's/\( *\)\1/\1/'  ##( method 2 )

 * substitute (find and replace) "foo" with "bar" on each line
 >> sed 's/foo/bar/'             ##( replaces only 1st instance in a line )
 >> sed 's/foo/bar/4'            ##( replaces only 4th instance in a line )
 >> sed 's/foo/bar/g'            ##( replaces ALL instances in a line )
 >> sed 's/\(.*\)foo\(.*foo\)/\1bar\2/' ##( replace the next-to-last case )
 >> sed 's/\(.*\)foo/\1bar/'            ##( replace only the last case )

 * substitute "foo" with "bar" ONLY for lines which contain "baz"
 >> sed '/baz/s/foo/bar/g'

 * substitute "foo" with "bar" EXCEPT for lines which contain "baz"
 >> sed '/baz/!s/foo/bar/g'

 * change "scarlet" or "ruby" or "puce" to "red"
 >> sed 's/scarlet/red/g;s/ruby/red/g;s/puce/red/g'   ##( most seds )
 >> gsed 's/scarlet\|ruby\|puce/red/g'                ##( GNU sed only )

 * reverse order of lines (emulates "tac")
 * bug/feature in HHsed v1.5 causes blank lines to be deleted
 >> sed '1!G;h;$!d'               ##( method 1 )
 >> sed -n '1!G;h;$p'             ##( method 2 )

 * reverse each character on the line (emulates "rev")
 >> sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//'

 * join pairs of lines side-by-side (like "paste")
 >> sed '$!N;s/\n/ /'

 * if a line ends with a backslash, append the next line to it
 >> sed -e :a -e '/\\$/N; s/\\\n//; ta'

 * if a line begins with an equal sign, append it to the previous line
 * and replace the "=" with a single space
 >> sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D'

 * add commas to numeric strings, changing "1234567" to "1,234,567"
 >> gsed ':a;s/\B[0-9]\{3\}\>/,&/;ta'                     ##( GNU sed )
 >> sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'  ##( other seds )

 * add commas to numbers with decimal points and minus signs (GNU sed)
 >> gsed -r ':a;s/(^|[^0-9.])([0-9]+)([0-9]{3})/\1\2,\3/g;ta'

 * add a blank line every 5 lines (after lines 5, 10, 15, 20, etc.)
 >> gsed '0~5G'                  ##( GNU sed only )
 >> sed 'n;n;n;n;G;'             ##( other seds )
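
 Several of the scripts in this section share one idiom: a label (:a), a
 substitution, and 'ta', which jumps back to the label for as long as the
 substitution keeps succeeding. A sketch using the portable comma script,
 wrapped in a helper whose name (add_commas) is invented here:

```shell
# Each pass inserts one comma before the last run of three digits that
# is still preceded by a digit; 'ta' repeats until nothing changes.
add_commas() {
  sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'
}

echo '1234567' | add_commas
```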

SELECTIVE PRINTING OF CERTAIN LINES

 * print first 10 lines of file (emulates behavior of "head")
 >> sed 10q

 * print first line of file (emulates "head -1")
 >> sed q

 * print the last 10 lines of a file (emulates "tail")
 >> sed -e :a -e '$q;N;11,$D;ba'

 * print the last 2 lines of a file (emulates "tail -2")
 >> sed '$!N;$!D'

 * print the last line of a file (emulates "tail -1")
 >> sed '$!d'                    ##( method 1 )
 >> sed -n '$p'                  ##( method 2 )

 * print the next-to-the-last line of a file
 >> sed -e '$!{h;d;}' -e x              ##( for 1-line files, print blank line )
 >> sed -e '1{$q;}' -e '$!{h;d;}' -e x  ##( for 1-line files, print the line )
 >> sed -e '1{$d;}' -e '$!{h;d;}' -e x  ##( for 1-line files, print nothing )

 * print only lines which match regular expression (emulates "grep")
 >> sed -n '/regexp/p'           ##( method 1 )
 >> sed '/regexp/!d'             ##( method 2 )

 * print only lines which do NOT match regexp (emulates "grep -v")
 >> sed -n '/regexp/!p'          ##( method 1, corresponds to above )
 >> sed '/regexp/d'              ##( method 2, simpler syntax )

 * print the line immediately before a regexp, but not the line
 * containing the regexp
 >> sed -n '/regexp/{g;1!p;};h'

 * print the line immediately after a regexp, but not the line
 * containing the regexp
 >> sed -n '/regexp/{n;p;}'

 * print 1 line of context before and after regexp, with line number
 * indicating where the regexp occurred (similar to "grep -A1 -B1")
 >> sed -n -e '/regexp/{=;x;1!p;g;$!N;p;D;}' -e h

 * grep for AAA and BBB and CCC (in any order)
 >> sed '/AAA/!d; /BBB/!d; /CCC/!d'

 * grep for AAA and BBB and CCC (in that order)
 >> sed '/AAA.*BBB.*CCC/!d'

 * grep for AAA or BBB or CCC (emulates "egrep")
 >> sed -e '/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d    ##( most seds )
 >> gsed '/AAA\|BBB\|CCC/!d'                        ##( GNU sed only )

 * print paragraph if it contains AAA (blank lines separate paragraphs)
 * HHsed v1.5 must insert a 'G;' after 'x;' in the next 3 scripts below
 >> sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;'

 * print paragraph if it contains AAA and BBB and CCC (in any order)
 >> sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;/BBB/!d;/CCC/!d'

 * print paragraph if it contains AAA or BBB or CCC
 >> sed -e '/./{H;$!d;}' -e 'x;/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d
 >> gsed '/./{H;$!d;};x;/AAA\|BBB\|CCC/b;d'         ##( GNU sed only )

 * print only lines of 65 characters or longer
 >> sed -n '/^.\{65\}/p'

 * print only lines of less than 65 characters
 >> sed -n '/^.\{65\}/!p'        ##( method 1, corresponds to above )
 >> sed '/^.\{65\}/d'            ##( method 2, simpler syntax )

 * print section of file from regular expression to end of file
 >> sed -n '/regexp/,$p'

 * print section of file based on line numbers (lines 8-12, inclusive)
 >> sed -n '8,12p'      
 >> sed '8,12!d'             ##(the same)

 * print line number 52
 >> sed -n '52p'                
 >> sed '52!d'               ##(another way to do it )
 >> sed '52q;d'              ##(another way, quicker for large files )

 * beginning at line 3, print every 7th line
 >> gsed -n '3~7p'               ##( GNU sed only )
 >> sed -n '3,${p;n;n;n;n;n;n;}' ##( other seds )

 * print section of file between two regular expressions (inclusive)
 >> sed -n '/Iowa/,/Montana/p'             ##( case sensitive )

SELECTIVE DELETION OF CERTAIN LINES

 * print all of file EXCEPT section between 2 regular expressions
 >> sed '/Iowa/,/Montana/d'

 * delete duplicate, consecutive lines from a file (emulates "uniq").
 * First line in a set of duplicate lines is kept, rest are deleted.
 >> sed '$!N; /^\(.*\)\n\1$/!P; D'
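
 The 'uniq' script keeps a sliding two-line window in the pattern space:
 'N' pulls in the next line, 'P' prints the first line only when the two
 differ, and 'D' drops the first line and restarts without reading new
 input. A sketch with invented input:

```shell
out=$(printf 'a\na\nb\nb\nb\nc\n' | sed '$!N; /^\(.*\)\n\1$/!P; D')
printf '%s\n' "$out"
```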

 * delete duplicate, nonconsecutive lines from a file. Beware not to
 * overflow the buffer size of the hold space, or else use GNU sed.
 >> sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P'

 * delete all lines except duplicate lines (emulates "uniq -d").
 >> sed '$!N; s/^\(.*\)\n\1$/\1/; t; D'

 * delete the first 10 lines of a file
 >> sed '1,10d'

 * delete the last line of a file
 >> sed '$d'

 * delete the last 2 lines of a file
 >> sed 'N;$!P;$!D;$d'

 * delete the last 10 lines of a file
 >> sed -e :a -e '$d;N;2,10ba' -e 'P;D'   ##( method 1 )
 >> sed -n -e :a -e '1,10!{P;N;D;};N;ba'  ##( method 2 )

 * delete every 8th line
 >> gsed '0~8d'                           ##( GNU sed only )
 >> sed 'n;n;n;n;n;n;n;d;'                ##( other seds )

 * delete lines matching pattern
 >> sed '/pattern/d'

 * delete ALL blank lines from a file (same as "grep '.' ")
 >> sed '/^$/d'                           ##( method 1 )
 >> sed '/./!d'                           ##( method 2 )

 * delete all CONSECUTIVE blank lines from file except the first; also
 * deletes all blank lines from top and end of file (emulates "cat -s")
 >> sed '/./,/^$/!d'          ##( method 1, allows 0 blanks at top, 1 at EOF )
 >> sed '/^$/N;/\n$/D'        ##( method 2, allows 1 blank at top, 0 at EOF )

 * delete all CONSECUTIVE blank lines from file except the first 2
 >> sed '/^$/N;/\n$/N;//D'

 * delete all leading blank lines at top of file
 >> sed '/./,$!d'

 * delete all trailing blank lines at end of file
 >> sed -e :a -e '/^\n*$/{$d;N;ba' -e '}'  ##( works on all seds )
 >> sed -e :a -e '/^\n*$/N;/\n$/ba'        ##( ditto, except for gsed 3.02.* )

 * delete the last line of each paragraph
 >> sed -n '/^$/{p;h;};/./{x;/./p;}'

SPECIAL APPLICATIONS

UNIX MAN PAGES ....

 * remove nroff overstrikes (char, backspace) from man pages. The 'echo'
 * command may need an -e switch if you use Unix System V or bash shell.
 >> sed "s/.`echo \\\b`//g" ##(double quotes needed for Unix environment)
 >> sed 's/.^H//g'          ##(in bash/tcsh, press Ctrl-V and then Ctrl-H)
 >> sed 's/.\x08//g'        ##(hex expression for sed 1.5, GNU sed, ssed)

EMAIL MESSAGES ....

 * get Usenet/e-mail message header
 >> sed '/^$/q'             ##( deletes everything after first blank line )

 * get Usenet/e-mail message body
 >> sed '1,/^$/d'           ##( deletes everything up to first blank line )

 * get the subject header, but remove initial "Subject: " portion
 >> sed '/^Subject: */!d; s///;q'
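
 The three scripts above can be tried on a tiny message. This is a
 sketch; the message text and the address in it are invented.

```shell
msg='From: alice@example.org
Subject: lunch

See you at noon.'

header=$(printf '%s\n' "$msg" | sed '/^$/q')       # up to the first blank line
body=$(printf '%s\n' "$msg" | sed '1,/^$/d')       # everything after it
subject=$(printf '%s\n' "$msg" | sed '/^Subject: */!d; s///;q')
printf '%s\n---\n%s\n---\n%s\n' "$header" "$body" "$subject"
```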

 * extract an email return address header
 >> sed '/^Reply-To:/q; /^From:/h; /./d;g;q'

 * parse out the address proper. Pulls out the e-mail address by itself
 * from the 1-line return address header (see preceding script)
 >> sed 's/ *(.*)//; s/>.*//; s/.*[:<] *//'

 * add a leading angle bracket and space to each line (quote a message)
 >> sed 's/^/> /'

 * delete leading angle bracket & space from each line (unquote a message)
 >> sed 's/^> //'

 * remove most HTML tags (accommodates multiple-line tags)
 >> sed -e :a -e 's/<[^>]*>//g;/</N;//ba'

 * extract multi-part uuencoded binaries, removing extraneous header
 * info, so that only the uuencoded portion remains. Files passed to
 * sed must be passed in the proper order. Version 1 can be entered
 * from the command line; version 2 can be made into an executable
 * Unix shell script. (Modified from a script by Rahul Dhesi.)
 >> sed '/^end/,/^begin/d' file1 file2 ... fileX | uudecode   ##( vers. 1 )
 >> sed '/^end/,/^begin/d' "$@" | uudecode                    ##( vers. 2 )

 * sort paragraphs of file alphabetically. Paragraphs are separated by blank
 * lines. GNU sed uses \v for vertical tab, or any unique char will do.
 >> sed '/./{H;d;};x;s/\n/={NL}=/g' file | sort | sed '1s/={NL}=//;s/={NL}=/\n/g'
 >> gsed '/./{H;d};x;y/\n/\v/' file | sort | sed '1s/\v//;y/\v/\n/'

 * zip up each .TXT file individually, deleting the source file and
 * setting the name of each .ZIP file to the basename of the .TXT file
 * (under DOS: the "dir /b" switch returns bare filenames in all caps).
 >> echo @echo off >zipup.bat
 >> dir /b *.txt | sed "s/^\(.*\)\.TXT/pkzip -mo \1 \1.TXT/" >>zipup.bat

QUOTING SYNTAX

 The preceding examples use single quotes ('...') instead of double quotes
 ("...") to enclose editing commands, since sed is typically used on a Unix
 platform. Single quotes prevent the Unix shell from interpreting the dollar
 sign ($) and backquotes (`...`), which are expanded by the shell if they are
 enclosed in double quotes. Users of the "csh" shell and derivatives will also
 need to quote the exclamation mark (!) with the backslash (i.e., \!) to
 properly run the examples listed above, even within single quotes. Versions
 of sed written for DOS invariably require double quotes ("...") instead of
 single quotes to enclose editing commands.
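
 A sketch of the one case where double quotes are wanted on Unix: letting
 the shell substitute a variable into the script. The variable name n is
 arbitrary. In the single-quoted form the dollar sign reaches sed
 untouched and means 'last line'.

```shell
n=2
nth=$(printf 'a\nb\nc\n' | sed -n "${n}p")   # the shell expands ${n} to 2
last=$(printf 'a\nb\nc\n' | sed -n '$p')     # sed itself sees the $
printf '%s\n%s\n' "$nth" "$last"
```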

USE OF TAB IN SED SCRIPTS

 This section deals with the use of the \t sequence in sed scripts.
 For clarity in documentation, we have used the expression '\t' to indicate a
 tab character (0x09) in the scripts.  However, most versions of sed do not
 recognize the '\t' abbreviation, so when typing these scripts from the command
 line, you should press the TAB key instead. '\t' is supported as a regular
 expression metacharacter in awk, perl, and HHsed, sedmod, and GNU sed v3.02.80.

VERSIONS OF SED

 Versions of sed do differ, and some slight syntax variation is to be
 expected.  In particular, most do not support the use of labels (:name) or
 branch instructions (b,t) within editing commands, except at the end of
 those commands. We have used the syntax which will be portable to most
 users of sed, even though the popular GNU versions of sed allow a more
 succinct syntax. When the reader sees a fairly long command such as this:

   >> sed -e '/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d

 it can be reduced in gnu sed to:

   >> sed '/AAA/b;/BBB/b;/CCC/b;d'      ##( or even )
   >> sed '/AAA\|BBB\|CCC/b;d'

 In addition, remember that while many versions of sed accept a command like
 "/one/ s/RE1/RE2/", some do NOT allow "/one/! s/RE1/RE2/", which contains space
 before the 's'. Omit the space when typing the command.

FREEBSD OR MACOSX SED ....

  The version of sed used on Mac OSX is the 'freebsd' version (FreeBSD is
  a descendant of the Berkeley Software Distribution of Unix). This has
  some important differences from the gnu sed which is normally used on
  linux and microsoft windows computers.

  * a semicolon must precede a closing brace on the same line
  >> echo test | sed '/t/{p}'     ##(No!! this doesnt work)
  >> echo test | sed '/t/{p;}'    ##(correct)

  * case specifiers in the replacement dont seem to work
  >> echo test | sed 's/t/\U&/'   ##(No!! doesnt work)

  * the gnu -r option for extended patterns is the -E option
  >> echo test | sed -r 's/^(.)/\1/'    ##(No!! doesnt work)
  >> echo test | sed -E 's/^(.)/\1/'    ##(Correct)

OPTIMIZING FOR SPEED

  If execution speed needs to be increased (due to large input files or
  slow processors or hard disks), substitution will be executed more
  quickly if the "find" expression is specified before giving the
  "s/.../.../" instruction.  Thus:

   >> sed 's/foo/bar/g' filename         ##( standard replace command )
   >> sed '/foo/ s/foo/bar/g' filename   ##( executes more quickly )
   >> sed '/foo/ s//bar/g' filename      ##( shorthand sed syntax )

 On line selection or deletion in which you only need to output lines from
 the first part of the file, a "quit" command (q) in the script will
 drastically reduce processing time for large files. Thus:

   >> sed -n '45,50p' filename        ##( print line nos. 45-50 of a file )
   >> sed -n '51q;45,50p' filename    ##( same, but executes much faster )
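
 A sketch showing that the early 'q' changes only the amount of work and
 not the result. The 100 numbered input lines are generated with a plain
 shell loop so that the example does not depend on any particular file.

```shell
# Build 100 lines: "1" .. "100"
nums=$(i=1; while [ "$i" -le 100 ]; do echo "$i"; i=$((i + 1)); done)

slow=$(printf '%s\n' "$nums" | sed -n '45,50p')       # reads all 100 lines
fast=$(printf '%s\n' "$nums" | sed -n '51q;45,50p')   # stops at line 51

[ "$slow" = "$fast" ] && echo same
```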

 == Sed one line contributors
 .. Al Aab - founder of "seders" list
 .. Edgar Allen - various
 .. Yiorgos Adamopoulos - various
 .. Dale Dougherty - author of "sed & awk"
 .. Carlos Duarte - author of "do it with sed"
 .. Eric Pement - author of the one-liners document
 .. Ken Pizzini - author of GNU sed v3.02
 .. S.G. Ravenhall - great de-html script
 .. Greg Ubben - many contributions & much help
 ..

DOCUMENT-NOTES:
 
  # this section contains information about the document and
  # will not normally be printed.

  # A small (16x16) icon image to identify the book
  document-icon:

  # A larger image to identify or illustrate the title page
  document-image:

  # what sort of document is this
  document-type: book

  # in what kind of state (good or bad) is this document 
  document-quality: not too bad

  # work carried out
  last-revision:

  document-history:
  @@ 2009 
     book begun
  @@ 13 april 2010
     added a little information about freebsd sed which is
     used on Mac computers.
  @@ 25 march 2011
     A few back reference notes

  # who wrote this
  authors: mjbishop at fastmail dot fm

  # a short description of the contents, possible used for doc lists
  short-description: Advanced usage of the sed stream editor

  # the main programming language (if any) for the code examples
  code-language: bash

  # the script which will be used to produce html (a webpage)
  make-html: ./book-html.sh
  # the script which will produce 'LaTeX' output (for printing, pdf etc)
  make-latex:

 
NOTES 

  * Apply substitution only on the line following a marker
  >> sed '/MARKER/{N;s/THIS/THAT/}'

  * Remove/replace newline characters.
  >> sed ':a;N;$!ba;s/\n/ /g'

  * remove leading blank lines
  >> sed '/./,$!d'

  * Efficiently extract lines between markers
  >> sed -n '/START/,${/STOP/q;p}'
 
  * Simple XML tag extract with sed
  >> sed -n 's/.*<foo>\([^<]*\)<\/foo>.*/\1/p'
 
 
  * convert upper case words to lower case
  >> sed -r "s/\<[A-Z]+\>/\L&/g"       ##(gnused, this is not very international) 
  >> sed -r "s/\<[[:upper:]]+\>/\L&/g" ##(the same, but more international)