The Perl Programming Language

last revision: 27 October 2011, 6:33pm
book quality: just begun, poor

The book is set out as a series of "recipes" in the style of a "cookbook". The perl language and its many modules are a large topic, and this document has only just been begun.

Perl

Perl is a language which was originally inspired by Unix shell syntax (along with awk, sed and C), as well as by the idea of writing terse but powerful programs. The name perl did not begin as an acronym; its creator, Larry Wall, said he was looking for any name with "positive connotations". Perl initially rose to fame through its suitability for writing web-server cgi scripts, since perl, like the unix shells, uses plain text as a kind of "data interchange format".

The weaknesses of perl: it moves away from the Unix idea of using small programs which each do one thing and are linked together with pipes or streams; it has no built-in windowing commands; its ease of use may encourage bad programming or attract bad programmers.

This section will concentrate on one line perl programs and integrating those programs with the bash shell.

Learning Perl

display the introduction to the perl documentation
 man perl
 perldoc perl    the same

set variables f="bi" and g="sm a" using eval and a perl one liner.

 eval $(perl -e 'print "f=bi;";print "g=\"sm a\"\n"')
this demonstrates "exporting" variables from perl to the parent shell
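
Values which contain spaces or quote characters need care before they are handed to eval. A minimal sketch of one way to do this, assuming single-quote escaping is enough for the shell in use; the sub name shell_quote and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a value in single quotes for the shell,
# turning each embedded ' into '\'' so that eval sees it literally.
sub shell_quote {
    my ($value) = @_;
    $value =~ s/'/'\\''/g;
    return "'$value'";
}

# Print one assignment per line, ready for the parent shell to eval.
my %vars = (f => 'bi', g => 'sm a');
for my $name (sort keys %vars) {
    print "$name=", shell_quote($vars{$name}), "\n";
}
```

If this were saved in a file (say export.pl, a hypothetical name), the parent shell could read it with eval "$(perl export.pl)".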

check the syntax of a perl one-line program but don't run it

 perl -wc -e 'print "\n\n"'

Gotchas

in a one-line script, print should end with "\n", otherwise the output can be hidden by the shell prompt which follows it

 perl -e 'if ( -d ".") { print "folder"; }' doesn't seem to print 'folder' (no newline)
 perl -e 'if ( -d ".") { print "folder\n"; }' correct: prints 'folder'

most one line perl scripts should be enclosed in single quotes

 perl -e 'if (!$s) { print "s has no value\n" }'
if double quotes were used, an interactive bash shell would expand "!$" (history expansion) before perl ever saw the script

lines read from standard input end with a newline, so you must use '-l' or 'chomp'

 ls | perl -ne 'if (-T "$_") {print "$_ text"}' ##(doesn't work since $_ has \n)
 ls | perl -lne 'if (-T "$_") {print "$_ text"}' | less         this works
 ls | perl -ne 'chomp; if (-T "$_") {print "$_ text\n"}' | less this works
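
The same newline problem appears in ordinary scripts. A small sketch, with made-up file names and a hypothetical sub called clean_lines, showing what chomp does to a list of lines:

```perl
use strict;
use warnings;

# Hypothetical helper: strip the trailing newline from each line,
# which is what the -l switch or an explicit chomp does for stdin.
sub clean_lines {
    my @lines = @_;
    chomp @lines;      # chomp on an array removes one trailing newline per element
    return @lines;
}

my @raw   = ("a.txt\n", "b.txt\n");
my @clean = clean_lines(@raw);
print "[$_]\n" for @clean;   # the brackets make a stray newline easy to spot
```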

Perl Documentation

Documentation for the perl language is installed along with the language

view the start page for the perl documentation

 man perl
 perldoc perl   the same, more useful on ms windows

query the perl "frequently asked questions" documents for a word or phrase

 perldoc -q eval

save the entire "perlfunc" man page in the file "file.txt"

 perldoc -T perlfunc > file.txt

query the perl faqs for the word 'file'

 perldoc -q ' file'   ' file' seems to work better than 'file'

show the documentation for the CGI module

 perldoc CGI   these names are case-sensitive, "perldoc cgi" doesn't work
 man CGI

Pod The Perl Documentation Format

debian: perl-doc - to use the perldoc tool

The perl documentation format is known as the "pod" format and is accompanied by a variety of tools to transform it to other documentation formats. The "perldoc" tool can be used to query the perl documentation or the documentation for a module.

Perl Modules

Modules are libraries of code which carry out specific tasks and save the programmer large amounts of time. One of the strengths of perl is the very large number of open-source modules available.

some perl documentation pages
man perl - the start page contains references to lots and lots of docs
man perlintro - an introduction
man perltoc - a table of contents of perl documentation

show all packages which have perl in the name or short description

 apt-cache search 'perl' | grep perl | sort | less

use a module

 perl -Mmodule -e "print 'hello'"

use the CGI module with the "qw(:standard)" option

  perl -MCGI=:standard -e "print header, h1('hello')"

the cpan program can be used to find and install perl modules


some perl modules may be installed via debian packages or via cpan

 sudo apt-get install libgd-barcode-perl

check if the LWP module is installed

 perl -MLWP -e1

print the version number of the LWP module

 perl -MLWP -e 'print "$LWP::VERSION\n"'

Using The Lwp Module

a set of "recipes" for using the lwp module

download a webpage for processing

 perl -e 'use LWP::Simple; $doc = get "";'

download and display a url using the lwp perl module

 perl -MLWP::Simple -e 'getprint "http://url"'

check if a document exists

  use LWP::Simple; if (head($url)) { print "ok, document exists\n" }

Using The Cgi Module

send error messages generated by perl to the browser

 use CGI;
 use CGI::Carp qw(fatalsToBrowser);

print a very simple document with the cgi module

     use CGI;
     my $cgi = CGI->new;
     print $cgi->header(); print $cgi->start_html();
     print "hello cgi"; print $cgi->end_html;

indicate the title of the document using the cgi module

 print $cgi->start_html( -title=> "testing cgi")

show the file parameter which was sent from an html form

     use CGI;
     my $cgi = CGI->new;
     print $cgi->header(); print $cgi->start_html();
     print "the file parameter is:", $cgi->param('file');
     print $cgi->end_html;

access the cgi "environment" variables from perl

 $sDocumentRoot = $ENV{'DOCUMENT_ROOT'};

Some Useful Modules

perl modules documentation
man perlmod - how modules work
man perlmodlib - how to write and use a perl module
man perlmodinstall - how to install from CPAN

Windowing Programs

 use Tk;

Cpan The Online Perl Code Repository

Cpan stands for "comprehensive perl archive network" and is a repository of open-source code modules and libraries which can be used to ease the task of the programmer. "cpan" is also an interactive program which allows one to find and download these modules from a command line. The name "cpan" was modelled on "ctan", which is the "comprehensive tex archive network".

the cpan home site
some information about cpan
run the "cpan" interactive program
 sudo cpan   as a quick fix for permissions problems

show the documentation for the "cpan" module

 perldoc CPAN

"cpanplus" is a more modern alternative to cpan

install the latest version of cpan, with passive ftp for firewalls

 perl -MCPAN -e '$ENV{FTP_PASSIVE} = 1; install CPAN'
 install CPAN the same, but from within the "cpan" program

search the cpan site for the documentation for the "LWP::UserAgent" package

Using The Cpan Program

problems with the "cpan" program:
- it doesn't tell you how big a module is before you install it.

run the start up configuration for the cpan program

    o conf init

install "history" support for cpan (the up arrow obtains the previous command)

    install Term::ReadKey Term::ReadLine::Perl  ##("install Term::ReadKey Term::ReadLine" didn't work)

show the short help for the cpan program

 h | less

show details about the module whose name is CGI

 m CGI     the exact module name must be written, case sensitive

show all modules which have the text "CGI" in their names

 m /CGI/ | less   this is a case insensitive search
problem, hitting 'q' in less exits cpan

show information about the CGI module (using CPAN non-interactively)

 perl -MCPAN -e' CPAN::Shell->m("CGI")' | less

show a short description for all modules which have "CGI" in the name

 perl -MCPAN -e' CPAN::Shell->m("/cgi/")' | less  this is quite slow
these searches are Not case sensitive

show all available modules on cpan (approximately 70000)

 perl -MCPAN -e' CPAN::Shell->m()' | less  this will be VERY slow
this command took 3 minutes on my ASUS netbook computer

Perl One Line Scripts

some useful modules
HTML::LinkExtor - extract links from html
File::Find - find files
Getopt::Long - get long and short options for a script
Cwd - print the current working folder
URI::URL - extract portions of a url
File::Basename - get the folder and filename
File::Path - make folders and delete them (mkpath rmtree)
Benchmark - time how long perl code takes to run
Data::Dumper - creates a string representation of arrays and hashes
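
As an illustration of one module from this list, here is a minimal sketch of Data::Dumper printing a nested structure. The hash contents are invented for the example.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order makes the output easier to compare

# A made-up nested structure: a hash holding a string and an array.
my %config = (colour => 'green', sizes => [1, 2, 3]);

my $dump = Dumper(\%config);    # pass a reference, not the hash itself
print $dump;
```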

print files in the current folder which are text files. All the following versions do the same thing

 ls | perl -lne '-T and print'  a possible problem with spacey filenames
 ls | perl -lne '-T && print'
 ls | perl -lne 'print if -T'
 ls | perl -lne '-T "$_" and print'
 ls | perl -lne 'if (-T "$_") {print "$_"}'
 ls | perl -lne '-T "$_" and print "$_"'
 ls | perl -lne '-T $_ and print $_'
 ls | perl -lne '(-T "$_") && (print "$_")'
 ls | perl -ne 'chomp; if (-T "$_") {print "$_\n"}'

The large list of commands above, all of which do the same thing, shows the flexibility of the perl syntax. Perl allows certain things to be implied (just like in a natural language). The most common thing which is implied is "$_", which can be translated as "that" and, in a loop, is generally the current line or variable.
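
The implied "$_" works the same way inside a script. A short sketch, with a hypothetical sub name and word list, where each shorthand is annotated with its long form:

```perl
use strict;
use warnings;

# Hypothetical helper: return the words that contain an 'r', upper-cased.
sub shout_r_words {
    my @found;
    for (@_) {               # no loop variable is named, so each element is in $_
        next unless /r/;     # short for: next unless $_ =~ /r/
        push @found, uc;     # short for: uc($_)
    }
    return @found;
}

print "$_\n" for shout_r_words(qw(tree moss rock));
```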

include 2 perl expressions by repeating the -e switch

 perl -e 'print "Hello";' -e 'print " World\n"'

print the 2nd field of the input (fields delimited by spaces)

 echo a b c | perl -lane 'print $F[1]' the -n switch loops without printing

print the 1st and 2nd fields of the input lines

 echo a b c | perl -lane 'print "@F[0..1]"'

print the first field of a password file (splitting on the ':' character)

 perl -F: -lane 'print $F[0] if !/^#/' /etc/passwd
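
Roughly speaking, the -l, -a and -F switches chomp each line and split it into the array @F. A sketch of the same steps written out by hand; the sub name first_fields and the sample lines are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first ':'-separated field of each line,
# skipping comment lines, much like  perl -F: -lane 'print $F[0] if !/^#/'
sub first_fields {
    my @result;
    for my $line (@_) {
        chomp(my $copy = $line);     # -l strips the newline
        next if $copy =~ /^#/;       # skip comment lines
        my @F = split /:/, $copy;    # -a with -F: fills @F in this way
        push @result, $F[0];
    }
    return @result;
}

my @sample = (
    "root:x:0:0:root:/root:/bin/bash\n",
    "# a comment\n",
    "bob:x:1000:1000::/home/bob:/bin/sh\n",
);
print "$_\n" for first_fields(@sample);
```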

print lines which dont contain the letter 'b'

 (echo a; echo b) | perl -nle 'print if !/b/'

print the 3rd line of a file

 perl -nle 'print if $. == 3' file.txt

perl command line switches
-p - loops over each input line and prints it
-n - loops over each input line but doesnt print it
-l - remove newline characters when read and restore when writing
-e - specify a perl expression to run; should be the last switch used
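
The -n and -p switches both wrap the expression in a loop which reads the input line by line. A rough sketch of that loop written out by hand, using an in-memory filehandle in place of standard input; the sub name filter_lines is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: apply a substitution to every line of a filehandle
# and collect the result, imitating  perl -pe 's/aaa/bbb/'
sub filter_lines {
    my ($in) = @_;
    my $out = '';
    while (my $line = <$in>) {    # -n and -p both supply a loop like this
        $line =~ s/aaa/bbb/;
        $out .= $line;            # -p prints the line here; -n leaves printing to you
    }
    return $out;
}

my $text = "aaa one\ntwo aaa\n";
open(my $in, '<', \$text) or die "cannot open in-memory file: $!";
my $result = filter_lines($in);
close $in;
print $result;
```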

print every line except the first

 perl -nle 'print if $. > 1' file.txt

Printing To Standard Output

print the results of 2 functions to standard output

 print header(), footer();
 print header, footer;     the same

print a string and a function result to standard output

 print "Your name is", name();

print a single-quoted string with the q{} operator

 perl -e 'print q{#!/usr/bin/perl}'   the q{} delimiters act as quotes and are not printed

General Perl Syntax

Loops

The Foreach Loop

loop through the elements of an array

    @names = ('Larry', 'John', 'Jack');
    foreach (@names) { print $_."\n"; }

loop through the elements of a literal list

 foreach (qw/one 2 three 4/) { print $_."\n"; }
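
A foreach loop can also name its own loop variable instead of relying on "$_". A minimal sketch; the sub name numbered is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: pair each element with its position in the list.
sub numbered {
    my @items = @_;
    my @out;
    foreach my $i (0 .. $#items) {    # $#items is the index of the last element
        push @out, "$i: $items[$i]";
    }
    return @out;
}

my @names = ('Larry', 'John', 'Jack');

foreach my $name (@names) {           # $name replaces $_ inside the loop
    print "$name\n";
}
print "$_\n" for numbered(@names);
```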

If Statement

the if statement has a c-like syntax

 if (test) {... }

Global Variables

import the variables $TRUE, $FALSE etc

 do ''; use vars qw($TRUE $FALSE $LANGUAGE);

Different Types Of Quotes

print every line of a file except the first

 perl -nle 'print unless $. == 1' file.txt

Local Variables

create a local variable

 my $string;

Array Variables

assign a list of text files in the current folder to the array @list

 @list = grep { -f && -T } glob('*');

show all text files in the current folder

 perl -e 'print join "\n", grep {-T}<*>'

String Variables

exit if the "s" variable has no value

 perl -e 'if (!$s) { die "s has no value"; }'

join two strings together

 $s = "green"."tree";   $s is now 'greentree'

append a 'here document' to a string

    $s .= <<ENDS;
    A multiline
    string variable
    ENDS

show the number of occurrences of the 's' character in the variable "$text"

 $i = ($text =~ tr/s//);

Here Documents

A 'here document' is a way to print a large amount of text, or to assign that text to a variable, without having to use lots of quote characters or escape special characters. The syntax of the 'here document' was based on (but is slightly different to) the syntax of the Unix shell equivalent.

assign a here document to a string

    $s = <<ENDS;
    A multiline
    string variable
    ENDS
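
A here document ends at a line which contains only the terminator word, starting in the first column. A minimal sketch showing an interpolating and a literal here document; the variable names and text are invented.

```perl
use strict;
use warnings;

my $name = "perl";

# Bare (or double-quoted) terminator: variables are interpolated.
my $s = <<ENDS;
A multiline
string about $name
ENDS

# Single-quoted terminator: the text is taken literally.
my $raw = <<'ENDS';
No $name here
ENDS

print $s, $raw;
```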

Matching Text With Regular Expressions

quote characters
qq - can be used anywhere " can be used
qw - quotes a list of words, e.g. qw/one 2 three/
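
A small sketch of the quote operators side by side, including q (the single-quote form, which is not in the list above). The variable names are made up.

```perl
use strict;
use warnings;

my $who = "World";

my $single = q{Hello $who};       # like '...': no interpolation
my $double = qq{Hello "$who"};    # like "...": interpolates, inner " needs no escape
my @words  = qw/one 2 three/;     # like ('one', '2', 'three')

print "$single\n$double\n@words\n";
```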

print file names in the current folder which have 'tree' in the name

 ls | perl -lne 'print if /tree/'
 ls | perl -lne 'print if $_ =~ /tree/'        the same but unnecessary
 ls | perl -lne 'print($_) if ($_ =~ /tree/)'  the same again
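
A match can also capture part of the text with parentheses. A minimal sketch, assuming a made-up sub named extension which pulls the extension out of a file name:

```perl
use strict;
use warnings;

# Hypothetical helper: return the extension of a file name, or undef if none.
sub extension {
    my ($file) = @_;
    return $file =~ /\.(\w+)$/ ? $1 : undef;   # $1 holds the first capture group
}

for my $f ("tree.txt", "README") {
    my $ext = extension($f);
    print "$f: ", (defined $ext ? $ext : "no extension"), "\n";
}
```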

check if something doesn't match

 "Hello World" !~ /World/

String Substitutions

This section deals with replacing a string or a pattern with another string. This is one of Perl's particular strengths, giving rise to its reputation for being a "text oriented" programming language.

delete all occurrences of the new-line character in a string

 $text =~ tr/\n//d;   without the "d" modifier, tr only counts the characters
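
The difference between counting, deleting and substituting is easy to get wrong, so here is a short sketch with an invented sample string:

```perl
use strict;
use warnings;

my $text = "green\ntree\n";

my $count = ($text =~ tr/\n//);      # no /d: counts the newlines, $text is unchanged
(my $flat = $text) =~ tr/\n//d;      # /d: deletes them from the copy
(my $swapped = $text) =~ s/e/E/g;    # s///g: replaces every match in the copy

print "$count\n";
print "$flat\n";
print $swapped;
```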

Perl Special Variables

$_
this contains the current line when looping through the standard input, or else the current element from any list. Use "chomp" to remove the newline from this when necessary.

Using Files

test if a file exists and exit if it does not

 if (!-e 'index.txt') { die "the file doesnt exist"; }

test if a file is actually a folder

 if (-d "work") { print "'work' is a folder\n" }

test if a file is a plain text file

 if (-T "index.txt") { print "index.txt is a text file\n" }

test if a file can be executed

 if (-x "") { print "index.txt is executable\n" }
 -x "" and print "index.txt is executable\n" the same
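
The file tests can be combined in a small function. A sketch which builds its own scratch folder so that it can be run anywhere; the sub name describe and the file names are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: describe a path using the file test operators.
sub describe {
    my ($path) = @_;
    return "missing" unless -e $path;
    return "folder"  if -d $path;
    return "text"    if -T $path;
    return "other";
}

my $dir  = tempdir(CLEANUP => 1);     # scratch folder, removed when the script exits
my $file = "$dir/index.txt";
open(my $fh, '>', $file) or die "cannot write $file: $!";
print $fh "green tree\n";
close $fh;

print describe($dir), "\n";
print describe($file), "\n";
print describe("$dir/nothing"), "\n";
```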

Opening Files For Reading Or Writing

attempt to open a file for reading, and, if that fails, show the error message

 open(my $fh, '<', 'index.txt') or die "Can't open file: $!";  "$!" has the error

Write To A File

write a string to a file (any previous file contents are destroyed)

    $s = "green tree";
    open(my $fh, '>', 'index.txt') or die "Could not open the file for writing: $!";
    print $fh $s;
    close $fh;
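
A sketch of the three common open modes (write, append, read) using lexical filehandles. It writes to a scratch folder rather than to index.txt in the current folder; the file contents are invented.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $file = tempdir(CLEANUP => 1) . "/index.txt";   # scratch location for the demo

open(my $out, '>', $file) or die "cannot write $file: $!";     # '>' truncates
print $out "green tree\n";
close $out;

open($out, '>>', $file) or die "cannot append to $file: $!";   # '>>' appends
print $out "blue sky\n";
close $out;

open(my $in, '<', $file) or die "cannot read $file: $!";
my @lines = <$in>;      # in list context this reads every line
close $in;

print @lines;
```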

Copying Files

copy a file to another name or exit if it is not possible

    use File::Copy; my $f = "list";
    copy($f, "$f.1") || die "could not copy file $f, because: $!";

Creating A Temporary File

create a temporary file, without ever knowing its name

    use IO::File;
    $fh = IO::File->new_tmpfile() or die "Couldn't make the temp file: $!";

another way

 use File::Temp
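
A minimal sketch of the File::Temp approach, for the case where the name of the temporary file is needed. The text written to the file is made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# tempfile returns an open filehandle and the name of the new file.
my ($fh, $filename) = tempfile(UNLINK => 1);   # UNLINK removes the file at exit
print $fh "scratch data\n";
close $fh;

my $size = -s $filename;                       # size of the file in bytes
print "wrote $size bytes to a temporary file\n";
```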

File Globbing

'file globbing' refers to expanding a wildcard character (such as '*' or '?') into a list of valid file names for the local computer.

display files with a '.txt' extension in the current folder

 perl -e 'for (glob("*.txt")) { print $_."\n"}'
 perl -e 'foreach (glob("*.txt")) { print $_."\n"}'     the same
 perl -e 'foreach $f (glob("*.txt")) { print $f."\n"}'  the same again

Using Folders Or Directories

print the directory part of a file name

    use File::Basename; 
    print dirname("/home/username/index.txt");
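
File::Basename can also return the file name and the suffix. A short sketch using the same example path:

```perl
use strict;
use warnings;
use File::Basename qw(dirname basename fileparse);

my $path = "/home/username/index.txt";

print dirname($path), "\n";     # the folder part
print basename($path), "\n";    # the file name part

# fileparse splits a path into name, folder and any suffix matching the pattern
my ($name, $dir, $suffix) = fileparse($path, qr/\.[^.]*/);
print "$name | $dir | $suffix\n";
```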

Perl Standard Functions

show the documentation for the perl built-in functions

 perldoc perlfunc

Writing Perl Functions

append the result of a function to a scalar variable

 $sOutput .= listData($configFile, $siteRoot, '');
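
A sketch of how a function like this might be defined and called. The sub list_items and its arguments are invented for illustration and are not the listData function used above.

```perl
use strict;
use warnings;

# Hypothetical function: format a list of items, one per line,
# with an optional prefix (defaults to '- ').
sub list_items {
    my ($items, $prefix) = @_;            # arguments arrive in the array @_
    $prefix = '- ' unless defined $prefix;
    return join '', map { $prefix . $_ . "\n" } @$items;
}

my $output = "Shopping:\n";
$output .= list_items(['milk', 'bread']);    # append the returned string
$output .= list_items(['eggs'], '* ');
print $output;
```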

Importing Other Files

import the file "" which contains a function and is in the "lib" folder

 require 'lib/';

Text Files

change the first aaa on each line to bbb and print each line

 perl -p -e 's/aaa/bbb/' test.txt    the file is not changed

change the first aaa on each line to bbb, editing the file in place

 perl -pi -e 's/aaa/bbb/' test.txt    the file IS changed

replace the word big with small in .txt files and backup to .bak

 perl -p -i.bak -e 's/\bbig\b/small/g' *.txt

recursive replacement of text in this and subdirectories

 perl -p -i.bak -e 's/\bbig\b/small/g' $(find ./ -name "*.txt")
 perl -p -i.bak -e 's/\bbig\b/small/g' $(grep -ril text *)

insert one line in a text file

 use Tie::File
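
Tie::File presents the lines of a file as an array, so inserting a line becomes an array operation. A minimal sketch working on a scratch file with invented contents:

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Make a scratch file with two lines in it.
my ($fh, $file) = tempfile(UNLINK => 1);
print $fh "line one\nline three\n";
close $fh;

tie my @lines, 'Tie::File', $file or die "cannot tie $file: $!";
splice @lines, 1, 0, "line two";    # insert before the second line
untie @lines;                       # flush the changes back to the file

open(my $in, '<', $file) or die "cannot read $file: $!";
my $content = do { local $/; <$in> };   # read the whole file at once
close $in;
print $content;
```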


perl documentation for regular expressions
man perlrequick - a quick introduction
man perlretut - a more in-depth look
man perlreref - a quick reference for perl regular expressions
man perlre - a complete reference