=head1 NAME

interperl - Intermediate Perl for Sysadmins

=head1 DESCRIPTION

This is an intermediate-level training document on Perl that describes perl constructs and algorithms to improve programmer efficiency.

=head1 Introduction

Perl is a free programming language created by Larry Wall and maintained by a global group of thousands of open source volunteers. Perl has been called I and will likely forever be so. In the words of its creator, perl makes I<easy things easy and hard things possible>. It is a rich language that helps you program all manner of sysadmin tasks quickly, scale and grow them, and maintain them well through their lifetime.

=head2 Objective

The goal of this document is to introduce some intermediate-level concepts in perl for working system administrators. By practicing the concepts described in this document, you will be able to B, and would be I. NOTE: B.

=head2 Organization of this document

This material assumes the reader is familiar with I which is available in the companion document B. Readers are I. Readers are assumed to have at least one year of programming experience in at least two different languages, one of which is an interpreted language like B or B.

=head2 Additional pointers for learning

This document is not a substitute for programming, nor is it a substitute for the documentation that comes with perl! The more you code, the better you can program. What is not obvious is that the more you read, the smaller your programs need to be to get the same work done. It is B that you read the perl documentation available on your system. At the very least, you should try to read all the manual pages mentioned in this document. Reasonably competent system administrators can implement 90% of their regular tasks with minor modifications to the program snippets available in the core documentation that comes with perl. The L<"Resources"> section gives you details on where to look for complete and authoritative information.

=head2 Why Perl?

Perl is designed to be like C: flexible and powerful enough to manipulate the machine's capabilities directly. Perl is also designed to be like B: I.

=head2 TMTOWTDI

Most programming languages have a minimalist set of constructs (succumbing to I of design). There is usually I to do a particular task in such languages. Perl differs from such languages. It has been designed with redundancy in mind: multiple constructs abound that do nearly the same thing. If programming in other languages can be equated to a walk through a maze with orthogonal turns, programming in perl feels more like a walk through the grass in a park. This has led to the perl motto B<There's More Than One Way To Do It>, abbreviated to TMTOWTDI or I<Tim Toady>.

=head2 Extensible

At the time of writing, the most current version of perl is 5.8.3. Version 5 has been built with extensibility in mind. This has resulted in the largest collection of perl extensions (called Modules) and a worldwide group of volunteers who actively maintain the Comprehensive Perl Archive Network, CPAN (http://www.cpan.org/).

=head2 History

The first version of perl was released in 1987. After successive refinements, version 4 of perl was released in 1991, which also coincided with the first release of I book, I. Perl version 4 quickly became very popular. As many people started using perl for more than a few simple tasks, the limitations of the language made it difficult for people to add new features. To prevent perl from forking into many versions, a complete rewrite of perl was done and released as version 5. Perl version 5 was more extensible than version 4. It contained features for large-scale programming, added completely new features like lexical variables, closures and references, overhauled the regular expression engine, and made it possible to extend perl almost without limit.
Version 5 supports more operating systems; the standard distribution comes with a clean abstraction for database support (DBI), a Tk port to perl (Perl/Tk) and boasts a Win32 port for PCs running Microsoft operating systems (this port has since been integrated into the core perl distribution in source form). For the most current updates and feature list for perl, you should see the distribution, which is always available at http://www.perl.com/CPAN-local/

=head1 Perl Data Types

Perl provides you with three basic but powerful data types. Unlike most languages, perl allows you to grow and shrink them dynamically without ever having to worry about memory allocation or de-allocation. Perl does it all for you. The three fundamental data types in perl are called I<scalars>, I<arrays> and I<hashes>.

=head2 Scalars

A scalar is the fundamental data type in perl. A scalar can hold a single value. This value may be a string, a number, a file-handle, a typeglob, or a reference to another perl data type. Here is a translation table from C to perl:

    int, float, double  =>  scalar (numeric interpolation)
    char *              =>  scalar (string interpolation)
    FILE *fp            =>  filehandle (*STDIN)
    symbol table        =>  typeglob (*FOO{THING})
    &(struct foo)ptr    =>  reference to ANY{THING}

Here are some examples:

    $a = 'this'; print "String = $a\n";      #stores 'this' in $a
    $answer = 42; print "Number = $answer\n";
    $ref = \$a; print "Reference= ",ref($ref)," => $ref \n";
    $r = *STDIN; print "Typeglob = $r\n";
    #
    #prints:
    #
    #------------------- output start---
    # String = this
    # Number = 42
    # Reference= SCALAR => SCALAR(0x90c8170)
    # Typeglob = *main::STDIN
    #
    #------------------- output end ---

You can build a scalar from other scalars through numeric and string operations. The following examples show interpolation at work:

    $x = "2.00"; $y = 4; $z = "abc";
    print "Numeric interpolation: $x+$y gives =>", ($x+$y), "\n";
    print "String interpolation : \$z gives $z\n";
    print "Concatenation of $x . $y gives ", ($x . $y ) , "\n";
    print "String multiplication \$z x $y, gives [", $z x $y, "]\n";
    #
    #prints:
    #
    #------------------- output start---
    # Numeric interpolation: 2.00+4 gives =>6
    # String interpolation : $z gives abc
    # Concatenation of 2.00 . 4 gives 2.004
    # String multiplication $z x 4, gives [abcabcabcabc]
    #
    #------------------- output end ---

The `+' operator is the familiar numeric addition. The `.' operator is the string I<concatenation> operator: it joins its left and right operands and returns the result. The `x' operator repeats its left operand the number of times given by its right operand. As you can see, scalar values can be built dynamically, and can grow or shrink at the programmer's will.

=head2 Lists and Arrays

B. When a list of values needs to be stored somewhere, you will usually use an B<array>. Thus, an array is a list, each element of which contains a B<scalar> value. This is the most important thing you need to know about lists. As with scalars, lists can be built dynamically, and their size can be increased or decreased by adding, deleting or splicing elements at will.

Arrays act like the English word `these'. You prefix an array with the B<@> character. However, to get a single element of an array, you prefix the array name with B<$> and give the index in square brackets, because the element itself is a scalar.
Typically, you will store scalar B in an array, so you would want to do something like this:

    #
    #direct definition, literal list
    #
    $"=", ";
    @replicators = ('rna-strand', 'dna-strand', 'exon', 'intron', 'prion');
    print "Replicators: @replicators\n";
    #
    #Assign to an element
    #
    $description[0] = 'Kingdom';
    print "Description[0] = $description[0]\n";
    #
    #Push multiple elements dynamically (runtime)
    #
    push @description, split(/:/, 'Phylum:Class:Order:Family:Genus:Species');
    print "Description now: @description\n";
    #
    #Split words using the quoting operator 'qw'
    #
    @woman = qw(Animalia Chordata Mammalia Primates Hominidae Homo Sapiens);
    #
    #PRINT in 3 ways:
    #
    print "using a 'for' block and 'print'\n";
    print "$_\n" for @replicators;
    #
    print "\nusing a 'for' iterator and 'printf':\n";
    for (0..$#description) {
        printf "Linnaeus says, %-20s => %s\n", $description[$_], $woman[$_];
    }
    #
    print "\nUsing 'map', 'sprintf' to transform a list:\n";
    print map { sprintf("%02d %s\n", $_, $woman[$_]) } 0..$#woman;
    #prints:
    #
    #------------------- output start---
    # Replicators: rna-strand, dna-strand, exon, intron, prion
    # Description[0] = Kingdom
    # Description now: Kingdom, Phylum, Class, Order, Family, Genus, Species
    # using a 'for' block and 'print'
    # rna-strand
    # dna-strand
    # exon
    # intron
    # prion
    #
    # using a 'for' iterator and 'printf':
    # Linnaeus says, Kingdom              => Animalia
    # Linnaeus says, Phylum               => Chordata
    # Linnaeus says, Class                => Mammalia
    # Linnaeus says, Order                => Primates
    # Linnaeus says, Family               => Hominidae
    # Linnaeus says, Genus                => Homo
    # Linnaeus says, Species              => Sapiens
    #
    # Using 'map', 'sprintf' to transform a list:
    # 00 Animalia
    # 01 Chordata
    # 02 Mammalia
    # 03 Primates
    # 04 Hominidae
    # 05 Homo
    # 06 Sapiens
    #
    #------------------- output end ---

There are various other operations you can perform on arrays.
Here are some examples:

    #
    $"=", ";
    #
    push @a, 1, 'two';  print "A = (@a)\n";
    pop @a;             print "A is now popped to: (@a)\n";
    unshift @a, 'two';  print "A unshifted to: (@a)\n";
    shift @a;           print "after a Shift, A is: (@a)\n";
    #
    #prints:
    #
    #------------------- output start---
    # A = (1, two)
    # A is now popped to: (1)
    # A unshifted to: (two, 1)
    # after a Shift, A is: (1)
    #
    #------------------- output end ---

=head2 Hashes

The final perl data structure we will see is a hash. A hash is very much like an array, but it is indexed by strings (an array is indexed by number). A hash is like a database indexed by a single key field. Hashes are initialized by specifying the key and value in pairs. For example:

    %colors = ( 'red' => '#FF0000', 'green' => '#00FF00');
    %passwd = ( 'root' => 'ez2Krack', 'mysql' => 'se1ect!');

Hash keys are strings and hash values are scalars, so you can refer to a hash value in any place where you would need a scalar value. The individual key is enclosed within curly braces to specify that we are referring to a hash. Here is an example of adding another element to one of the above hashes by using a value stored in it:

    $colors{'blue'} = $colors{'red'};

Here is how it works. C<%colors> is the hash. Its name is I<colors>. The key for which we want to create a value is I<blue>. So the actual value is at key 'blue', which is a scalar:

    Key           =>  'blue'        =>  'blue'
    Hash          =>  curly braces  =>  {'blue'}
    Scalar value  =>  $             =>  $colors{'blue'}

Here's another:

    print("Root password is too $passwd{root}\n");

=head1 Refresher on Operations on perl variables

Perl provides many basic operations to manipulate variables. However these operations are I than in most other languages, there are groups of operations that do I, so you have a choice of programming styles.
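As a small illustration of that choice of styles, the sketch below reads the last element of an array in three interchangeable ways. The array C<@queue> and its contents are invented for this example and do not come from the rest of the document.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical data, made up for this illustration.
my @queue = ('fetch', 'parse', 'report');

# Three ways to read the last element without modifying the array.
my $by_negative_index = $queue[-1];        # negative indices count from the end
my $by_last_index     = $queue[$#queue];   # $#queue is the index of the last element
my ($by_reverse)      = reverse @queue;    # list assignment keeps the first element

print "$by_negative_index $by_last_index $by_reverse\n";
```

All three variables end up holding the same value; which form reads best is a matter of style.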
=head2 Scalar Ops: length, substr, tr, s, chomp, lc, uc, int, sprintf

Try each of the statements below and see if the result matches the comments (you can ignore anything following a '#' because those are comments):

    $dozens = int( 97/12 );      # gets 8
    print "97/12 = $dozens\n";
    $_ = 'A single sentence.';
    $l = length($_);
    print "Length of '$_' = ($l)\n";
    $is = substr($_, 9, 4);      #$is is now 'sent'
    print "Substr('$_',9,4) = $is\n";
    print "\$_ is now: '$_'\n";
    $_ =~ tr/st/tp/;             #$_ is now 'A tingle tenpence.';
    print "\$_ after tr/st/tp/: '$_'\n";
    $_ =~ s/t/s/;                #$_ is now 'A single tenpence.';
    print "\$_ after another s/t/s: '$_'\n";
    print "All upper case, '$_' is ", uc($_), "\n";
    $pi = sprintf("%.12f", atan2(1, 1)*4);
    print "PI = $pi\n";
    #prints:
    #
    #------------------- output start---
    # 97/12 = 8
    # Length of 'A single sentence.' = (18)
    # Substr('A single sentence.',9,4) = sent
    # $_ is now: 'A single sentence.'
    # $_ after tr/st/tp/: 'A tingle tenpence.'
    # $_ after another s/t/s: 'A single tenpence.'
    # All upper case, 'A single tenpence.' is A SINGLE TENPENCE.
    # PI = 3.141592653590
    #
    #------------------- output end ---

=head2 List Ops: push, pop, shift, unshift, sort, splice

    @a = (1, 2, 3);
    print "A is: @a\n";
    $last = pop @a;
    print "last element of A is: $last\n";
    #
    @sorted = sort('jack', 'jill', 'fred', 'barney');
    print "@sorted\n";             #prints `barney fred jack jill'
    #
    splice @sorted, 2, 2, 'wilma', 'betty';
    print "Spliced: @sorted\n";    #prints `barney fred wilma betty'
    #
    #prints:
    #
    #------------------- output start---
    # A is: 1 2 3
    # last element of A is: 3
    # barney fred jack jill
    # Spliced: barney fred wilma betty
    #
    #------------------- output end ---

=head2 Hashes: keys, values, each

    %h = (
        'linux' => 'Linus Benedict Torvalds',
        'perl'  => 'Larry Wall',
        'hurd'  => 'Richard M. Stallman',
        'unix'  => 'Dennis and Ken',
        'TAOCP' => 'Don Knuth',
    );
    #
    @software = keys %h;
    @authors  = values %h;
    #
    while ( ($k, $v) = each %h) {
        printf "%-10s was the brainchild of $v\n", $k;
    }
    #
    #prints (the order of hash entries is not predictable):
    #
    #------------------- output start---
    # perl       was the brainchild of Larry Wall
    # TAOCP      was the brainchild of Don Knuth
    # hurd       was the brainchild of Richard M. Stallman
    # unix       was the brainchild of Dennis and Ken
    # linux      was the brainchild of Linus Benedict Torvalds
    #
    #------------------- output end ---

=head1 Perl Expressions, Statements and Context

=head2 Expressions form Statements

Everything in perl is an I<expression>. An expression is a basic unit of program in perl that returns a result. For example, the C<print> statement in perl is actually an expression that returns a value.

    $result = print("this is the statement that prints 'Foo'\n");
    print "Result of previous stmt = $result\n";
    #
    #prints:
    #
    #------------------- output start---
    # this is the statement that prints 'Foo'
    # Result of previous stmt = 1
    #
    #------------------- output end ---

A perl statement is merely an expression evaluated for side effects. Expressions can not only B<return> results, but can also be B<assigned to> under appropriate conditions. When the return value of an expression is merely used to assign it to something else, it is said to be used as an B<rvalue>. In contrast, when you assign B<to> an expression, it is said to be used in an B<lvalue> context. Some perl functions/operations can act as I<lvalues>, which is nice.

    $_ = "ABC\n";
    $\="\n";
    print substr($_,1,1);       #prints 'B'
    substr($_, 1, 1) = 'C';
    print;                      #prints 'ACC'
    #
    #prints:
    #
    #------------------- output start---
    # B
    # ACC
    #
    #
    #------------------- output end ---

Expressions can also return different things based on the B<context> in which they are called! The two major types of context are described below. We will not discuss the I context which is a special case.

=head2 Scalar context

A scalar context expects/returns a single scalar value.
If you use an expression in a scalar context, the expression I its return value(s) are coerced into a scalar. For example:

    $count = @lines;

Here, @lines is an expression that returns a list of all elements contained in the array @lines. This expression is forced into a scalar context by the assignment statement. In a scalar context, this gives the number of elements of the array @lines. Thus, $count will really contain the I<number of elements> in the array @lines.

=head2 List context

A list context expects/returns a list of scalars. If you use an expression in a list context, the expression I its return value(s) is/are coerced into a list. For example:

    @lines = <STDIN>;

Here, @lines provides a list context to the expression C<< <STDIN> >>. This in turn makes the expression slurp the entire STDIN (until end-of-file, which is CTRL-D on Unix or CTRL-Z on Windows) and return it as a list of lines. Thus, if you were to type 10 lines in a Unix terminal followed by a CTRL-D after this statement, @lines will contain 10 elements, each of which will contain the respective line you entered.

This works for lists in general, but there is a special case of a B<literal list> that you should be aware of: a literal list behaves like a "C comma operator" in a scalar context, so it yields its last element. Here is an example to illustrate this important distinction:

    @a = (12, 0, 32, -23);
    $b = @a;
    print "b = $b\n";
    $c = (12, 0, 32, -23);
    print "c = $c\n";
    #this prints:
    #b = 4
    #c = -23

For more on context, see L.

=head1 Loops in perl: for, foreach, while

Most common tasks are repetitive. Like most languages, perl allows you to repeat a set of statements using I<looping> constructs. The two most common looping constructs are I and I.

=head2 Example for `foreach'

    #perl style
    foreach my $number (1..10) {
        print "foreach $number\n";
    }

=head2 Example loop using `for'

    #C style
    for (my $number = 1; $number <= 10; $number++ ) {
        print "for $number\n";
    }

=head2 Example loop using `while'

    my $number = 1;
    while ( $number <= 10 ) {
        print "while $number\n";
        $number++;
    }

The looping constructs are actually far more versatile.
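As one example of that versatility, here is a minimal sketch that uses the C<next> and C<last> loop controls together with the statement-modifier form of C<for>. The host names and the 'STOP' marker are placeholders invented for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input list, made up for this illustration.
my @hosts = ('alpha', '', 'beta', 'STOP', 'gamma');
my @seen;

foreach my $host (@hosts) {
    next if $host eq '';        # skip empty entries and continue with the next one
    last if $host eq 'STOP';    # leave the loop early at the marker
    push @seen, $host;
}

# Statement-modifier form: the loop follows the statement it repeats.
print "checked $_\n" for @seen;
```

The loop collects 'alpha' and 'beta' only: the empty entry is skipped, and everything after the marker is never visited.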
For the full details, you should start at L. For more on this, see L.

=head1 Perl builtin variables

Perl has builtin variables that take on certain B<`sensible'> values at runtime. As we noted before, I. In the absence of an explicit assignment, some of the expressions take default arguments. I. In other cases, changing the settings of some internal variables will make the succeeding lines in the perl program snippet behave differently (like B or I). Here are some examples without explanations:

=head2 @ARGV

    #!/usr/bin/perl -w
    use strict;
    my $arg;
    foreach (@ARGV) {
        $arg++;
        print "Argument $arg: $_\n";
    }

=head2 %ENV

    while (($key, $value) = each %ENV) {
        print "$key=$value\n";
    }

=head2 @INC

This is the include path for perl libraries.

    foreach (@INC) {
        print "$_\n";
    }
    #
    $file = 'CPAN.pm';
    foreach (@INC) {
        print "Found $file under $_/$file\n" if ( -f "$_/$file");
    }
    #
    #prints:
    #
    #------------------- output start---
    # Found CPAN.pm under /usr/lib/perl5/5.8.3/CPAN.pm
    #
    #------------------- output end ---

You can add to this search path within your program. For example, if you have installed the latest cool whiz-bang version of Foo::Bar under your $HOME/lib directory, here is what you would do:

    use lib '/my/home/dir/lib';
    use Foo::Bar;
    {
        #...whatever...
    }

=head2 $_ = default input and pattern search space

Example:

    while ( <> ) {
        split;
    }

Is the same as the more elaborate:

    while ( defined($var = <>) ) {
        @_ = split " ", $var;
    }

=head2 @_ = default arguments for subroutines, default destination of 'split'

As explained in the example above, the default destination of a split is @_. (This applies to the perl version this document describes; from perl 5.12 onwards, split in void or scalar context no longer overwrites @_, so assign the result explicitly.) In the context of a subroutine call, @_ contains all the arguments to the subroutine. Note that perl subroutines can have a variable number of arguments on I<each> invocation. @_ will automatically be sized accordingly. Since B<@_> is a global variable, the I<previous> value of @_ is restored as soon as the subroutine call ends!

=head2 $. $/ $\ : File I/O counter, record separators

When you use the E<lt>E<gt> operator to read data from a file, perl automatically stores the I<current line number> in a variable named B<$.>. How does perl know where a line ends and the next one begins? Well, that is what the record separator variable, B<$/>, is for! As with most perl predefined variables, this takes on a default value: $/ defaults to "\n". Here is a way to read in a whole file to a single scalar, if you have lots of memory to burn:

    undef $/;       # no record separator: read everything in one go
    open(INPUT, 'tail -5 /var/log/messages|') || die "/var/log/messages: $!\n";
    $slurp = <INPUT>;
    close INPUT;
    print $slurp;
    #
    #prints:
    #
    #------------------- output start---
    # May 9 19:26:42 mithya.sarvam.com root: Test 1
    # May 9 19:26:50 mithya.sarvam.com root: Test 2
    # May 9 19:26:52 mithya.sarvam.com root: Test 3
    # May 9 19:26:54 mithya.sarvam.com root: Test 4
    # May 9 19:26:57 mithya.sarvam.com root: Test 5
    #
    #------------------- output end ---

Note that setting $/ to the empty string does something different: it selects paragraph mode, where records are separated by blank lines.

Similarly, every B<print> statement will tack on the value of the builtin variable B<$\> to every line/record you write. This variable is empty by default, but you can change it if you want to. See the B<-p> and B<-l> switches in L<perlrun> for more usage information.

=head2 $0, $$ : program name, PID

Type the following example into a test program and run:

    #!/usr/bin/perl -w
    print "I am called as $0\n";
    print "My PID is $$\n";
    #
    #prints:
    #
    #------------------- output start---
    # I am called as /tmp/codeliver.out
    # My PID is 27485
    #
    #------------------- output end ---

=head2 $! : O/S Error string or Errno

    for (1..10) {
        $! = $_;
        print "$_ => $!\n";
    }
    print STDERR "File /etc/nosuchfile: $!\n" unless -f '/etc/nosuchfile';
    #
    #prints:
    #
    #------------------- output start---
    # 1 => Operation not permitted
    # 2 => No such file or directory
    # 3 => No such process
    # 4 => Interrupted system call
    # 5 => Input/output error
    # 6 => No such device or address
    # 7 => Argument list too long
    # 8 => Exec format error
    # 9 => Bad file descriptor
    # 10 => No child processes
    # File /etc/nosuchfile: No such file or directory
    #
    #------------------- output end ---

=head2 $?, $@ - Errors from child/pipe/eval

Example:

    `/etc/nowhere/hostname`;
    print "\$? = $?\n";
    eval qq{open(F, '/tmp/nosuchfile') or die "nosuchfile: $!"};
    print "\$@ =\n$@\n";
    #
    #prints:
    #
    #------------------- output start---
    # $? = -1
    # $@ =
    # nosuchfile: No such file or directory at (eval 1) line 1.
    #
    #
    #------------------- output end ---

=head2 $<, $>, $(, $) : real, effective uid/gid

    print "Real: $<, Effective: $>\n";

See L<perlvar> for more information.

=head1 Commonly used Operators in perl

=head2 Logical Operators

Logical operators return true or false. Perl has all the standard logical operators. However, the meaning of true and false is different in perl, because perl considers strings and numbers to be the same data type: scalar. Here is a quick overview of truth as it applies to perl scalars:

    The empty string "" is false.
    The string "0" is false.
    Any number that evaluates to 0 is false.
    Any undefined value is false.
    All else is true.

Sometimes, this is surprising:

    print "Yes, string '0.0' is ''\n" if ( "0.0" == '');
    print "What, string '0.0' is 'true'??\n" if ( "0.0" );

In line 1, we see that the string "0.0" is converted to 0 in the numeric context of the I<==> operator. The empty string on the right side is similarly converted to 0, so the comparison succeeds. However, in line 2, the string "0.0" is neither empty nor exactly "0", so it evaluates to I<true> according to the rules. Thus, the print statement does get executed.

Logical operators available in perl are I<&&>, I<||> and I<!>.
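The truth rules listed above can be checked with a short sketch. The sample values and their labels are chosen here for illustration and are not part of the original text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each entry is [label, value]; the labels exist only for display.
my @samples = (
    [ 'empty string', ''    ],
    [ 'string "0"',   '0'   ],
    [ 'number 0',     0     ],
    [ 'undef',        undef ],
    [ 'string "0.0"', '0.0' ],
    [ 'string "00"',  '00'  ],
    [ 'string " "',   ' '   ],
);

my %verdict;
for my $sample (@samples) {
    my ($label, $value) = @$sample;
    # The ?: operator evaluates $value in boolean context.
    $verdict{$label} = $value ? 'true' : 'false';
    printf "%-14s is %s\n", $label, $verdict{$label};
}
```

The first four samples come out false and the last three come out true, because only the empty string, the exact string "0", numeric zero and undef count as false.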
The logical I<&&> and I<||> operators are I<short-circuit> operators, like in C. This means that the second operand is evaluated only when it is necessary. Here are some examples:

    $home = $ENV{HOME} || (getpwuid($<))[7] || die "No home directory!\n";
    print "Your machine is wide open!\n" if ( $> && $< && -r "/etc/shadow");

For more on this, see L.

=head2 Binding operators

When you need to match a string with a pattern or make changes to it using a I match and replace, you use the I<binding> operator, B<=~>. To negate the logical sense of a match, you use the B<!~> operator. Here are some examples:

    for my $host (qw(www.google.com samba.net.au)) {
        if ( $host =~ /\./ ) {
            print "$host seems to be fully qualified!\n";
            if ($host !~ /\.(com|org|edu|mil|gov|net)$/ ) {
                $country = $host;
                $country =~ s#.*\.##;   #remove everything except the TLD marker
                print "Its country of origin is: $country\n";
            } else {
                ($tld = $host) =~ s/.+\.//;
                print "Host is a canonical TLD [.$tld]\n";
            }
        }
    }

=head2 Additional logical operators not found in C

In addition to && and || for logical operations, perl provides B<and> and B<or>. These behave identically to I<&&> and I<||> except that they have very low precedence. I<Precedence> determines the order of evaluation within a single statement. Here is an example where not knowing the precedence might bite you (in fact, the perl and/or operators were designed just so that people don't make this mistake).

Perl allows you to call functions without using parentheses around the arguments. If you need to open a file, here is how you'd do it I<with> parentheses around the arguments, without checking the return value:

    open(FOO, '/etc/passwd');

This can also be written conveniently as:

    open FOO, '/etc/passwd';

These two function calls work exactly the same way.
Now, if you need to add some error checking of the return value of the I<open> call, you would do something like this:

    open(FOO, 'bar') || die "bar: $!\n";

The seemingly equivalent

    open FOO, 'bar' || die "bar: $!\n";

parses as:

    open(FOO, 'bar' || die "bar: $!\n");

This is not what we want: since 'bar' is a true value, the die is never reached, even when the open fails. In this situation the I<or> operator comes to the rescue. Thus, it is better written as:

    open FOO, 'bar' or die "bar: $!\n";

=head2 Variables and Quoting operators

Variable names can contain B. The first character should not be a digit. To store a value within a variable, you B the value if it is a string, or use a B. In addition to the standard quoting characters, perl provides additional syntax to simplify the creation of strings with embedded quotes. These are the B<q>, B<qq>, B<qw> and B<qr> operators. These operators are flexible in that you can use I<any> character as the quoting character. For example, instead of the curly braces, you can use the B<#> character as the quoting character:

    $something = q#Single quoted#;
    $nother = qq#Not '$something'#;
    $crazy = 'Please don\'t use \'\' within this string';
    $ok = q{Please don't use '' within this string};
    $foo = "Mail us";
    $foobetter = qq{Mail us};
    for (qw(something nother crazy ok foo foobetter)) {
        eval qq{print "$_ = \$$_\n";}
    }
    $ip_patt = qr{^\d+\.\d+\.\d+\.\d+};
    print "127.0.0.1 matches $ip_patt!\n" if ( '127.0.0.1' =~ /$ip_patt/ );

For more on perl operators, see L.

=head2 I/O Operations: Standard Filehandles

Following the Unix convention, perl provides three default Filehandles that are direct analogues to C: I<STDIN>, I<STDOUT> and I<STDERR>. In the absence of an explicit Filehandle, the magical diamond operator (E<lt>E<gt>) reads from the files named on the command line, or from STDIN if there are none. In the absence of an explicit Filehandle, your I<print> statements automatically print to STDOUT (You override this by using the L