Introduction to Perl

Perl, the Practical Extraction and Report Language, is an extremely flexible scripting language that you can use to do extraordinary things: analyzing, taking apart, and transforming text in a host of ways; performing complex transactions over computer networks; bringing web pages to life; or producing prototypes of complex systems in a fraction of the time it would take in another language.

You'll find a large collection of excellent Perl tutorials on the Web at:

http://www.programmingtutorials.com/

The following tutorial requires a basic understanding of the Unix operating system, in particular the ability to use a Unix text editor such as vi or pico.

Command-Line Execution

Perl can be executed at a command prompt.
  1. Try the following:
        perl -e 'print "Hello World\n";'
        
  2. Another one to try is:
        who | perl -n -e 'print if (/ist/);'
        

Your First Perl Script

To write your first Perl script, you need to start a text editor such as pico or vi.
  1. Enter the following lines in your text editor.
    #!/usr/bin/perl
    print "Hello World\n";
    
  2. Now save this file as HelloWorld
  3. Before you can execute this program, you will need to turn on execute permissions:
    chmod +x HelloWorld
  4. To run the file simply enter:
    ./HelloWorld
The line "#!/usr/bin/perl" tells the operating system that this is a Perl program and that the perl command is in the /usr/bin directory. This line is required at the beginning of every Perl program that you run this way. The path may vary depending on where Perl is installed on your system and which version you want to use.

Variables

Perl has two types of variables: scalars and arrays. Scalars hold a single value, either a string or a number, and are identified by a $. Arrays hold multiple values; they're just lists of scalars and are identified by an @ (ordinary arrays) or a % (associative arrays, also called hashes).

Scalars

Scalars can take on numeric values or strings. Perl does not require type declarations, making it convenient for small applications or prototyping.
  1. Let's take a simple program (priority.pl) like the following:
    #!/usr/bin/perl
    $priority = 9;
    print "Priority is $priority\n";
    
    Note that variables are interpolated inside double quotes; not so inside single quotes.
  2. Create and run the above program. Then change $priority to the string "High" (the quotes are required) and rerun the program.
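The difference between the two kinds of quotes can be seen side by side. The following is a small sketch rather than part of priority.pl; it adds use strict, use warnings and my declarations, which the tutorial's own scripts leave out, and the variable names $double and $single are made up for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A scalar holds a string just as easily as a number.
my $priority = "High";

# Double quotes: $priority is replaced by its value.
my $double = "Priority is $priority";

# Single quotes: the text is kept exactly as written.
my $single = 'Priority is $priority';

print "$double\n";   # prints: Priority is High
print "$single\n";   # prints: Priority is $priority
```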

Arrays

Arrays are just lists of scalars and are denoted by an @ or a %.
  1. Create a program called food.pl that defines a simple array as follows:
    @food = ("apples", "oranges", "fish", "meat");
  2. We access the array by using indices starting at 0; square brackets are used to specify the index. Add the following statement to your program:
    print "$food[3] \n";
    Notice the use of $ rather than @. $food[3] refers to a single element of @food and is therefore a scalar.
  3. An associative array (called a hash) allows you to associate key/value pairs. For example, try the following (fruit.pl):
    %colour = ( 'apple', 'red', 'banana', 'yellow');
    print $colour{'apple'};
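A few more ways of looking at the same array and hash are sketched below. This is an illustration under assumptions, not a continuation of food.pl or fruit.pl: it uses strict mode with my, writes the hash with the => form of the comma, and the names $count, $last_index, $last and @fruits are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @food = ("apples", "oranges", "fish", "meat");

my $count      = scalar(@food);   # number of elements in the array
my $last_index = $#food;          # index of the last element
my $last       = $food[-1];       # a negative index counts from the end

# => behaves like a comma but makes key/value pairs easier to read.
my %colour = ('apple' => 'red', 'banana' => 'yellow');
my @fruits = sort keys %colour;   # keys come back unordered, so sort them

print "$count items, last index $last_index, last item $last\n";
print "$_ is $colour{$_}\n" foreach @fruits;
```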

The $_ Special Variable

Perl has a special variable called $_. This is a scalar variable that is a default variable for many Perl operations and is very heavily used. It can be explicitly defined or can be implied. We can see this more readily in our next section on substitution and translation.
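As a quick illustration of the implied use of $_, the sketch below loops over a list without naming a loop variable and matches a pattern without naming a target. The list contents and the @matched array are assumptions made for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @matched;

# No loop variable is named, so each item is placed in $_.
foreach ("apples", "oranges", "fish") {
    next unless /an/;     # the match is tested against $_
    push @matched, $_;    # here $_ is written out explicitly
}

print "@matched\n";       # prints: oranges
```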

Operators

  1. Here's a list of the numeric operators. Try a few examples:
    $a = 1 + 2;    #Add 1 and 2 and store in $a
    $a = 3 - 4;    #Subtract 4 from 3 and store in $a
    $a = 5 * 6;    #Multiply 5 and 6 and store in $a
    $a = 7 / 8;    #Divide 7 by 8 and store in $a
    $a = 9 ** 10;  #Nine to the power of 10
    $a = 5 % 2;    #Remainder of 5 divided by 2
    ++$a;          #Increment $a and then return it
    $a++;          #Return $a and then increment it
    --$a;          #Decrement $a and then return it
    $a--;          #Return $a and then decrement it
    $a = $b;       #Assign $b to $a
    $a += $b;      #Add $b to $a
    $a -= $b;      #Subtract $b from $a
    
  2. Try a few examples using the string operators:
    $a = $b . $c;  #Concatenate $b and $c
    $a = $b x $c;  #$b repeated $c times
    $a .= $b;      #Append $b onto $a
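The difference between the prefix and postfix forms, and the effect of the string operators, is easiest to see with concrete values. The sketch below is one possible set of experiments; all variable names are chosen for the illustration, and it uses strict mode, which the tutorial's snippets do not.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $n    = 5;
my $pre  = ++$n;     # $n becomes 6 first, then 6 is returned
my $post = $n++;     # 6 is returned first, then $n becomes 7

my $power     = 2 ** 10;   # 1024
my $remainder = 7 % 3;     # 1

my $rule     = "-" x 10;               # ten dashes
my $greeting = "Hello" . ", World";    # concatenation
$greeting   .= "!";                    # append in place

print "pre=$pre post=$post n=$n\n";
print "power=$power remainder=$remainder\n";
print "$rule\n$greeting\n";
```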
    

Control Structures

Perl supports the standard for, if, foreach and while structures that we see in languages like C, C++, and Pascal. The structure name is followed by an expression or set of conditions contained in ( ) brackets and finally the set of statements we want to execute contained in { } brackets.
  1. Create a simple Perl script called forloop.pl containing the following lines:
    #!/usr/bin/perl
    @colours = ('red', 'green', 'blue');
    for ($i = 0; $i <= $#colours; $i++) {
        print "$colours[$i]\n";
    }
    
  2. Here's a foreach statement (foreach.pl) that loops through the elements of a hash:
    #!/usr/bin/perl
    %fruitColours = ('apple','red','banana','yellow','lime','green');
    foreach $fruit ( keys(%fruitColours) ) {
       print "$fruit $fruitColours{$fruit}\n";
    }
    
  3. Extend the program to include an if statement (foreachv2.pl):
    #!/usr/bin/perl
    
    %fruitColours = ('apple','red','banana','yellow','lime','green');
    foreach $fruit ( keys(%fruitColours) ) {
       print "$fruit $fruitColours{$fruit}";
       if ( $fruitColours{$fruit} eq 'red' ) {
          print "- it\'s my favorite\!\n";
          last;
       } elsif ( $fruitColours{$fruit} eq 'green' ) {
          print "- not my favorite\n";
       } else {
          print "- I dislike it\n";
       }
    }
    
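    Note the use of last to break out of the loop. Because the keys of a hash come back in no particular order, the loop may stop before every fruit has been printed.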
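The while structure mentioned at the start of this section has the same shape as the others: a condition in ( ) followed by statements in { }. A minimal sketch, with a made-up countdown and a @log array that exists only for the illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $count = 3;
my @log;

# The block repeats for as long as the condition is true.
while ($count > 0) {
    push @log, $count;   # remember each value we pass through
    $count--;
}

print "@log\n";          # prints: 3 2 1
```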

Comparisons

  1. Values may be compared for numeric equality using the following relational operators:
    == equal and != not equal
    > greater than and >= greater than or equal
    < less than and <= less than or equal
    
  2. Try the following (comp1.pl):
    $x=10;
    print "\$x > 9 \n" if ( $x > 9 ); 
    
  3. Strings may be compared using the following:
    eq equal and ne not equal to
    gt greater than and lt less than
    
    Add the following to your test program:
    print "\$x lt 9\n" if ( $x lt 9 ); 
    
  4. A common mistake is to use the single = rather than the double ==. Try the following:
    print "\$x=11\n" if ( $x = 11 );
    
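    Here = is an assignment, not a comparison. $x is assigned the value 11, and the assignment expression returns that value. Because 11 is nonzero it counts as true, so the print statement runs whatever $x held before.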
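The steps above give different answers for the same value because the numeric operators compare numbers while the string operators compare text character by character. The sketch below puts the two side by side; the result variables are named only for the illustration, and <=> and cmp are Perl's three-way numeric and string comparison operators.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $x = 10;

# Numeric comparison: 10 is greater than 9.
my $numeric = ($x > 9)  ? "true" : "false";

# String comparison: "10" sorts before "9" because "1" comes before "9".
my $string  = ($x gt 9) ? "true" : "false";

# Three-way forms return -1, 0 or 1.
my $num_order = (10 <=> 9);   #  1
my $str_order = (10 cmp 9);   # -1

print "\$x > 9 is $numeric, \$x gt 9 is $string\n";
print "<=> gives $num_order, cmp gives $str_order\n";
```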

Input/Output

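  1. Perl has a line-input operator, <FILEHANDLE>, that returns one line of input each time it is evaluated. When it is the only thing in a while condition, the line is stored in the default variable $_. With no filehandle, <> reads from the files named on the command line or, if there are none, from STDIN. Create the following program called whoist.pl: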
    #!/usr/bin/perl
    while(<>) {
          if(/ist/) {
              print $_;
          }
    };
    
  2. Test your program with the output from the who command:
    who | ./whoist.pl
    
  3. We can redirect the output from print using the open statement. Try the following example (whoistv2.pl):
    #!/usr/bin/perl
    open(IST, ">istlogons");
    while(<>) {
           if(/ist/) {
              print IST $_;
           };
    }
    close IST;
    
    If we had used >> instead of a single > we would be opening a file for appending.
  4. The open statement may also be used for input. Try the following (whoistv3.pl):
    #!/usr/bin/perl
    open(IST, "<istlogons") or die "Cannot open input file istlogons";
    @input=<IST>;
    print @input;
    close IST;
    
    Note the use of or in the open statement. If the open fails, or forces evaluation of the right-hand side, and die prints the message and stops the program.
  5. Another way to read data from a file is to run the Unix cat command inside backquotes, which capture the command's output, e.g.:
    @input = `cat istlogons`;
    
    Change your program to use this technique.
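The write-then-read cycle from the steps above can also be tried in one self-contained script. The sketch below is a variation rather than the tutorial's method: it uses the three-argument form of open with lexical filehandles, and it writes to a temporary file created with the core File::Temp module instead of istlogons. The sample lines are invented.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file that is removed when the program exits.
my ($out, $path) = tempfile(UNLINK => 1);
print $out "ist01   pts/0\n", "jsmith  pts/1\n", "ist02   pts/2\n";
close $out or die "Cannot close $path: $!";

# Reopen the file for input and keep only the lines containing ist.
open(my $in, '<', $path) or die "Cannot open $path: $!";
my @matches;
while (my $line = <$in>) {
    push @matches, $line if $line =~ /ist/;
}
close $in;

print @matches;
```

$! holds the operating system's reason for a failure, which makes the die message more useful.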

Regular Expressions

  1. Perl has a powerful and complex collection of pattern matching capabilities. Try the following examples (patterns.pl):
    #!/usr/bin/perl
    $name = "SamSlate\@ist.uwaterloo.ca";
    
    # simple pattern
    if ( $name =~ /Sam/ ) { print "Name contains Sam\n"};
    
    Note that the binding operator =~ applies the pattern to the variable on its left rather than to the default variable $_.
  2. Add the following lines to your program.
    # case insensitive
    if ( $name =~ /sam/i ) { print "Name contains sam\n"};
    
    # alternative characters
    if ( $name =~ /[Ss]am/ ) { print "Name contains Sam or sam\n"};
    
    # starts with
    if ( $name =~ /^S/ ) { print "Name begins with an S\n"};
    
    # does not match
    if ( $name !~ /^S/ ) { print "Name does not begin with an S\n"};
    
    # return matching value 
    # Brackets may be used to return values that match the enclosed patterns.
    if ( $name =~ /(.*)\@(.*)/ ) { print "Name is $1 domain $2\n"};
    
    # Note the dot . matches any character and the * repeats the match 0 or 
    # more times. Use + for 1 or more times. 
    # We can also match alternate patterns using the or operator |. 
    # If $name starts with either Sam or Jack, then:
    
    if ( $name =~ /(SamSlate|Jack.*)\@(.*)/ ) { print "Name is $1 domain $2\n"};
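Bracketed groups can also be collected directly into variables. The sketch below is an alternative to reading $1 and $2, assuming strict mode; $user and $domain are names chosen for the illustration. A match in list context returns the captured groups, and the ^ and $ anchors make the pattern cover the whole string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $name = "SamSlate\@ist.uwaterloo.ca";

# [^\@]+ means one or more characters that are not @.
# In list context the match returns what each ( ) group captured.
my ($user, $domain) = $name =~ /^([^\@]+)\@(.+)$/;

if (defined $user) {
    print "Name is $user domain $domain\n";
} else {
    print "Not an address\n";
}
```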
    

Substitution and Translation

The substitution operation is introduced with the letter s. Its syntax is as follows: s/target/replacement/modifiers; Now let's look at an example that you can try.
  1. Create the following program (sub.pl):
    #!/usr/bin/perl
    $_ =  "pattern matching is a black art";
    s/black art/confusing/;
    print "$_ \n";
    
    Note that the substitution operator is operating on the default variable $_.
  2. Now replace the $_ variable with the $sentence variable. Note that since we are not using the default variable, we need to use the pattern match operator =~ to associate the variable with the pattern (subv2.pl).
    #!/usr/bin/perl
    $sentence =  "pattern matching is a black art";
    $sentence =~ s/black art/confusing/;
    print "$sentence \n"; 
    
  3. If we want to change all instances of the pattern, we simply add the global g option to the substitution operation. For example, try (subv3.pl)
    $sentence =~ s/at/rat/g;
    
  4. There is also the i option, for ignore case. Use it if you want to change any instance of the pattern, whether it is in upper or lower case. Give this a try.
  5. Translation is a specialized form of the substitution operation. It consists of a systematic replacement of one character by another, wherever it occurs. Let's take our previous example program and change all the characters to lowercase.
    #!/usr/bin/perl
    $_ =  "Pattern matching is a Black Art";
    tr/A-Z/a-z/;
    print "$_ \n";
    
    Run it and see what happens.
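The options described above can be checked by looking at what the substitution returns: in Perl, s/// gives back the number of replacements it made. The sketch below reuses the tutorial's sentence; $title, $count, $hits and $lower are names introduced for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# g option: replace every occurrence and count the replacements.
my $sentence = "pattern matching is a black art";
my $count = ($sentence =~ s/at/rat/g);
print "$count changes: $sentence\n";

# i option: the lowercase pattern still matches the capitalized text.
my $title = "Pattern Matching Is A Black Art";
my $hits = ($title =~ s/black art/confusing/i);
print "$hits change: $title\n";

# Translation applied to a copy, leaving $title untouched.
(my $lower = $title) =~ tr/A-Z/a-z/;
print "$lower\n";
```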

Functions

  1. You'll find a large collection of functions described in the Perl reference manuals. The following uses chop to remove the last character of each input line (the newline) and split to separate the comma-separated fields of that line (functions.pl).
    while (<>) {
      chop;   # avoid \n on last field
      @array = split(/,/);
             ...
      }
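To see what split produces without typing input, the same steps can be run on a fixed string. The sketch below is an illustration with invented data. It uses chomp in place of chop: chomp removes only a trailing newline, whereas chop removes the last character whatever it is. The join function glues the fields back together.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "apples,oranges,fish\n";

chomp($line);                          # drop the trailing newline, if any
my @fields = split(/,/, $line);        # one element per comma-separated field
my $joined = join(" | ", @fields);     # rebuild with a different separator

print scalar(@fields), " fields: $joined\n";
```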
    

Subroutines

  1. Create a program to pick out the first word of each line (subroutines.pl):
    #!/usr/bin/perl
    while ( <> ) {
       &GetWord;
       print "$firstword \n";
    }
    sub GetWord {
        ($firstword, $otherwords) = split(/ /,$_);
        return;
    } 
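The program above passes data through the global variables $_ and $firstword. A subroutine can also receive arguments in the special array @_ and hand back a result with return. The sketch below shows that style as an alternative, not as the tutorial's version; first_word is a hypothetical name and the input string is invented.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Arguments arrive in @_; copy the first one into a local variable.
sub first_word {
    my ($line) = @_;
    my ($first) = split(/ /, $line);   # keep only the first piece
    return $first;
}

my $word = first_word("hello wide world");
print "$word\n";                       # prints: hello
```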
    

Web Form Processing with Perl

Now that we have a basic overview of Perl, we'll start using it to process web forms. Perl is well suited to web form processing and has become a popular language for the task.
  1. Start by creating, and changing your directory to, public_html. This is the standard location for a personal web space.
    mkdir public_html
    cd public_html 
    
  2. Create a simple form called firstform.html to welcome the user:
    <html>
    <body>
    <h1>This is My Test Form</h1>
    <FORM METHOD=GET ACTION="cgi-bin/firstform.pl">
    What is your name?
    <INPUT TYPE="text" NAME="name"><BR>
    <INPUT TYPE="submit" VALUE="Submit">
    <INPUT TYPE="reset" VALUE="Reset">
    </FORM>
    </body>
    </html>
    
  3. Make sure the file is readable by others:
    chmod o+r firstform.html
    
  4. Note the ACTION, specifying the name of the program to be executed when the user presses SUBMIT. Test your form, realizing that the SUBMIT button will not yet be working.
  5. Now create a subdirectory called cgi-bin. This is the standard location of form processing scripts. Make sure the directory is executable by others.
    mkdir cgi-bin
    chmod o+x cgi-bin 
    
  6. In the cgi-bin directory, create a file called .htaccess containing the following line, which directs the Web server to execute, rather than simply display, the files contained in that directory.
    SetHandler cgi-script 
    
  7. Create a script called firstform.pl in the cgi-bin directory.
    #!/usr/bin/perl
    push (@INC, "/software/www_server/data/cgi-bin");
    require "cgi-lib.pl";
    &ReadParse(*input);
    print "Content-type: text/html\n\n";
    print "Hello, ", $input{"name"}, "\n"; 
    
  8. This program contains some mysterious-looking statements that will be present in almost every CGI Perl program.
    push (@INC, "/software/www_server/data/cgi-bin");
    - Add the cgi-bin directory to the list of directories that Perl searches for libraries.
    require "cgi-lib.pl";
    - Include the routines found in this library.
    &ReadParse(*input);
    - Execute the ReadParse routine, creating an associative array called %input that holds the values passed to the program.
    print "Content-type: text/html\n\n";
    - Print the Content-type header, followed by the blank line that separates the headers from the page content.
    print "Hello, ", $input{"name"}, "\n";
    - Print the value associated with the "name" parameter.
    
  9. Here's a second example, called survey.html:
    <html>
    <body>
    <h1>Course Survey</h1>
    <FORM METHOD=GET ACTION="cgi-bin/survey.pl">
    What is your name?
    <INPUT TYPE="text" NAME="name"><BR>
    What was your evaluation of the course?
    <SELECT NAME="evaluation">
    <OPTION SELECTED>good
    <OPTION>satisfactory
    <OPTION>poor
    </SELECT><BR>
    <INPUT TYPE="submit" VALUE="Submit">
    <INPUT TYPE="reset" VALUE="Reset">
    </FORM>
    </body>
    </html>
    
  10. This survey.pl script sends an email containing the results of the survey.
    #!/usr/bin/perl
    push (@INC, "/software/www_server/data/cgi-bin");
    require "cgi-lib.pl";
    &ReadParse(*input);
    print "Content-type: text/html\n\n";
    $recipient = "ist01\@watserv1";
    $mailprog = '/usr/bin/mail';
    open (MAIL, "|$mailprog $recipient");
    print MAIL "Subject: Survey Using PERL to Process Your Web Form\n";
    print MAIL "---------------------------------------------------\n";
    foreach $var ( keys(%input) ) {
       print MAIL "$var : $input{$var} \n";
    }
    close MAIL;
    
    Notice that the program will pick up whatever fields are defined on the form. The program is available in /u/ist01/public_html/cgi-bin.
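To get a feel for what a routine like ReadParse has to do, it helps to decode a form submission by hand. With METHOD=GET the browser sends the fields as a single string of name=value pairs joined by &, with spaces written as + and other special characters written as % followed by two hex digits. The sketch below is a simplified illustration and not the code in cgi-lib.pl; parse_query is a hypothetical helper and the sample string is invented.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn "a=1&b=2" into a hash of decoded values.
sub parse_query {
    my ($query) = @_;
    my %input;
    foreach my $pair (split(/&/, $query)) {
        my ($key, $value) = split(/=/, $pair, 2);
        $value = '' unless defined $value;
        # foreach aliases $_ to each variable, so these edits stick.
        foreach ($key, $value) {
            tr/+/ /;                               # + stands for a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %21 becomes !
        }
        $input{$key} = $value;
    }
    return %input;
}

# A real CGI script would read this string from $ENV{'QUERY_STRING'}.
my %input = parse_query("name=Sam+Slate&evaluation=good%21");

foreach my $var (sort keys %input) {
    print "$var : $input{$var}\n";
}
```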