Getting Started with Perl

1. Introduction

First, read the introductory paragraphs of the Wikipedia article on Perl.

Perl is a handy scripting language for many purposes. For instance, a systems administrator may need to parse the log files of a popular web server to get a list of the IP addresses that have visited a web site. Or maybe you just copied and pasted a large webpage containing data into a text file and need a way to process the text, extracting key data and saving it to a new file. Perl lets you get these kinds of tasks done very quickly compared to, say, Java.

2. Hello, World!

As with Python, let's write a Perl script to say hello:

  1. Mount your locker.
  2. Run these commands in your terminal:

         
    $ cd ~/MOUNTED/apcs-locker
    $ mkdir -p final-project/perl
    $ cd final-project/perl
    $ touch HelloWorld.pl
    $ gedit HelloWorld.pl &
         
        
  3. Type the following into the file:

         
    #!/usr/bin/perl
    
    # this is a single-line comment
    
    print "Hello, world!\n";
         
        
  4. (Skip this step if in Rm124.) Now let's make the script executable: chmod a+x HelloWorld.pl

  5. Run the script: ./HelloWorld.pl

3. Variables

Simple variables -- like strings and numbers -- are denoted by a dollar sign ($). Try this out in your HelloWorld.pl script:

   
$a = "hello";
$b = "fred";
print "$a, $b!\n";

$c = $a + $b;
print "$c\n";                 #'+' treats $a and $b as numbers; non-numeric strings count as 0

$d = $a . ", " . $b . "!";    #the proper way to concatenate two strings
print "$d\n";
   
  

Now that you've seen strings, let's try numeric variables:

   
$e = 5;
$f = 16;

#which one's actually doing arithmetic?
print "$e - $f\n";
print $e - $f . "\n";

print $f / $e . "\n";   #support for floating point values!
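
The operator, not the variable, decides whether a value is treated as a number or as a string. Here is a minimal sketch of that idea; the helper names add_values and join_values are made up for this example, and `use strict`, `use warnings`, and `my` are additions this tutorial doesn't otherwise use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers, named only for this sketch.
sub add_values  { my ($x, $y) = @_; return $x + $y; }   # '+' works on numbers
sub join_values { my ($x, $y) = @_; return $x . $y; }   # '.' works on strings

print add_values("5", "16"), "\n";   # the strings are converted to numbers: 21
print join_values(5, 16), "\n";      # the numbers are converted to strings: 516
```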
   
  

4. Arrays

4.1. Array basics

Start a new script called Arrays.pl, and populate it with this starter code:

   
#!/usr/bin/perl

@arr = ( "apples", "bananas", "cookies" );

print $arr[0] . " " . $arr[2] . "\n"; #Java-like way to construct string 
                                      #for printing

print "$arr[0] $arr[2]\n";            #shorter: variables are interpolated inside double quotes
   
  

As you can see, arrays are indexed starting at zero, as in the other languages you've seen. Array declarations -- as in @arr = ... -- have the @ symbol before the name of the array. Why the $ when accessing an element, as in $arr[0]? Because each element in the array is a scalar value (think simple variable). In other words, when pulling a string or number out of an array, use $, as in $arr[1].

You say that cookies don't belong in this array?

   
$arr[2] = "cherries";
   
  

The (a) length of an array and (b) its last index can be obtained this way:

   
$arrLength = @arr;
print "length of arr is $arrLength\n";
print "last index in arr is $#arr\n";
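
The two values always differ by one, because indexing starts at zero. The following sketch spells that out; scalar() and the negative index -1 are standard Perl but are not used elsewhere in this tutorial, and the variable names are just for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @arr = ( "apples", "bananas", "cherries" );

my $length = scalar(@arr);   # number of elements: 3
my $last   = $#arr;          # last index: 2, always $length - 1

print "length: $length, last index: $last\n";
print "last element: ", $arr[$#arr], "\n";   # cherries
print "same element: ", $arr[-1], "\n";      # -1 counts from the end
```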
   
  

4.2. Array walks and for() loops

You have a few ways to iterate through an array to access its elements. Perl lets you simply print the array elements if you wish:

   
print "@arr\n";
   
  

Here's a for-each loop:

   
for $value (@arr) {
    print "$value\n";
}
   
  

And for the traditionalists:

   
for( $i = 0 ; $i <= $#arr ; $i++ ) {
    print "$arr[$i]\n";
}
   
  

Note the stopping condition above -- you want $i to reach and equal the last index, $#arr.
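
An equivalent way to write the same loop, if you prefer the stopping condition you're used to from Java, is to compare $i against the array's length. This is only a sketch; the @seen array is a hypothetical addition used to record what the loop visited.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @arr = ( "apples", "bananas", "cherries" );
my @seen;   # hypothetical: collects the elements the loop visits

# In a numeric comparison, @arr evaluates to its length (3 here),
# so "$i < @arr" stops at the same place as "$i <= $#arr".
for ( my $i = 0 ; $i < @arr ; $i++ ) {
    push @seen, $arr[$i];
    print "$i: $arr[$i]\n";
}
```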

4.3. Adding and removing elements

push() adds an element to the end of the array. Try this:

   
print "Before:\t@arr\n";
push(@arr, "donuts");
print "After:\t@arr\n";
   
  

See if you can figure out how pop() and shift() work:

   
# POP EXAMPLE ################################################
@friends = ( "Alice", "Bob", "Charlie", "Dino", "Ed", "Fred" );
print "@friends\n";       #starting state of @friends
$name1 = pop @friends;
print "$name1\n";
print "@friends\n\n\n";

# SHIFT EXAMPLE ##############################################
@enemies = ( "Vinny", "Will", "Xu", "Yan", "Zed" );
print "@enemies\n";       #starting state of @enemies
$name2 = shift @enemies;
print "$name2\n";
print "@enemies\n";
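
Once you've worked out pop() and shift(), this sketch puts them next to push() and unshift(), the fourth function in the family (not covered above). The array contents are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @queue = ( "b", "c" );

push @queue, "d";           # add at the end:     b c d
unshift @queue, "a";        # add at the front:   a b c d

my $last  = pop @queue;     # remove from the end:   returns "d"
my $first = shift @queue;   # remove from the front: returns "a"

print "first=$first last=$last rest=@queue\n";   # first=a last=d rest=b c
```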
   
  

5. while() Loops

for() loops were covered in the last section. Here's a while loop for you to try:

   
$i = 1;
while($i <= 10) {
    print "$i";

    if ( $i < 10 ) {
        print ", ";
    } else { #$i is 10
        print ".\n";
    }

    $i++;
}
   
  

6. Conditionals: if()-elsif()-else

First, a quick note about booleans: There are no true/True and false/False boolean constants as you've seen in other languages. In Perl, it's common to use 0 for false and any other number (though typically 1) for true.
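
Strings can be tested the same way: the empty string and the string "0" count as false, and other strings count as true. Here is a small sketch you can run to check a few values; the helper name truthiness is made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: report how Perl treats a value in a condition.
sub truthiness {
    my ($value) = @_;
    return $value ? "true" : "false";
}

# False: 0, "", "0".  True: everything else in this list, including "0.0".
for my $v ( 0, 1, -7, "", "0", "0.0", "hello" ) {
    print "'$v' is ", truthiness($v), "\n";
}
```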

Write a new Perl script called Conditionals.pl, populate it as follows, and run it:

   
#!/usr/bin/perl

print "Enter an integer (positive, negative, or zero):  ";
$num = <STDIN>;

if ( $num < 0 ) {
    print "NEGATIVE\n";
} elsif ( $num > 0 ) {
    print "POSITIVE\n";
} else {
    print "PROBABLY ZERO\n";
}
   
  

See what happens when you enter a non-integer value, like muffin.
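
After you've tried it, the sketch below shows the reason: in a numeric comparison, a string that doesn't start with a number is treated as 0, and a string that starts with one keeps only that leading number. The function name classify and the sample inputs are assumptions for this example; chomp (which removes the trailing newline that <STDIN> keeps) is standard Perl but isn't used in the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper wrapping the same if/elsif/else logic.
sub classify {
    my ($input) = @_;
    chomp $input;             # drop the trailing newline kept by <STDIN>
    no warnings 'numeric';    # this demo compares non-numeric strings on purpose

    return "NEGATIVE" if $input < 0;
    return "POSITIVE" if $input > 0;
    return "PROBABLY ZERO";
}

for my $line ( "-4\n", "12\n", "muffin\n", "3 muffins\n" ) {
    print classify($line), "\n";
}
```

With these inputs, "muffin" lands in the else branch and "3 muffins" is reported as positive, which is why the last branch says PROBABLY.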

7. Functions

A function (or subroutine, as it's sometimes called) is defined with the keyword sub. Try this example in a new script:

   
#!/usr/bin/perl

#return the larger of two incoming values
sub max {
    if ( $_[0] > $_[1] ) {
        return $_[0];
    }
    return $_[1];
}

$height = &max(73, 68);
print "The taller man is $height inches tall.\n";
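
Arguments arrive in the special array @_, which is why the function above reads $_[0] and $_[1]. A common alternative, sketched here, is to copy them into named variables first. The name max_of_two and the use of `my` are additions for this example; note also that the & before the function name is optional when you call it with parentheses.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# return the larger of two incoming values, using named copies of the arguments
sub max_of_two {
    my ( $first, $second ) = @_;   # unpack @_ into local variables
    return $first > $second ? $first : $second;
}

my $height = max_of_two( 73, 68 );   # no & needed here
print "The taller man is $height inches tall.\n";
```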
   
  

A few notes here:

Let's make our program from the last section more robust, forcing the user to enter a valid integer. Make your program look like this and run it:

   
#!/usr/bin/perl

sub isInteger {

    $val = $_[0];  #value sent to this function

    if ( $val =~ /^-?\d+$/ ) {  #a regular expression -- coming soon!
        return 1; #true
    } else {
        return 0; #false
    }
}

do {
    print "Enter an integer (positive, negative, or zero):  ";
    $num = <STDIN>;
} while(! &isInteger($num) );

if ( $num < 0 ) {
    print "NEGATIVE\n";
} elsif ( $num > 0 ) {
    print "POSITIVE\n";
} else {
    print "PROBABLY ZERO\n";
}
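
As a preview of that regular expression: ^ anchors the match to the start of the string, -? allows an optional minus sign, \d+ requires one or more digits, and $ anchors the match to the end (just before a trailing newline, if there is one). The sketch below runs the same pattern against a few made-up inputs; the name is_integer and the test values are assumptions for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same pattern as isInteger above, written with a local variable.
sub is_integer {
    my ($val) = @_;
    return $val =~ /^-?\d+$/ ? 1 : 0;
}

for my $candidate ( "42", "-7", "42\n", "4.2", "muffin", "", "- 7" ) {
    ( my $shown = $candidate ) =~ s/\n/\\n/;   # make the newline visible when printing
    print "'$shown' => ", is_integer($candidate), "\n";
}
```

Only the first three inputs are accepted. "42\n" passes because $ also matches just before a final newline, which is why the original script works without removing the newline from the user's input.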
   
  

8. Live Demos

8.1. Topics covered

At the risk of making this tutorial too long, some topics will be live-demonstrated. Those topics include

8.2. Demonstrations

8.2.1. Motivating RegEx (regular expressions) in Java

Here is a Java program to print strings in an array that appear to contain 5-digit ZIP codes:

   
public class FindZipCodes {

    public static String[] text = {
        "It was a pleasure meeting you today!",
        "Please contact me if you have any questions:",
        "1001 Cayuga Ave,",
        "San Francisco, CA 94112",
        "\n",
        "xoxoxo,",
        "MF" };

    public static void main(String[] args) {

        //print lines of text having 5 digits in a row:
        for(String str : text) {
            if( hasFiveDigits(str) ) {
                System.out.println( str );
            }
        }

    }

    public static boolean hasFiveDigits(String s) {

        for(int i = 0 ; i + 4 < s.length() ; i++) {
            if( Character.isDigit(s.charAt(i)) &&
                Character.isDigit(s.charAt(i + 1)) &&
                Character.isDigit(s.charAt(i + 2)) &&
                Character.isDigit(s.charAt(i + 3)) &&
                Character.isDigit(s.charAt(i + 4)) ) {
                return true;
            }
        }

        return false;
    }

}
   
  

Regular expressions allow us to describe a pattern like 5-digits-in-a-row much more succinctly:

   
import java.util.regex.Pattern;

public class FindZipCodesRegEx {

    public static String[] text = {
        "It was a pleasure meeting you today!",
        "Please contact me if you have any questions:",
        "1001 Cayuga Ave,",
        "San Francisco, CA 94112",
        "\n",
        "xoxoxo,",
        "MF" };

    public static void main(String[] args) {

        //print lines of text having 5 digits in a row:
        for(String str : text) {
            if( Pattern.matches(".*\\d{5}.*", str) ) {
                System.out.println( str );
            }

            /* alternatively:
            if( Pattern.matches(".*[0-9]{5}.*", str) ) {
                System.out.println( str );
            } 
            */
        }

    }

}
   
  

Perl does regular expressions, but better:

   
#!/usr/bin/perl

@text = ("It was a pleasure meeting you today!",
    "Please contact me if you have any questions:",
    "1001 Cayuga Ave,",
    "San Francisco, CA 94112",
    "\n",
    "xoxoxo,",
    "MF");

for $s (@text) {
    if ( $s =~ /\d\d\d\d\d/ ) {
        print "$s\n";
    }

    # alternatively:
    # if ( $s =~ /\d{5}/ ) {
    #   print "$s\n";
    # }
}
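
Perl can also hand back the text that matched, not just a yes or no. Wrapping part of a pattern in parentheses captures it into $1. The following is a sketch under that assumption; the function name find_zip is made up, and the sample strings are taken from the array above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the first run of 5 digits in a line, or undef.
sub find_zip {
    my ($line) = @_;
    return $line =~ /(\d{5})/ ? $1 : undef;   # parentheses capture into $1
}

my @text = ( "1001 Cayuga Ave,", "San Francisco, CA 94112", "xoxoxo," );

for my $s (@text) {
    my $zip = find_zip($s);
    print "$zip\n" if defined $zip;   # prints only 94112
}
```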
   
  

8.2.2. RegEx examples

Here are some simple regular expression examples. Save the following script into a file that we can run. We'll keep moving the exit statement down as we work.

   
#!/usr/bin/perl

$str1 = "heidi";

if( $str1 =~ /^.....$/ ) {
    print "true\n";
} else {
    print "false\n";
}

exit; #we'll move this statement down after each case we review.

if( $str1 =~ /^..$/ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str1 =~ /^../ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str1 =~ /^H/ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str1 =~ /^h/ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str1 =~ /^[Hh]/ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str1 =~ /^[Hh].i/ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str1 =~ /^[Hh].i/ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str1 =~ /i$/ ) {
    print "true\n";
} else {
    print "false\n";
}

$str2 = "Mississippi has lots of letters.";

if( $str2 =~ /iss/ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str2 =~ /iSs/ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str2 =~ /iSs/i ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str2 =~ /ississ/ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str2 =~ /z/ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str2 =~ /z*/ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str2 =~ /z+/ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str2 =~ /s{2}/ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str2 =~ /s{3}/ ) {
    print "true\n";
} else {
    print "false\n";
}

$str3 = "94112-8238";

if( $str3 =~ /\d\d\d\d/ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str3 =~ /^\d\d\d\d\d$/ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str3 =~ /^\d\d\d\d\d-/ ) {
    print "true\n";
} else {
    print "false\n";
}

if( $str3 =~ /^\d{5}-/ ) {
    print "true\n";
} else {
    print "false\n";
}
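
With this many cases, it can be hard to tell which true or false belongs to which pattern. One option, sketched below, is to keep the patterns in an array and print each one next to its result. The helper name matches and the selection of patterns are choices made for this example, not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: does $str match the pattern given as a string?
sub matches {
    my ( $str, $pattern ) = @_;
    return $str =~ /$pattern/ ? "true" : "false";
}

my $str2     = "Mississippi has lots of letters.";
my @patterns = ( 'iss', 'iSs', 'z', 'z*', 'z+', 's{2}', 's{3}' );

for my $p (@patterns) {
    print "/$p/\t", matches( $str2, $p ), "\n";
}
```

Note that /z*/ is true even though the string has no z, because * accepts zero occurrences.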
   
  

8.2.3. vi is the best editor ever

The vi editor your teacher uses supports regex; examples:

8.2.4. Writing a script to parse web logs

Web servers are programs running on servers that listen for requests from web browsers (and other programs). Each time you visit a URL, a web server serves up content (HTML, images, etc.) and simultaneously logs the activity.

Let's watch live what the Apache web server does when you visit web pages it's serving up. Click the link when asked: http://apcs02/. (tail -f time...)

Our goal is to write a Perl script that can report to us the various IP addresses that have visited a particular web page on the server.
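
As a rough starting point, here is one way such a script might look. It is a sketch under several assumptions: the log lines are in Apache's Common Log Format, the sample lines and IP addresses are made up (they are not real traffic from our server), the function name visitors_of is hypothetical, and it uses a hash (%hits), a data structure this tutorial hasn't covered yet.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample lines in Apache's Common Log Format (not real traffic).
my @log_lines = (
    '192.0.2.10 - - [01/Jan/2024:10:00:00 -0800] "GET /index.html HTTP/1.1" 200 512',
    '192.0.2.11 - - [01/Jan/2024:10:00:05 -0800] "GET /about.html HTTP/1.1" 200 734',
    '192.0.2.10 - - [01/Jan/2024:10:01:00 -0800] "GET /index.html HTTP/1.1" 200 512',
    '198.51.100.7 - - [01/Jan/2024:10:02:00 -0800] "GET /index.html HTTP/1.1" 304 0',
);

# Hypothetical helper: count visits to $page per IP address.
sub visitors_of {
    my ( $page, @lines ) = @_;
    my %hits;    # maps IP address => number of requests for $page

    for my $line (@lines) {
        # $1 = first field (the client's IP), $2 = the requested path
        if ( $line =~ /^(\S+) .*"GET (\S+) / ) {
            $hits{$1}++ if $2 eq $page;
        }
    }
    return %hits;
}

my %hits = visitors_of( "/index.html", @log_lines );
for my $ip ( sort keys %hits ) {
    print "$ip\t$hits{$ip}\n";
}
```

A real script would read the lines from the server's log file instead of an array; the file's location depends on how the server is configured.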

Notes for your teacher: