First, read the introductory paragraphs in this Wikipedia article.
Perl is a handy scripting language for various purposes. For instance, a systems administrator may need to parse log files for a popular web server and get a list of IP addresses that have visited a web site. Or maybe you just copied and pasted a large webpage containing data into a text file and need a way to process the text, extracting key data and saving it in a new file. Perl lets you write scripts for these kinds of tasks much more quickly than you could in, say, Java.
As with Python, let's write a Perl script to say hello:
Run these commands in your terminal:
$ cd ~/MOUNTED/apcs-locker
$ mkdir -p final-project/perl
$ cd final-project/perl
$ touch HelloWorld.pl
$ gedit HelloWorld.pl &
Type the following into the file:
#!/usr/bin/perl
# this is a single-line comment
print "Hello, world!\n";
(Skip this step if in Rm124.) Now let's make the script executable:
$ chmod a+x HelloWorld.pl
Run the script:
$ ./HelloWorld.pl
Simple variables -- like strings and numerics -- are denoted by a dollar sign ($). Try this out in your HelloWorld.pl script:
$a = "hello";
$b = "fred";
print "$a, $b!\n";
$c = $a + $b;
print "$c\n"; #prints 0: '+' treats $a and $b as numbers, and non-numeric strings count as 0
$d = $a . ", " . $b . "!"; #the proper way to concatenate two strings
print "$d\n";
Now that you've seen strings, let's try numeric variables:
$e = 5;
$f = 16;
#which one's actually doing arithmetic?
print "$e - $f\n";
print $e - $f . "\n";
print $f / $e . "\n"; #support for floating point values!
Start a new script called Arrays.pl, and populate it with this starter code:
#!/usr/bin/perl
@arr = ( "apples", "bananas", "cookies" );
print $arr[0] . " " . $arr[2] . "\n"; #Java-like way to construct string
#for printing
print "$arr[0] $arr[2]\n"; #faster!
As you can see, arrays are indexed starting at zero like the other languages you've seen. Array declarations -- as in @arr = ... -- have the @ symbol before the name of the array. Why the $ when accessing an element, as in $arr[0]? Because each element in the array is a scalar value (think simple variable). In other words, when pulling a string or number out of an array, use $, as in $arr[1].
You say that cookies don't belong in this array?
$arr[2] = "cherries";
Both (a) the length of an array and (b) its last-used index can be obtained this way:
$arrLength = @arr;
print "length of arr is $arrLength\n";
print "last index in arr is $#arr\n";
for() loops
You have a few ways to iterate through an array to access its elements. Perl lets you simply print the array elements if you wish:
print "@arr\n";
Here's a for-each loop:
for $value (@arr) {
    print "$value\n";
}
And for the traditionalists:
for( $i = 0 ; $i <= $#arr ; $i++ ) {
    print "$arr[$i]\n";
}
Note the stopping condition above -- you want $i to reach and equal the last index, $#arr.
push() adds an element to the end of the array. Try this:
print "Before:\t@arr\n";
push(@arr, "donuts");
print "After:\t@arr\n";
See if you can figure out how pop() and shift() work:
# POP EXAMPLE ################################################
@friends = ( "Alice", "Bob", "Charlie", "Dino", "Ed", "Fred" );
print "@friends\n"; #starting state of @friends
$name1 = pop @friends;
print "$name1\n";
print "@friends\n\n\n";
# SHIFT EXAMPLE ##############################################
@enemies = ( "Vinny", "Will", "Xu", "Yan", "Zed" );
print "@enemies\n"; #starting state of @enemies
$name2 = shift @enemies;
print "$name2\n";
print "@enemies\n";
while() loops
for() loops were covered in the last section. Here's a while loop for you to try:
$i = 1;
while($i <= 10) {
    print "$i";
    if ( $i < 10 ) {
        print ", ";
    } else { #$i is 10
        print ".\n";
    }
    $i++;
}
if()-elsif()-else
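First, a quick note about booleans: There are no true/True and false/False boolean constants as you've seen in other languages. In Perl, it's common to use 0 for false and any other number (though typically 1) for true. More precisely, Perl treats 0, the string "0", the empty string, and undefined values as false; every other scalar value is true.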
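To see the rule in action, here is a minimal sketch that runs a handful of values through an if(). The function name truthiness and the list of sample values are made up for this illustration, and the script uses "use strict", "use warnings", and "my" (for declaring variables), none of which the examples in this tutorial rely on.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Report how Perl treats a value when it is used as a condition.
sub truthiness {
    my ($value) = @_;    # "my" declares a variable local to this function
    if ( $value ) {
        return "true";
    } else {
        return "false";
    }
}

# Sample values chosen for illustration: numbers first, then strings.
for my $value ( 0, 1, -5, "", "0", "0.0", "hello" ) {
    print "'$value' is ", truthiness($value), "\n";
}
```

Note that the string "0.0" counts as true even though it is numerically zero: only the exact string "0" and the empty string are false strings.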
Write a new Perl script called Conditionals.pl, populate as follows, and run:
#!/usr/bin/perl
print "Enter an integer (positive, negative, or zero): ";
$num = <STDIN>;
if ( $num < 0 ) {
    print "NEGATIVE\n";
} elsif ( $num > 0 ) {
    print "POSITIVE\n";
} else {
    print "PROBABLY ZERO\n";
}
See what happens when you enter a non-integer value, like muffin.
A function (or subroutine, as it is sometimes called) is denoted by the keyword sub. Try this example in a new script:
#!/usr/bin/perl
#return the larger of two incoming values
sub max {
    if ( $_[0] > $_[1] ) {
        return $_[0];
    }
    return $_[1];
}
$height = &max(73, 68);
print "The taller man is $height inches tall.\n";
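A few notes here:
The function's parameters are not listed in its declaration. (See the sub max line.)
Incoming values arrive in the array @_. The first incoming value is always $_[0]. In the example above, the second incoming value was $_[1].
Calling the max() function was done via &max(73, 68);.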
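Because $_[0] and $_[1] can be hard to read, a common habit is to copy the incoming values out of @_ into named variables on the first line of the function. Below is a sketch of the same max() function written that way. The names $first and $second are made up for this illustration, and "use strict", "use warnings", and "my" are additions that the example above does not use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the larger of two incoming values.
sub max {
    my ($first, $second) = @_;    # copy the incoming values out of @_
    if ( $first > $second ) {
        return $first;
    }
    return $second;
}

my $height = max(73, 68);    # the leading & is optional when parentheses are used
print "The taller man is $height inches tall.\n";
```

The behavior matches the earlier version; only the way the incoming values are named has changed.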
Let's make our program from the last section more robust, forcing the user to enter a valid integer. Make your program look like this and run it:
#!/usr/bin/perl
sub isInteger {
    $val = $_[0]; #value sent to this function
    if ( $val =~ /^-?\d+$/ ) { #a regular expression -- coming soon!
        return 1; #true
    } else {
        return 0; #false
    }
}
do {
    print "Enter an integer (positive, negative, or zero): ";
    $num = <STDIN>;
} while(! &isInteger($num) );
if ( $num < 0 ) {
    print "NEGATIVE\n";
} elsif ( $num > 0 ) {
    print "POSITIVE\n";
} else {
    print "PROBABLY ZERO\n";
}
To keep this tutorial from running too long, some topics will be demonstrated live. Those topics include
Here is a Java program to print strings in an array that appear to have 5-digit zipcodes:
public class FindZipCodes {
    public static String[] text = {
        "It was a pleasure meeting you today!",
        "Please contact me if you have any questions:",
        "1001 Cayuga Ave,",
        "San Francisco, CA 94112",
        "\n",
        "xoxoxo,",
        "MF" };
    public static void main(String[] args) {
        //print lines of text having 5 digits in a row:
        for(String str : text) {
            if( hasFiveDigits(str) ) {
                System.out.println( str );
            }
        }
    }
    public static boolean hasFiveDigits(String s) {
        for(int i = 0 ; i + 4 < s.length() ; i++) {
            if( Character.isDigit(s.charAt(i)) &&
                Character.isDigit(s.charAt(i + 1)) &&
                Character.isDigit(s.charAt(i + 2)) &&
                Character.isDigit(s.charAt(i + 3)) &&
                Character.isDigit(s.charAt(i + 4)) ) {
                return true;
            }
        }
        return false;
    }
}
Regular expressions allow us to describe a pattern like 5-digits-in-a-row much more succinctly:
import java.util.regex.Pattern;
public class FindZipCodesRegEx {
    public static String[] text = {
        "It was a pleasure meeting you today!",
        "Please contact me if you have any questions:",
        "1001 Cayuga Ave,",
        "San Francisco, CA 94112",
        "\n",
        "xoxoxo,",
        "MF" };
    public static void main(String[] args) {
        //print lines of text having 5 digits in a row:
        for(String str : text) {
            if( Pattern.matches(".*\\d{5}.*", str) ) {
                System.out.println( str );
            }
            /* alternatively:
            if( Pattern.matches(".*[0-9]{5}.*", str) ) {
                System.out.println( str );
            }
            */
        }
    }
}
Perl does regular expressions, but better:
#!/usr/bin/perl
@text = ("It was a pleasure meeting you today!",
         "Please contact me if you have any questions:",
         "1001 Cayuga Ave,",
         "San Francisco, CA 94112",
         "\n",
         "xoxoxo,",
         "MF");
for $s (@text) {
    if ( $s =~ /\d\d\d\d\d/ ) {
        print "$s\n";
    }
    # alternatively:
    # if ( $s =~ /\d{5}/ ) {
    #     print "$s\n";
    # }
}
Here are some simple regular expression examples. Save the following script into a file that we can run. We'll keep moving the exit statement down as we work.
#!/usr/bin/perl
$str1 = "heidi";
if( $str1 =~ /^.....$/ ) {
    print "true\n";
} else {
    print "false\n";
}
exit; #we'll move this statement down after each case we review.
if( $str1 =~ /^..$/ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str1 =~ /^../ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str1 =~ /^H/ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str1 =~ /^h/ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str1 =~ /^[Hh]/ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str1 =~ /^[Hh].i/ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str1 =~ /^[Hh].i/ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str1 =~ /i$/ ) {
    print "true\n";
} else {
    print "false\n";
}
$str2 = "Mississippi has lots of letters.";
if( $str2 =~ /iss/ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str2 =~ /iSs/ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str2 =~ /iSs/i ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str2 =~ /ississ/ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str2 =~ /z/ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str2 =~ /z*/ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str2 =~ /z+/ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str2 =~ /s{2}/ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str2 =~ /s{3}/ ) {
    print "true\n";
} else {
    print "false\n";
}
$str3 = "94112-8238";
if( $str3 =~ /\d\d\d\d/ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str3 =~ /^\d\d\d\d\d$/ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str3 =~ /^\d\d\d\d\d-/ ) {
    print "true\n";
} else {
    print "false\n";
}
if( $str3 =~ /^\d{5}-/ ) {
    print "true\n";
} else {
    print "false\n";
}
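The pieces from these cases (the ^ and $ anchors, \d, and the {n} repeat count) can be combined into one stricter pattern. Here is a sketch that accepts a 5-digit zipcode with an optional 4-digit extension and nothing else. The function name looksLikeZip is made up for this illustration, and the pattern uses parentheses for grouping, which the cases above do not cover. The ? after the group means "zero or one of these", just like the -? in isInteger.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return 1 if the whole string is a 5-digit zipcode, optionally
# followed by a dash and 4 more digits; return 0 otherwise.
sub looksLikeZip {
    my ($text) = @_;
    if ( $text =~ /^\d{5}(-\d{4})?$/ ) {
        return 1;    #true
    }
    return 0;        #false
}

# Sample values chosen for illustration.
for my $candidate ( "94112", "94112-8238", "9411", "94112-82", "San Francisco, CA 94112" ) {
    print "$candidate: ", looksLikeZip($candidate), "\n";
}
```

Because of the anchors, a line that merely contains a zipcode (like the last sample) is rejected; drop the ^ to accept zipcodes at the end of a longer string.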
The vi editor your teacher uses supports regex; examples: /, #, etc.
Web servers are programs running on servers that listen for requests from web browsers (and other programs). Each time you visit a URL, a web server serves up content (HTML, images, etc.) and simultaneously logs the activity.
Let's watch live what the Apache web server does when you visit web pages it's serving up. Click the link when asked: http://apcs02/. (tail -f time...)
Our goal is to write a Perl script that can report to us the various IP addresses that have visited a particular web page on the server.
Notes for your teacher:
$logFile and $url hardcoded
open(INPUT,"<$logFile") || die("..."); ... close INPUT;
while($line = <INPUT>) { print $line; }
$url
grep() -- show man perlfunc
grep, awk, sort
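As a rough preview of where the goal stated above is headed, here is a minimal sketch of such a script. It is not the version built in class. The sample log lines, the IP addresses, the function name visitorsOf, and the assumption that each line starts with the visitor's IP address and contains a quoted GET request are all made up for this illustration. To stay self-contained it reads the sample text through a filehandle instead of opening a real log file, and it uses a hash (%seen) and regex capture groups ($1, $2), which this tutorial has not covered.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for a real access log.
my $sampleLog = <<'END';
10.0.0.5 - - [12/May/2014:09:15:01 -0700] "GET /index.html HTTP/1.1" 200 1043
10.0.0.7 - - [12/May/2014:09:15:09 -0700] "GET /about.html HTTP/1.1" 200 512
10.0.0.5 - - [12/May/2014:09:16:30 -0700] "GET /index.html HTTP/1.1" 200 1043
10.0.0.9 - - [12/May/2014:09:17:02 -0700] "GET /index.html HTTP/1.1" 304 0
END

# Return the distinct IP addresses that requested $url, in first-seen order.
sub visitorsOf {
    my ($logText, $url) = @_;
    my %seen;        # IP addresses already collected
    my @visitors;
    # Opening a reference to a string lets us read it line by line like a file.
    open(my $input, "<", \$logText) || die("cannot read log text");
    while ( my $line = <$input> ) {
        # $1 captures the leading IP address, $2 the requested path.
        if ( $line =~ /^(\d+\.\d+\.\d+\.\d+) .*"GET (\S+) / ) {
            if ( $2 eq $url && !$seen{$1} ) {
                $seen{$1} = 1;
                push(@visitors, $1);
            }
        }
    }
    close($input);
    return @visitors;
}

my @ips = visitorsOf($sampleLog, "/index.html");
print "@ips\n";
```

To point the sketch at a real file, the open() line would take a file name in place of \$logText, as in the teacher's notes above.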