Computer Science Canada [Perl5-tut] Perl5 Primer
Author: wtd [ Thu Dec 02, 2004 12:47 am ]
Post subject: [Perl5-tut] Perl5 Primer
Goal
The goal of this tutorial is to introduce Perl5 to programmers who already have experience in another programming language.
What is Perl?
Depending on who you ask, Perl is the "Practical Extraction and Report Language" or the "Pathologically Eclectic Rubbish Lister". Either way, it's a powerful programming language that's installed on pretty much every Unix or Unix-like system and is available for just about any platform you can imagine. Further, despite not having a large company like Sun or Microsoft strongly advocating it, Perl has become incredibly popular. Thanks to the amount of work already done in Perl, and the number of people continuing to work on it, it's also a well-known name that carries a lot of weight on a resume.
A first program
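A minimal sketch of such a program, assuming the usual strict-and-warnings preamble:

```perl
use strict;    # refuse undeclared variables and other sloppy constructs
use warnings;  # report questionable code
print "Hello, world!\n";
```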
The first two lines turn on strict mode, which makes it easier to avoid errors, and enable warnings, which let us know when we're doing things that could be dangerous. As in any language, warnings should not be ignored. The third line simply prints "Hello, world!" to the screen.
Comments
A comment in Perl is anything following a #, up to the end of the line.
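For instance (the variable name here is only illustrative):

```perl
use strict;
use warnings;

# This whole line is a comment.
my $answer = 42;  # A comment can also follow code on the same line.
print $answer, "\n";
```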
Variables
Variables in Perl5 in strict mode are "declared" with the keyword "my".
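A short sketch of a few declarations, with made-up variable names:

```perl
use strict;
use warnings;

my $name;              # declared but not yet assigned (its value is undef)
my $count = 3;         # declared and initialized in one step
my ($x, $y) = (1, 2);  # several variables declared at once
```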
There are only four types of variables in Perl, and they aren't what you're probably used to.
Scalars
Scalars are either numbers or strings. The language goes back and forth easily between the two. String concatenation with a number and a string...
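Something along these lines, using the . operator and hypothetical variable names:

```perl
use strict;
use warnings;

my $version = 5;
my $name = "Perl" . $version;  # the number is converted to a string
print $name, "\n";             # prints "Perl5"
```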
Similarly, two numbers...
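A sketch with two numeric scalars (names are illustrative), which also shows the result going straight back to being a number:

```perl
use strict;
use warnings;

my $first  = 4;
my $second = 2;
my $joined = $first . $second;  # both numbers become strings: "42"
my $next   = $joined + 1;       # the string is used as a number again: 43
```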
Mathematical operators work much the same way.
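For example, numeric strings can be used directly in arithmetic (a small sketch, not an exhaustive list of operators):

```perl
use strict;
use warnings;

my $sum      = "3" + 4;    # the string "3" is treated as the number 3: 7
my $product  = "6" * "7";  # both strings are converted: 42
my $quotient = 10 / 4;     # division is not truncated: 2.5
```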
To repeat a string, use the "x" operator rather than "*".
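A quick sketch of the difference:

```perl
use strict;
use warnings;

my $rule    = "-" x 10;  # string repetition: "----------"
my $echo    = "ab" x 3;  # "ababab"
my $product = 3 * 4;     # * is always numeric multiplication: 12
```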
Arrays
The second kind of variable is an array. Array variables are denoted with a leading @, just as scalars are indicated by a $. Each element in the array is a scalar. An array can contain any number of elements. Array indices start at zero. A simple array:
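For instance, with an illustrative @numbers array:

```perl
use strict;
use warnings;

my @numbers = (1, 2, 3, 4);  # a list of four scalars
print "@numbers\n";          # prints "1 2 3 4"
```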
Accessing an individual element of the array is done like so:
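A sketch, again assuming a hypothetical @numbers array:

```perl
use strict;
use warnings;

my @numbers = (1, 2, 3, 4);
my $first = $numbers[0];   # indices start at zero
my $third = $numbers[2];
my $last  = $numbers[-1];  # negative indices count from the end
```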
The @ switches to a $ since the thing being accessed really is a scalar. Inserting a value into an array is similarly simple, and you can insert at any index.
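Roughly like this (the indices are chosen only for illustration):

```perl
use strict;
use warnings;

my @numbers = (1, 2, 3, 4);
$numbers[1] = 42;  # replace an existing element
$numbers[6] = 7;   # assign past the end: the array grows to fit,
                   # and the skipped slots (4 and 5) are undef
```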
Getting the length of an array is a matter of:
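One way to write it, assuming the same kind of @numbers array:

```perl
use strict;
use warnings;

my @numbers = (1, 2, 3, 4);
my $length     = scalar(@numbers);  # an array in scalar context gives its length
my $same       = @numbers;          # assigning to a scalar has the same effect
my $last_index = $#numbers;         # index of the last element: length - 1
```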
Hashes
The third type of variable in Perl is the hash, or associative array. It's kind of like an array, but it uses a scalar as an index. Also, hash variables are prefixed with %, and {} are used instead of square brackets when accessing elements.
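A small sketch, with an invented %ages hash:

```perl
use strict;
use warnings;

my %ages = (alice => 30, bob => 25);  # key => value pairs
my $alice_age = $ages{alice};         # look up a value by key: 30
$ages{carol} = 41;                    # add a new key
```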
References
References are a fancy twist on scalars, and can go anywhere a scalar can. Instead of a number, or string, though, they hold references to other variables (scalars, arrays, hashes). The \ operator gets a reference.
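For example, taking a reference to a scalar (variable names are illustrative):

```perl
use strict;
use warnings;

my $x   = 42;
my $ref = \$x;          # $ref now refers to $x
print ref($ref), "\n";  # prints "SCALAR"
```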
Of course, we can also get references to arrays and hashes with \...
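A sketch with a hypothetical array and hash:

```perl
use strict;
use warnings;

my @numbers = (1, 2, 3);
my %ages    = (alice => 30);
my $array_ref = \@numbers;  # reference to the existing array
my $hash_ref  = \%ages;     # reference to the existing hash
```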
But, it's easier if we just directly create the reference using different brackets.
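That is, square brackets build an anonymous array and curly braces build an anonymous hash, each yielding a reference (names below are made up):

```perl
use strict;
use warnings;

my $array_ref = [1, 2, 3];                   # reference to a new anonymous array
my $hash_ref  = { alice => 30, bob => 25 };  # reference to a new anonymous hash
```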
Dereferencing a reference is easy.
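A sketch of the fully bracketed form, with illustrative names:

```perl
use strict;
use warnings;

my $x          = 42;
my $scalar_ref = \$x;
my $array_ref  = [1, 2, 3];
my $hash_ref   = { alice => 30 };

my $value = ${$scalar_ref};  # the scalar it points to: 42
my @list  = @{$array_ref};   # a copy of the whole array
my %table = %{$hash_ref};    # a copy of the whole hash
```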
For simple examples like these, we can elide the brackets.
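So the same dereferences can be written more briefly (same hypothetical names as before):

```perl
use strict;
use warnings;

my $x          = 42;
my $scalar_ref = \$x;
my $array_ref  = [1, 2, 3];
my $hash_ref   = { alice => 30 };

my $value = $$scalar_ref;  # same as ${$scalar_ref}
my @list  = @$array_ref;   # same as @{$array_ref}
my %table = %$hash_ref;    # same as %{$hash_ref}
```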
And if we want an element from an array or hash:
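The arrow operator is the usual way to do this; a sketch:

```perl
use strict;
use warnings;

my $array_ref = [10, 20, 30];
my $hash_ref  = { alice => 30, bob => 25 };

my $second  = $array_ref->[1];   # element 1 of the referenced array: 20
my $bob_age = $hash_ref->{bob};  # value for key "bob": 25
```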
The real power of references is the ability to create arrays and hashes that contain other arrays or hashes.
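A sketch of two nested structures (the data is invented for illustration):

```perl
use strict;
use warnings;

# An array of array references acts like a two-dimensional array.
my @matrix = ([1, 2, 3], [4, 5, 6]);
my $cell = $matrix[1][2];  # short for $matrix[1]->[2]: 6

# A hash whose values include an array reference.
my %person = (name => "Bob", languages => ["Perl", "C"]);
my $second_language = $person{languages}[1];  # "C"
```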
String interpolation
Few things are more useful in Perl than string interpolation. I could write my "Hello, world" using the string concatenation operator (.):
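Something like this, with a hypothetical $name variable:

```perl
use strict;
use warnings;

my $name     = "world";
my $greeting = "Hello, " . $name . "!\n";  # pieces joined with .
print $greeting;
```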
Or I could just put the variable into the string...
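A sketch of the interpolated version, again assuming a $name variable:

```perl
use strict;
use warnings;

my $name     = "world";
my $greeting = "Hello, $name!\n";  # double quotes interpolate variables
my $literal  = 'Hello, $name!';    # single quotes do not
print $greeting;
```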
Comparisons
Comparing numbers is done with:
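The numeric operators, shown on two illustrative values:

```perl
use strict;
use warnings;

my ($x, $y) = (3, 7);
my $equal         = ($x == $y);  # false
my $not_equal     = ($x != $y);  # true
my $less          = ($x <  $y);  # true
my $greater       = ($x >  $y);  # false
my $less_equal    = ($x <= $y);  # true
my $greater_equal = ($x >= $y);  # false
```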
While strings are compared with:
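The string operators are spelled out as words; a sketch on two sample strings:

```perl
use strict;
use warnings;

my ($s, $t) = ("apple", "banana");
my $equal         = ($s eq $t);  # false
my $not_equal     = ($s ne $t);  # true
my $less          = ($s lt $t);  # true: "apple" sorts before "banana"
my $greater       = ($s gt $t);  # false
my $less_equal    = ($s le $t);  # true
my $greater_equal = ($s ge $t);  # false
```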
Other variables can be compared with either.
Three-way comparisons can be done with <=> for numbers and "cmp" for strings. If the two things being compared are equal, 0 is returned. If the left-hand thing is greater, 1 is returned. Otherwise -1 is returned. "cmp" can be used with numbers too, but it compares them as strings (so 10 cmp 9 is -1), which is why <=> is the preferred form for numbers.
What's true and what's false?
It's easier to just say the things that are false.
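As a sketch, the false values are 0, the strings "0" and "", undef, and an empty list or array. Everything else is true, including some strings that look like zero:

```perl
use strict;
use warnings;

my @false_values = (0, "0", "", undef);
my $true_count = 0;
for my $value (@false_values) {
    $true_count++ if $value;  # never runs: every value here is false
}

my @empty = ();
my $empty_is_false = @empty ? 0 : 1;  # an empty array is false

# These strings are not exactly "0", so they count as true.
my $tricky_are_true = ("0.0" && "00") ? 1 : 0;
```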
What does an "if" statement look like?
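A sketch with invented thresholds. Note that the braces are required even around a single statement, and that "else if" is spelled "elsif":

```perl
use strict;
use warnings;

my $n = 42;
my $size;
if ($n < 10) {
    $size = "small";
}
elsif ($n < 100) {
    $size = "medium";
}
else {
    $size = "large";
}
print $size, "\n";  # prints "medium"
```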
When it's more convenient, we can write:
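Presumably this refers to the statement-modifier form, where the condition follows the statement:

```perl
use strict;
use warnings;

my $n = 42;
my $message = "still looking";
$message = "found it" if $n == 42;  # no parentheses or braces needed
print $message, "\n";
```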
There's also a shortcut for "if not":
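That shortcut is "unless", which works in both the block and the modifier form (values below are illustrative):

```perl
use strict;
use warnings;

my $n = 42;
my $message = "zero";
unless ($n == 0) {        # same as: if (not $n == 0)
    $message = "nonzero";
}

my $note = "";
$note = "odd" unless $n % 2 == 0;  # 42 is even, so $note stays empty
```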
Connecting conditions
Those familiar with C, C++, Java, etc. should be familiar with:
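Namely &&, || and !, sketched here on two sample values:

```perl
use strict;
use warnings;

my ($x, $y) = (5, 10);
my $both    = ($x > 0 && $y > 0);     # logical and: true
my $either  = ($x > 7 || $y > 7);     # logical or: true
my $neither = !($x > 7 || $y > 7);    # logical not: false
```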
Or the equivalent:
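These are the word forms "and", "or" and "not". One caveat worth knowing: they bind more loosely than their symbolic counterparts (more loosely than assignment, in fact), so the parentheses in this sketch matter:

```perl
use strict;
use warnings;

my ($x, $y) = (5, 10);
my $both    = ($x > 0 and $y > 0);      # true
my $either  = ($x > 7 or $y > 7);       # true
my $neither = (not ($x > 7 or $y > 7)); # false
```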
For loop
The C/C++/Java style for loop works just fine in Perl to allow us to count up from one number to another:
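A sketch counting from 1 to 5 (the running total is only there to give the loop something to do):

```perl
use strict;
use warnings;

my $total = 0;
for (my $i = 1; $i <= 5; $i++) {
    print $i, "\n";
    $total += $i;
}
# $total is now 15
```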
A simpler form, though, is:
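Most likely this means looping over a range built with the .. operator, as in this sketch:

```perl
use strict;
use warnings;

my $total = 0;
for my $i (1 .. 5) {  # 1 .. 5 is the list (1, 2, 3, 4, 5)
    print $i, "\n";
    $total += $i;
}
```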
Also:
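One variation, offered as a guess at what is meant here: "foreach" is a synonym for "for", so the same loop can be spelled:

```perl
use strict;
use warnings;

my $total = 0;
foreach my $i (1 .. 5) {  # identical in meaning to: for my $i (1 .. 5)
    $total += $i;
}
```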
This can also be used to loop over an array.
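A sketch with a made-up @items array:

```perl
use strict;
use warnings;

my @items  = ("a", "b", "c");
my $joined = "";
for my $item (@items) {  # $item is each element in turn
    $joined .= $item;
}
# $joined is now "abc"
```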
Or:
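One plausible alternative is to loop over the indices instead of the elements, using $#items for the last index:

```perl
use strict;
use warnings;

my @items  = ("a", "b", "c");
my $joined = "";
for my $i (0 .. $#items) {  # $#items is the last valid index (2 here)
    $joined .= $items[$i];
}
```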
Or even:
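The C-style loop works here too; this sketch assumes that is the form intended:

```perl
use strict;
use warnings;

my @items  = ("a", "b", "c");
my $joined = "";
for (my $i = 0; $i < @items; $i++) {  # @items in a comparison gives its length
    $joined .= $items[$i];
}
```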
The for loop can also be used to loop over the elements in a hash:
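A sketch using the built-in keys function on an invented %ages hash. The keys come back in no particular order:

```perl
use strict;
use warnings;

my %ages  = (alice => 30, bob => 25);
my $total = 0;
for my $name (keys %ages) {
    print $name, " is ", $ages{$name}, "\n";
    $total += $ages{$name};
}
```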
Subroutines
Subroutines in Perl5 correspond to functions or procedures in other languages. One notable difference is that they do not have parameter lists. A basic subroutine looks like:
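For example, a hypothetical subroutine named hello:

```perl
use strict;
use warnings;

sub hello {
    print "Hello, world!\n";
}
```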
Calling it is as simple as:
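Using the & sigil, and assuming the same hypothetical hello subroutine:

```perl
use strict;
use warnings;

sub hello {
    print "Hello, world!\n";
}

&hello();  # call the subroutine named hello
```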
In most situations the & can be elided, leaving:
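So the call from before becomes:

```perl
use strict;
use warnings;

sub hello {
    print "Hello, world!\n";
}

hello();  # same call, without the &
```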
Return
Of course, subroutines can return values.
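A sketch with a made-up subroutine:

```perl
use strict;
use warnings;

sub answer {
    return 42;
}

my $value = answer();  # $value is 42
```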
They can return more than one thing, in fact.
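For instance, a hypothetical foo that returns three values:

```perl
use strict;
use warnings;

sub foo {
    return (1, 2, 3);
}

print join(", ", foo()), "\n";  # prints "1, 2, 3"
```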
There are several ways we can use this output. Assigning it to a single scalar:
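Assuming foo returns (1, 2, 3) as above:

```perl
use strict;
use warnings;

sub foo {
    return (1, 2, 3);
}

my $bar = foo();  # scalar context
```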
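This results in 3 being stored in $bar: in scalar context, the comma-separated list evaluates to its last value.
If $bar is in parens, the assignment happens in list context instead, and $bar is now 1, the first value returned.
Assigning the result to an array is equivalent to assigning the list (1, 2, 3) to that array directly.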
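Both list-context forms can be sketched as follows, reusing a hypothetical foo that returns (1, 2, 3):

```perl
use strict;
use warnings;

sub foo {
    return (1, 2, 3);
}

my ($bar) = foo();  # parens force list context: $bar takes the first value
my @bar   = foo();  # same result as: my @bar = (1, 2, 3);
```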
You can also assign each value to its own scalar.
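With the same hypothetical foo:

```perl
use strict;
use warnings;

sub foo {
    return (1, 2, 3);
}

my ($a, $b, $c) = foo();  # one returned value per variable, in order
```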
$a is 1, $b is 2, and $c is 3. And we can mix the two:
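A sketch, again with foo returning (1, 2, 3):

```perl
use strict;
use warnings;

sub foo {
    return (1, 2, 3);
}

my ($a, @b) = foo();  # $a takes the first value, @b takes all the rest
```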
$a is 1 and @b is (2, 3).
Parameters
I mentioned that Perl subroutines don't have formal parameter lists. This does not mean that you can't pass parameters to a subroutine. They just end up in the @_ array. Consider a subroutine "greeting" which takes the name of a person to greet.
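One common way to write it is to unpack @_ into named variables on the first line (the local name $name is an illustrative choice):

```perl
use strict;
use warnings;

sub greeting {
    my ($name) = @_;  # copy the first argument out of @_
    print "Hello, $name!\n";
}
```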
When we call a subroutine with a parameter, we can either use parentheses or not. Using them simply makes it easier to understand which parameters go with which subroutine.
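With parentheses, assuming the greeting subroutine sketched from the description above:

```perl
use strict;
use warnings;

sub greeting {
    my ($name) = @_;
    print "Hello, $name!\n";
}

greeting("Bob");  # prints "Hello, Bob!"
```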
Or:
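Without parentheses. Note that this form only parses if the subroutine has already been defined (or declared) earlier in the file:

```perl
use strict;
use warnings;

sub greeting {
    my ($name) = @_;
    print "Hello, $name!\n";
}

greeting "Bob";  # same call, no parentheses
```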
Subroutine References
In addition to getting references to scalars, arrays and hashes, we can get references to subroutines.
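This is where the & sigil is still needed. A sketch with a hypothetical foo:

```perl
use strict;
use warnings;

sub foo {
    return 42;
}

my $code_ref = \&foo;  # reference to the subroutine itself; foo is not called
```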
It's worth noting that if we try:
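That is, leaving off the & and adding parentheses, as in this sketch:

```perl
use strict;
use warnings;

sub foo {
    return 42;
}

my $result_ref = \foo();  # foo is called first, then \ is applied to its result
```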
Then we get a reference to whatever foo returns. A more convenient syntax for directly getting a subroutine reference is:
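Namely an anonymous subroutine, written with "sub" and no name (the $double variable is invented for this sketch):

```perl
use strict;
use warnings;

my $double = sub {
    my ($n) = @_;
    return $n * 2;
};  # the semicolon is needed: this is an ordinary assignment statement
```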
Dereferencing a subroutine reference looks familiar:
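It follows the same pattern as $$ and @$, but with &. A sketch using a hypothetical $double reference:

```perl
use strict;
use warnings;

my $double = sub {
    my ($n) = @_;
    return $n * 2;
};

my $result = &$double(21);    # dereference with &, then call: 42
my $same   = &{$double}(21);  # the fully bracketed form
```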
Or, we can call it directly.
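Using the arrow operator, just as with array and hash references:

```perl
use strict;
use warnings;

my $double = sub {
    my ($n) = @_;
    return $n * 2;
};

my $result = $double->(21);  # call through the reference: 42
```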
Map, Grep, and Sort: invaluable tools for dealing with arrays
Let's consider the case where we want to create a new array based on an old array, but with each item in the array multiplied by two.
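The long way, sketched with a loop and push (array names are illustrative):

```perl
use strict;
use warnings;

my @old = (1, 2, 3);
my @new;
for my $item (@old) {
    push @new, $item * 2;
}
# @new is now (2, 4, 6)
```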
"push" literally pushes a new item onto the end of an array. The rest is pretty self-explanatory. But, there's an easier way.
"map"... well it maps each item in the input array to the code in the brackets. $_ is a "magical" variable which represents each item in the array. Another case: we want to get each item in an array that's less than 42.
Granted we could make this a bit more concise with:
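For example, by using the statement-modifier form of "if" inside the loop:

```perl
use strict;
use warnings;

my @old = (10, 42, 7, 99);
my @small;
for my $item (@old) {
    push @small, $item if $item < 42;
}
```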
But an even better solution is:
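That solution is grep, which keeps only the items for which the block is true:

```perl
use strict;
use warnings;

my @old   = (10, 42, 7, 99);
my @small = grep { $_ < 42 } @old;  # (10, 7)
```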
Let's consider a case where we have an array of array references:
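Something like this, with invented names stored as [first, last] pairs:

```perl
use strict;
use warnings;

my @names = (
    ["John", "Smith"],
    ["Jane", "Doe"],
    ["Adam", "Smith"],
);
print $names[1][0], " ", $names[1][1], "\n";  # prints "Jane Doe"
```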
Each item in the array represents a name. What if we want to sort by last name? The straightforward solution is to build an array of last names and use the built-in sort to sort it. Then we can use that array to build a new array of sorted names. Of course, we need to find a way to eliminate redundant last names. A hash can only use a given key once, so if we make the last names keys in a hash, we can eliminate redundancy.
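A rough sketch of that approach, using the same invented names. It is only one way to spell the steps described above:

```perl
use strict;
use warnings;

my @names = (["John", "Smith"], ["Jane", "Doe"], ["Adam", "Smith"]);

# Step 1: collect the last names as hash keys, which removes duplicates.
my %seen;
$seen{ $_->[1] } = 1 for @names;

# Step 2: sort the unique last names.
my @last_names = sort keys %seen;

# Step 3: for each last name in order, pull out every matching name.
my @sorted;
for my $last (@last_names) {
    for my $name (@names) {
        push @sorted, $name if $name->[1] eq $last;
    }
}
```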
Intimidated? I would hope you would be. That's some mind-blowing code. Now, let's see how the capabilities of sort can make this much easier:
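With a comparison block, the whole thing collapses to a single statement (same invented data):

```perl
use strict;
use warnings;

my @names = (["John", "Smith"], ["Jane", "Doe"], ["Adam", "Smith"]);

# Compare element 1 (the last name) of each pair as strings.
my @sorted = sort { $a->[1] cmp $b->[1] } @names;
```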
$a and $b are the two items being compared at any given moment. Which one ends up first depends on the result of the comparison between them. In this case we just do a standard string comparison, but on the last name of each item.