perlsOfLondon

 

Subroutines

Page history last edited by dmlond 2 yrs ago

  • Some Useful Builtins
    • print: instructs perl to print a LIST (remember what these are) of strings to an output FILEHANDLE. In most cases, your LIST will be a one element implicit list (i.e. it will just be a string).
      • print FILEHANDLE LIST: fully specifies the FILEHANDLE, either by its name passed to the open function, or as a variable (containing a reference to a FILEHANDLE, or an IO::Handle object, more about these later), and the LIST to be processed.
      • print LIST: uses the currently selected output filehandle, which, unless you have learned a little more perl than you're letting on, you won't know how to change, so it will be STDOUT by default.
      • print: prints $_ to the currently selected output filehandle.
    • warn: prints a LIST of strings to STDERR, along with some helpful debug information. Useful for logging (remember that STDERR may be mapped to somewhere else, such as when running in a CGI environment on an Apache web server).
    • length: returns the length of an EXPR (usually a scalar holding a string). It WILL NOT RETURN THE LENGTH OF AN ARRAY!!! (see scalar below).
    • die: prints a LIST of strings to STDERR, just as warn does, and then exits with a non-zero exit status (e.g. die 'Houston, We have a problem!').
    • substr: extracts and returns a portion of some other expression (string, or variable containing a string). substr can be used as an lvalue, provided EXPR is also an lvalue.
      • substr EXPR, OFFSET, LENGTH: fully specify the expression, the starting position (zero based), and the length of the substring to extract. If OFFSET is positive, the starting position is counted from the beginning of the string, but if it is negative, it is counted back from the end of the string (can be useful). A negative LENGTH leaves that many characters off the end of the string, which isn't as intuitive or useful.
      • substr EXPR, OFFSET: Get a substring of EXPR starting at OFFSET, and extending to the end of the string.
      • substr(EXPR, OFFSET, LENGTH) = EXPR2: replaces the selected portion of EXPR with EXPR2, changing EXPR. To 'insert' EXPR2 into EXPR without replacing anything, use LENGTH 0. OFFSET = 0, LENGTH = 0 will prepend. OFFSET = length(EXPR) will append, but it's easier to use the '.' or '.=' string operators.
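The substr forms above can be sketched as follows. This is only an illustration; the strings and variable names are made up for the example.

```perl
use strict;
use warnings;

my $str = "Hello, World";

# OFFSET and LENGTH: 5 characters starting at position 0
my $head = substr($str, 0, 5);      # "Hello"

# OFFSET only: from position 7 to the end of the string
my $tail = substr($str, 7);         # "World"

# Negative OFFSET: start 5 characters back from the end
my $last = substr($str, -5);        # "World"

# substr as an lvalue: replace the 5 characters at position 7
my $copy = $str;
substr($copy, 7, 5) = "Perl";       # $copy is now "Hello, Perl"

# LENGTH 0 inserts without replacing anything
my $ins = "Hello World";
substr($ins, 5, 0) = ",";           # $ins is now "Hello, World"

print "$head | $tail | $last | $copy | $ins\n";
```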
    • defined: returns a boolean, true if EXPR is defined (has had anything assigned to it, including 0), false otherwise.
    • undef: returns an undefined value, or undefines an EXPR (must be an lvalue, e.g. scalar, array, or hash). Useful for passing in null fields to a subroutine call, or returning the undefined value (i.e. false) from subroutines with a boolean return context. Also useful for quickly undefining a scalar, entire array, or entire hash (for a scalar, defined EXPR then returns false; an array or hash is simply emptied).
    • join: takes a separator EXPR, and a LIST of strings, and joins the list into a single string, with each element in the LIST separated by EXPR. An efficient way to concatenate a bunch of strings together is to join them with an empty string as the separator (an undef separator also works, but triggers a warning):

      join('', "Hello ", "MR", " JONES");

    • printf/sprintf: These are useful holdovers from the C primordia of perl. They both take the same arguments: a FORMAT, followed by a LIST of scalars. The only difference is that printf prints directly to a specified FILEHANDLE, or to the currently selected filehandle (just like print), while sprintf does not print at all, but returns what printf would have printed as a string. You will most often use sprintf, as printf is less efficient and more error prone than print, and you can always get the same functionality with 'print FILEHANDLE sprintf FORMAT, LIST'. The FORMAT fixes how many scalars are expected, i.e. it lets you place each scalar in the list at a specific position in the resulting string, with whatever text you want around it. It also lets you say more about what each scalar is expected to be (e.g. a string, %s, or a floating point number printed with 5 decimal places in a field at least 10 characters wide, %10.5f). It is very useful for creating preformatted strings to be filled in later with variable input:

      my $template = "SELECT %s FROM %s WHERE name = %s AND value = %s"; # a template SQL, which will expect 4 string values

      # quote the values with $dbh->quote so they are valid (and safe) SQL string literals
      my $sth = $dbh->prepare(sprintf($template, $field, $table, $dbh->quote($name), $dbh->quote($value)));

      # note, the first scalar could actually be a comma separated list of SELECT fields generated with a join

      $sth = $dbh->prepare(sprintf($template, join(", ", @fields), $table, $dbh->quote($name), $dbh->quote($value)));
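A few more FORMAT conversions, as a minimal sketch that runs without a database. The values are invented for the example.

```perl
use strict;
use warnings;

# %-10s left-justifies a string in a field 10 characters wide;
# %5d right-justifies an integer in a field 5 characters wide
my $row = sprintf("%-10s|%5d|", "apples", 42);

# %10.5f: 5 decimal places, right-justified in at least 10 characters
my $pi = sprintf("%10.5f", 3.14159265);

# %03d pads an integer with leading zeros to 3 digits
my $id = sprintf("ID-%03d", 7);

print "[$row]\n[$pi]\n[$id]\n";
```

The square brackets in the print are only there to make the padding visible.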

    • chomp: removes the trailing line ending from a variable, or from each variable in a LIST. Strictly, it removes whatever is in a special perl variable ($/, a newline by default) that I recommend you do not remember exists ;). If no variable is specified, it uses $_. Remember that the variable or list of variables is permanently changed by the chomp. Its return value is the number of characters deleted, but most people ignore it.
    • reverse: in list context, returns the elements of a LIST in the opposite order. In scalar context (e.g. when the result is assigned to a scalar rather than a LIST), it concatenates the elements of the LIST and returns that string with its characters reversed. IN LIST CONTEXT THIS WILL NOT REVERSE A STRING, e.g. print reverse("hello") prints "hello", because reverse sees a one element LIST. To turn "hello" into "olleh", force scalar context with scalar reverse("hello"), or assign the result to a scalar.
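A short sketch of how reverse behaves in each context. The variable names are made up for the example.

```perl
use strict;
use warnings;

# List context: the order of the elements is reversed
my @letters = reverse('a', 'b', 'c');       # ('c', 'b', 'a')

# List context with a single string: a one element list, returned as-is
my @unchanged = reverse("hello");           # ("hello")

# Scalar context: the characters of the string are reversed
my $backwards = reverse("hello");           # "olleh"

# scalar() forces scalar context where a list would otherwise be expected;
# several arguments are concatenated first, then reversed
my $joined = scalar reverse("ab", "cd");    # "dcba"

print "@letters | @unchanged | $backwards | $joined\n";
```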
    • rand/srand: rand returns a random fractional number greater than or equal to 0 and less than EXPR. If EXPR is not present, it defaults to 1. srand sets the seed used by rand from an EXPR (if EXPR is omitted, perl picks a seed for you; very old perls simply used the return of the time function). This actually changes the seed for all subsequent calls to rand, so use it with caution.
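As an illustration of rand and srand together, here is a minimal sketch. The roll_die helper is hypothetical, and the fixed seed 42 is an arbitrary choice.

```perl
use strict;
use warnings;

# Hypothetical helper: roll a six-sided die.
# rand(6) is >= 0 and < 6; int() drops the fraction, giving 0..5.
sub roll_die {
    return 1 + int(rand(6));
}

# Seeding with a fixed value makes the sequence repeatable (handy in tests)
srand(42);
my @first = map { roll_die() } 1 .. 5;

srand(42);
my @second = map { roll_die() } 1 .. 5;

print "first:  @first\n";
print "second: @second\n";   # the same five numbers as the first line
```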
    • lc, lcfirst, uc, ucfirst: lc lowercases an EXPR (default $_), lcfirst lowercases the first character of EXPR, uc uppercases EXPR, ucfirst uppercases the first character of EXPR.
    • scalar: forces EXPR to be evaluated in scalar context. scalar @array produces the number of elements in @array (just as if you assigned the array to a scalar). scalar OBJ returns the scalar rendition of the object (more on this later, really only useful for debugging).
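The difference between length and scalar on an array can be sketched like this (the array contents are invented for the example):

```perl
use strict;
use warnings;

my @names = ("alice", "bob", "carol");

my $count      = scalar(@names);      # number of elements: 3
my $also       = @names;              # assigning to a scalar does the same: 3
my $last_index = $#names;             # index of the last element: 2
my $chars      = length($names[0]);   # length of the string "alice": 5

# scalar() is handy where perl would otherwise supply list context
print "There are " . scalar(@names) . " names\n";
```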
    • time/localtime: time returns the number of non-leap seconds since the epoch. This is useful for calculating the number of seconds it takes to do some task in your program

      my $st_time = time;
      #some complex code follows...
      my $total_time = time() - $st_time;
      print "your process took $total_time seconds\n";

      The return of time can also be fed into various perl packages which facilitate the manipulation of dates, and even functions as a (not so secure) seed to rand.
      localtime returns a nine element LIST of parts of the time provided as an EXPR (defaults to the return of time)

      ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime;

      If called in a scalar context, localtime returns a stringified, human readable date.

      print "THIS EVENT HAPPENED AT ".scalar(localtime)."\n";
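A minimal sketch of turning the localtime LIST into a timestamp of your own. The format is an arbitrary choice; the two adjustments are needed because $mon runs from 0 to 11 and $year counts years since 1900.

```perl
use strict;
use warnings;

my $now = time;
my ($sec, $min, $hour, $mday, $mon, $year) = localtime($now);

# $mon is 0..11 and $year is years since 1900, so adjust both before printing
my $stamp = sprintf("%04d-%02d-%02d %02d:%02d:%02d",
                    $year + 1900, $mon + 1, $mday, $hour, $min, $sec);

print "list context:   $stamp\n";
print "scalar context: " . scalar(localtime($now)) . "\n";
```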

  • Writing Your Own: Subroutines are reusable pieces of code wrapped in a sub block. Any arguments arrive in @_, and you must unpack them yourself, either by assigning from @_ or with shift. You can then call the function using its name, optionally with the '&' in front of it, followed by any arguments in a list after the function name (with '&', the argument list must be in parentheses):

    sub simpleFunc {
      print "THIS IS SIMPLEFUNC\n";
    }

    &simpleFunc;

    sub complexFunc {
      my ($first, $second) = @_;

      print "FIRST $first SECOND $second\n";
    }

    &complexFunc("my", "name");

    sub method {
      my $in = shift;
      my $out;
      #do something with in to produce $out
      return $out;
    }

    my $answer = &method($question);
    print "$answer\n";

    sub array_method_1 {
      my @args = @_;
      my @out;
      #do something with @args to produce @out
      return @out;
    }

    my $n = 0;
    foreach $answer (array_method_1($something, $another)) {
      print "ANSWER $n is $answer\n";
      $n++;
    }

    sub better_arrayref_method {
      my @args = @_;
      my @out;
      #do something to @args to produce @out, return a reference to @out
      return \@out;
    }

    $n = 0;
    foreach $answer (@{better_arrayref_method($something, $another)}) {
      print "ANSWER $n is $answer\n";
      $n++;
    }
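The skeletons above leave the bodies as comments. As a filled-in sketch, here are two hypothetical subroutines (squares_list and squares_ref are names invented for this example) that do the same work, one returning a list and one returning a reference:

```perl
use strict;
use warnings;

# Returns the squares of its arguments as a list
sub squares_list {
    my @args = @_;
    my @out  = map { $_ * $_ } @args;
    return @out;
}

# Same work, but returns a reference to the result array
sub squares_ref {
    my @args = @_;
    my @out  = map { $_ * $_ } @args;
    return \@out;
}

my @list = squares_list(1, 2, 3);
my $ref  = squares_ref(1, 2, 3);

print "list: @list\n";                 # list: 1 4 9
print "ref:  @{$ref}\n";               # ref:  1 4 9
print "second via ref: $ref->[1]\n";   # second via ref: 4
```

Returning a reference avoids copying the whole result list on return, which matters more as the list grows.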

  • wantarray: So, just how do functions like localtime 'know' when to return a scalar and when to return a list? A subroutine you write yourself can use the wantarray function, which returns true if the calling context (i.e. the way the return is being used) is LIST, false if it is scalar, and undef if the return value is not being used at all, to differentiate the return value.

    sub myStuff {
      return wantarray ? ("THIS","IS","A","LIST") : "THIS IS A SCALAR";
    }

    my @list = myStuff();
    map { print "GOT $_\n" } @list;

    my $stuff = myStuff;
    print $stuff;

    #What will the following print?
    print myStuff;
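wantarray actually distinguishes three cases: list, scalar, and void context (where the return value is thrown away). A small sketch, using a hypothetical context subroutine that records what it saw:

```perl
use strict;
use warnings;

my $last_context;   # remembers what the most recent call saw

sub context {
    $last_context = wantarray          ? "list"     # true: list context
                  : defined(wantarray) ? "scalar"   # false but defined: scalar context
                  :                      "void";    # undef: return value is ignored
    return $last_context;
}

my @in_list   = context();   # assignment to an array: list context
my $in_scalar = context();   # assignment to a scalar: scalar context
context();                   # result not used: void context

print "list call saw:   $in_list[0]\n";
print "scalar call saw: $in_scalar\n";
print "bare call saw:   $last_context\n";
```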

  • @_ is a single list: This is one of the first things that all new programmers have trouble with when writing subroutines. Subroutines in perl all come with the implicit @_ list containing any arguments passed to them. This is a single list of scalars. While this works for almost 95% of programming needs, there is one limitation. Multiple arrays (or hashes) passed as arguments to a subroutine get flattened into that single list. The following:

    sub broken {
      my (@first, @second) = @_;
      #do something with @first and @second
    }

    Will not work as expected (What do you think will happen?). The answer is always to pass references to these, as they will be preserved as individual entities within the list.

    sub fixed {
      my ($first_list, $second_list) = @_;
      #do something with @{$first_list} and @{$second_list}
    }
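To see the flattening for yourself, here is a runnable sketch. The subroutine names flattened and separate, and the sample arrays, are made up for the example; each sub just reports how many elements ended up in each variable.

```perl
use strict;
use warnings;

# The first array in the my() list soaks up every argument
sub flattened {
    my (@first, @second) = @_;
    return (scalar(@first), scalar(@second));
}

# Passing references keeps the two arrays separate
sub separate {
    my ($first_list, $second_list) = @_;
    return (scalar(@{$first_list}), scalar(@{$second_list}));
}

my @x = (1, 2, 3);
my @y = (4, 5);

my @flat_sizes = flattened(@x, @y);    # (5, 0)
my @ref_sizes  = separate(\@x, \@y);   # (3, 2)

print "flattened: @flat_sizes\n";
print "separate:  @ref_sizes\n";
```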

  • subroutines return single lists: A similar constraint holds for the return of a subroutine. Unless multiple lists are passed back as references, a subroutine will squash them together into a single list on return.

    sub broken { return qw(A B C D), qw(E F G); }
    my (@a, @b) = broken();
    print "A IS ".join(" ", @a)."\n";
    print "B IS ".join(" ", @b)."\n";

    sub fixed { my @a = qw(A B C D); my @b = qw(E F G); return \@a, \@b; }
    my ($a, $b) = fixed();
    print "A IS ".join(" ", @{$a})."\n";
    print "B IS ".join(" ", @{$b})."\n";

  • Recursion: Like any powerful programming language, perl makes it possible to write recursive subroutines. These are subroutines which repeatedly call themselves with a smaller subset of a particular problem to generate the solution to a big problem. They consist of one or more 'base' cases defining when to stop recursing, and a recursive definition for the solution to the cases which do not meet the base case. In many cases, it can be easier, and more intuitive, to design recursive subroutines to solve computational problems than iterative (looping) solutions. However, they are almost always less efficient than their iterative counterparts, and it has been proven (using complex mathematical logic that I will not cover here) that any recursive algorithm can be refactored into an iterative algorithm. Also, each level of recursion uses memory, so there is a practical limit to how many levels deep a subroutine can descend: perl warns about 'Deep recursion' once a subroutine is 100 levels deep (if warnings are enabled), then keeps going until it runs out of memory (if you want to find this limit, just write your recursive subroutine without a base case, and let her rip). If you have solved a Tower of Hanoi or lighthouse, etc. puzzle, you have unknowingly applied a recursive algorithm in its solution. The most famous recursive algorithm is the generation of Fibonacci numbers.

    perl -le 'sub fib { my $n = shift; return $n if ($n == 0); return $n if ($n == 1); return fib($n - 1) + fib($n - 2);} map { print fib($_) } (0..6);'

    # much faster, iterative form
    perl -le 'sub fib { my $n = shift; my ($a, $b) = (0,1); while ($n--) { ($a, $b) = ($b, $a + $b); } return $a; } map { print fib($_); } (0..6)'
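The Tower of Hanoi puzzle mentioned above makes another compact illustration of a base case plus a recursive case. This is a sketch only; the hanoi subroutine and the peg names A, B and C are invented for the example.

```perl
use strict;
use warnings;

# Move $n discs from peg $from to peg $to, using $via as the spare peg.
# Returns the moves, in order, as a list of strings.
sub hanoi {
    my ($n, $from, $to, $via) = @_;
    return () if $n == 0;                  # base case: nothing left to move
    return (
        hanoi($n - 1, $from, $via, $to),   # move the smaller discs out of the way
        "$from to $to",                    # move the largest disc
        hanoi($n - 1, $via, $to, $from),   # put the smaller discs back on top
    );
}

my @moves = hanoi(3, 'A', 'C', 'B');
print scalar(@moves), " moves: ", join(", ", @moves), "\n";
```

For n discs this produces 2**n - 1 moves, so 3 discs take 7 moves.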
