perlsOfLondon

 

play with filehandles

Page history last edited by dmlond 3 yrs ago

In the following sections, you will learn various ways to use perl filehandles to get data from files into your programs and to print data out to new files. You will also learn about unix pipes and redirects.

The ls command, with the -l switch, prints out a lot of useful information about the files in a directory (either one specified as an argument, or the current working directory if none is specified). It sends its data to STDOUT. This means that its output can be 'piped' to other commands, such as perl, for further processing, or redirected to files on the system, either overwriting their contents (>) or appending to them (>>). For now, we are simply going to look at a few ways of bringing the output of ls -l into a perl variable. When we talk about operators, functions, and regular expressions, we will revisit this basic idea and extend it.

# output of ls -l is written to a file 'ls.out', overwriting its contents.

shell> ls -l > ls.out
shell> cat ls.out

The cat command prints the contents of a file (hopefully text, or you will be in for trouble) to STDOUT. Run this pair of commands a few times to convince yourself that the file is being overwritten each time. Now try the following commands 2-3 times to see the difference:

shell> ls -l >> ls.out
shell> cat ls.out

# now let's use a perl program to parse the output of ls -l. First, create a file called lsparse.pl with pico, and add the following lines:

while (my $line = <>) {
  print $line;
}
exit;

Save and exit pico. This perl program can now be used in one of two ways. You can pipe the STDOUT output from ls -l directly into it as follows:

shell> ls -l | perl lsparse.pl

If you want to be sure that the output is really being piped through perl, edit the lsparse.pl file as follows, and repeat the above steps:

while (my $line = <>) {
  print "I GOT $line";
}
exit;

Save and exit pico.

Now, try this on the ls.out file you created above.

shell> perl lsparse.pl ls.out

Perl automatically assumes that <> will read the contents of any files named on the command line that aren't specifically handled by the perl program itself (we will talk about this later). When no file is named, <> reads from STDIN, which is why the pipe above worked. Now, perl is itself printing its output to STDOUT, so you can either pipe it to something else:

shell> ls -l | perl lsparse.pl | sort -u

Or you can redirect its output to a new file:

shell> ls -l | perl lsparse.pl > newls.out

or, if you want perl to read the input file directly:

shell> perl lsparse.pl ls.out > newls.out
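To make the behaviour of <> more concrete, here is a minimal sketch that is not part of lsparse.pl. It assumes a scratch file created with the core File::Temp module, and a hypothetical helper called tag_lines. Setting @ARGV by hand stands in for naming the file on the command line.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read every line that <> can supply and tag it.
# <> takes its input from the files named in @ARGV, or from STDIN
# when @ARGV is empty.
sub tag_lines {
    my @tagged;
    while (my $line = <>) {
        push @tagged, "I GOT $line";
    }
    return @tagged;
}

# Create a small scratch file to stand in for ls.out.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "first line\nsecond line\n";
close $fh or die "Can't close $path: $!\n";

# Roughly equivalent to running: perl lsparse.pl <path>
@ARGV = ($path);
print tag_lines();
```

If @ARGV were left empty, the same loop would wait for input on STDIN, which is the case the ls -l | perl lsparse.pl pipeline relies on.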

# Now let's use perl's open function to manipulate files and pipes from within perl. First let's just open the file and print it to STDOUT, using a modified lsparse.pl file called lsparseNew1.pl

shell> cp lsparse.pl lsparseNew1.pl

shell> pico lsparseNew1.pl

Replace its contents with the following lines of code:

open (LSINPUT, "<ls.out") or die ("Can't open it $!\n");
while (my $line = <LSINPUT>) {
  print "I GOT $line";
}
close LSINPUT;
exit;

Save and exit. Now run the program in the directory where ls.out is.

shell> perl lsparseNew1.pl
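As an aside on the or die clause in lsparseNew1.pl: open returns false when it fails, and $! then holds the reason. The following is only a sketch under stated assumptions. The helper try_open and the file name no_such_ls.out are hypothetical, and the helper returns its message instead of dying so that the message can be inspected.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: try to open a file for reading and report the
# outcome instead of dying, so the failure message can be inspected.
sub try_open {
    my ($path) = @_;
    if (open(LSINPUT, "<$path")) {
        close LSINPUT;
        return "opened $path";
    }
    # $! holds the operating system's reason for the failure.
    return "Can't open $path: $!";
}

# A path inside a fresh, empty scratch directory cannot exist yet.
my $dir     = tempdir(CLEANUP => 1);
my $missing = "$dir/no_such_ls.out";
print try_open($missing), "\n";
```

The exact wording that $! produces depends on the operating system and locale, so it is best treated as a message for humans.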

Again, this is printing to STDOUT, so you can pipe or redirect as you see fit. However, you no longer have the ability to pipe data into the program. That's just something to think about when you are designing your programs. Now, let's change the program to write the contents directly to a file. This means that you cannot pipe the output to other programs, but in some cases this is just fine, or absolutely necessary.

open (LSINPUT, "<ls.out") or die ("Can't open it $!\n");
open (LSOUTPUT, ">lsNew.out") or die ("Can't open it $!\n");
while (my $line = <LSINPUT>) {
  print LSOUTPUT "I GOT $line";
}
close LSINPUT;
close LSOUTPUT;
exit;

Save and exit. Now run the program in the directory where ls.out is.

shell> perl lsparseNew1.pl

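Note the use of the redirect symbol in both the LSINPUT (<ls.out) and LSOUTPUT (>lsNew.out) open calls. The < is optional, because open reads by default, but it is highly recommended. The > is required: without it, perl would try to open lsNew.out for reading. Also, if you want to append the output instead of overwriting the file, use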

open (LSOUTPUT, ">>lsNew.out");
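The difference between > and >> can be checked from within perl as well. This is a minimal sketch, assuming two hypothetical helpers (write_line and count_lines) and a scratch directory from the core File::Temp module, so that no real lsNew.out is touched.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: open $path with the given mode prefix (">" or ">>"),
# write one line, and close the file again.
sub write_line {
    my ($mode, $path, $text) = @_;
    open(LSOUTPUT, "$mode$path") or die "Can't open $path: $!\n";
    print LSOUTPUT "$text\n";
    close LSOUTPUT or die "Can't close $path: $!\n";
}

# Hypothetical helper: return the number of lines currently in $path.
sub count_lines {
    my ($path) = @_;
    open(LSINPUT, "<$path") or die "Can't open $path: $!\n";
    my @lines = <LSINPUT>;
    close LSINPUT;
    return scalar @lines;
}

my $path = tempdir(CLEANUP => 1) . "/lsNew.out";

write_line(">", $path, "one");
write_line(">", $path, "two");      # overwrites: the file holds only "two"
print "after two overwrites: ", count_lines($path), " line(s)\n";

write_line(">>", $path, "three");   # appends: the file holds "two" and "three"
print "after one append: ", count_lines($path), " line(s)\n";
```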

Now, notice that the open function takes two arguments: a symbol name and a string. The symbol name can be anything you want; it is pretty much like a variable, without '$', '@', or '%'. The string can be specified directly, or it can be a scalar with a string in it, or it can be a string composed of some text around a scalar variable (remember that a scalar can also be an individual element in an array or hash, so the possibilities are endless). You could have something like:

open (MYSTUFF, "<my${hashOfStuff{$someKey}->[0]}_files.in");

Notice the use of { } around the entire 'scalar' to make sure the perl lexer doesn't get confused about where the variable ends.

Finally, what if you want to run a program in the shell and read its output from within your perl program, or you want the output of your perl program to be passed through some other program in the shell (which may be anything, and may then print to STDOUT again)? The open function can do this as well, using the pipe symbol. Edit a file called pipeInLs.pl with pico:

open (LSINPUT, "ls -l |") or die "Can't do it $!\n";

while (my $line = <LSINPUT>) {
  print "I GOT $line";
}
close LSINPUT;
exit;

Save and exit pico. Run the command

shell> perl pipeInLs.pl
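The same read-from-a-pipe idea can be wrapped in a function. This is only a sketch: the helper list_names is hypothetical, it runs plain ls rather than ls -l so that each line is just a file name, and it assumes the scratch directory path contains no spaces or shell metacharacters.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: run `ls` on a directory through a pipe and return
# the names it prints, one per element, with newlines removed.
# Assumes $dir contains no spaces or shell metacharacters.
sub list_names {
    my ($dir) = @_;
    # The trailing | tells open to run the command and read its STDOUT.
    open(LSINPUT, "ls $dir |") or die "Can't run ls: $!\n";
    my @names;
    while (my $line = <LSINPUT>) {
        chomp $line;
        push @names, $line;
    }
    close LSINPUT;
    return @names;
}

# Build a scratch directory with two known, empty files in it.
my $dir = tempdir(CLEANUP => 1);
for my $name ("a.txt", "b.txt") {
    open(OUT, ">$dir/$name") or die "Can't create $dir/$name: $!\n";
    close OUT;
}

for my $name (list_names($dir)) {
    print "I GOT $name\n";
}
```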

Now change it to pipe the output through the sort -u command (which itself prints to STDOUT):

open (LSINPUT, "ls -l |") or die "Can't do it $!\n";
open (SORTOUT, "| sort -u") or die "Can't do it $!\n";

while (my $line = <LSINPUT>) {
  print SORTOUT "I GOT $line";
}
close LSINPUT;
close SORTOUT;
exit;
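When you write to a pipe, the command at the other end decides where the data finally goes. As a rough sketch, the hypothetical helper below adds a shell redirect inside the piped command so that the sorted result lands in a file that can be read back. The file name sorted.out and the sample words are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: send lines through `sort -u` and store the result
# in $path, using a shell redirect inside the piped command.
# Assumes $path contains no spaces or shell metacharacters.
sub sort_unique_to_file {
    my ($path, @lines) = @_;
    # The leading | tells open to run the command and feed it what we print.
    open(SORTOUT, "| sort -u > $path") or die "Can't start sort: $!\n";
    for my $line (@lines) {
        print SORTOUT "$line\n";
    }
    # close waits for sort to finish and returns false if it failed.
    close SORTOUT or die "sort failed: exit status $?\n";
}

my $path = tempdir(CLEANUP => 1) . "/sorted.out";
sort_unique_to_file($path, "pear", "apple", "pear", "fig");

# Read the sorted file back; the duplicate "pear" has been removed.
open(LSINPUT, "<$path") or die "Can't open $path: $!\n";
while (my $line = <LSINPUT>) {
    print $line;
}
close LSINPUT;
```

Closing the SORTOUT handle matters here: sort cannot produce its output until it has seen all of its input, and close is what signals the end of that input.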

Now you know everything you need to know about filehandles (for now).
