perlsOfLondon

 

Datatypes

Last edited by dmlond 2 yrs ago

Many programming languages are statically typed, and offer many different types of containers for data. These languages require you to know exactly what kind of data is going to go into a particular container before the program runs (or, at least, know how to cast to the known data types later on). Perl is very different. Its variables are not declared with a data type, and it has only three datatypes (although a few other things, such as filehandles, also hold data and have specific uses).
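As a rough illustration of the paragraph above, the sketch below reuses one scalar for a number and then for strings, with no type declaration anywhere. The variable names ($container and friends) are made up for this example.

```perl
use strict;
use warnings;

# One scalar, no declared type: it holds whatever was assigned last.
my $container = 42;                 # a number
my $doubled   = $container * 2;     # used numerically: 84

$container = 'forty-two';           # now a string
my $length = length $container;     # used as a string: 9 characters

$container = "10";                  # a string that looks like a number
my $sum = $container + 5;           # perl converts it on demand: 15

print "$doubled $length $sum\n";    # prints "84 9 15"
```

Perl decides how to treat the value from the operator you apply to it, not from a declaration on the variable.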

  • scalars: This is the fundamental container for data in perl. All other datatypes are built from groups of scalars. A scalar holds a single value of any kind (a string, a number, and so on) without you needing to declare in advance what type of data it is. Besides data, a scalar can also hold a reference to another scalar, an array, a hash, or a subroutine. References behave like pointers in C but, unlike in C, there is no way to do pointer arithmetic (and get it horribly wrong) in perl. You can think of a scalar as either holding a 'thing' or pointing to a container which holds a 'thing' (or pointing to a container which points to another container which holds a thing, ad infinitum). Perl's object-oriented functionality is based upon references. We will cover references in more detail in another class, as they are very important.

    my $scalar = 'a string';
    print $scalar;

    $scalar = 1; # the same scalar can now hold a number (no second 'my' needed)
    my $scalar2 = 2;
    my $scalar3 = $scalar + $scalar2; # $scalar3 is 3
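The 'holding a thing or pointing to a container' idea from the scalars bullet can be sketched as follows. This is only a small taste of references, and the variable names are invented for the example.

```perl
use strict;
use warnings;

my $thing = 'a thing';

# A backslash in front of a variable produces a reference to it.
my $ref        = \$thing;     # points to the container holding 'a thing'
my $ref_to_ref = \$ref;       # points to a container that points to $thing

# Each extra '$' follows one reference back towards the data.
print $$ref, "\n";            # prints "a thing"
print $$$ref_to_ref, "\n";    # prints "a thing"

# Changing the data through the reference changes the original scalar.
$$ref = 'another thing';
print $thing, "\n";           # prints "another thing"

# ref() reports what kind of thing a reference points to.
my $kind_scalar = ref($ref);          # 'SCALAR'
my $kind_ref    = ref($ref_to_ref);   # 'REF'
my $kind_array  = ref([1, 2, 3]);     # 'ARRAY'
print "$kind_scalar $kind_ref $kind_array\n";
```

Notice that nothing here lets you add to or subtract from a reference to reach neighbouring memory, which is the difference from C pointers mentioned above.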

  • arrays: Arrays are ordered lists of scalars, with indices beginning at zero. Remember that scalars can be references.

    my @array = ('a',1,"a string"); # single and double quotes are pretty much the same, except single quotes prevent variable interpolation.
    print $array[0];

    @array = qw(a b c d e); # qw() quotes a whitespace-separated list of words
    print $array[3]; # prints d

    # Assigning an array to a scalar gives the number of elements in the array.
    my $numelements = @array;
    print "There are $numelements things in the array\n";

    my @array_of_arrays = (
      ["a","b","c","d"],
      ["e","f","g","h"]
    );

    print $array_of_arrays[0]->[2]; # prints "c". This is actually an array of references to arrays.

    foreach (@array[1,3]) {
      # Above is an array slice. @array has indices 0 through 4, so only 1 and 3 are used here.
      print "ODD ELEMENT $_\n";
    }

    For an array @array, the variable $#array holds the index of the last element in @array (-1 if the array is empty).

    my @things = ();
    print $#things."\n"; # prints -1
    @things = qw(a b c d);
    print $#things."\n"; # prints 3, eg array[3] contains d
    @array = ();
    $array[5] = 6; # assigning past the end extends the array; elements 0 through 4 are undef
    print $#array."\n"; # prints 5, see autovivification later.
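To tie the array points together (element counts, the last index, and arrays of references), here is a minimal sketch. The data and variable names are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

my @rows = (
    [ 'a', 'b', 'c' ],
    [ 'd', 'e' ],
);

# Each element of @rows is a scalar holding a reference to an array.
my @summary;
foreach my $row (@rows) {
    my $count = @{$row};               # array in scalar context: element count
    my $last  = $row->[ $#{$row} ];    # $#{$row} is the last index of that array
    push @summary, join(':', $count, $last);
}
print join(', ', @summary), "\n";      # prints "3:c, 2:e"

# Negative indices count from the end of the array.
my @letters = qw(a b c d);
my $final   = $letters[-1];            # 'd', same as $letters[$#letters]

# Assigning past the end grows the array; the gap is filled with undef.
my @sparse;
$sparse[3] = 'x';
my $size   = scalar @sparse;                 # 4
my $filled = grep { defined $_ } @sparse;    # 1, only index 3 holds a value
print "$final $size $filled\n";              # prints "d 4 1"
```

The count of an array is always one more than $#array, which is why an empty array reports -1.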

  • hashes: Also called associative arrays, these are the swiss-army knife datatype of the perl language. They are unordered collections of key/value pairs. Keys are always stored as strings: a number used as a key is converted to its string form, and a reference or object used as a key is reduced to a plain string that can no longer be dereferenced. Keys cannot be duplicated; any time you assign a value to a key that already exists, you overwrite what was there before. This can be handy for things like creating a list of unique elements from a list with duplicates. Hash values are scalars (that is, anything a scalar can hold, including references). Note that, under the hood, hashes are stored as hash tables rather than ordered lists, so there is no way for you to influence the order in which key/value pairs are stored.

    my %hash = ("key1", "value1", "key2" => "value2");
    print $hash{"key1"}; # prints value1

    To make things prettier, perl provides the => operator (sometimes called the 'fat comma'). It behaves like a comma, but makes the key/value pairing easier to read.

    my $key1 = "key1";
    my $key2 = "key2";
    my $something = "value1";      # placeholder values, for illustration only
    my $something_else = "value2";

    %hash = (
      $key1 => $something,
      $key2 => $something_else
    );

    print "$hash{$key1}\n";

    my %hash_of_arrays = (
      $key1 => ['a','b','c','d'],
      $key2 => ['e','f','g','h']
      );

    print "$hash_of_arrays{$key1}->[2]\n"; # again, really a hash of references to arrays

    my %hash_of_hashes = (
      'first_hash' => { 'key1' => 'a', 'key2' => 'b'},
      'second_hash' => { 'key1' => 'c', 'key2' => 'd'}
      );

    print $hash_of_hashes{'second_hash'}->{'key2'}; # prints d.

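    foreach (@hash{"key1","key3","key5"}) {
      # Above is a hash slice: it returns the values for the listed keys, in that order.
      # This assumes %hash has entries for key1, key3 and key5.
      print "ODD KEY VALUE $_\n";
    }

    # Build a list of the unique values in @array, keeping their original order.
    my %seen = ();
    my @newvalues = ();
    foreach my $value (@array) {
      push @newvalues, $value unless ($seen{$value});
      $seen{$value} = 1;
    }

    print "THE FOLLOWING KEYS WERE SEEN:\n";
    print join(" ", keys %seen); # keys come back in no particular order
    print "\nTHE FOLLOWING VALUES WERE SEEN\n";
    print join(" ", @newvalues);
    print "\n";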
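The rules about hash keys (no duplicates, always strings, no guaranteed order) can be checked with a short sketch like the one below. The hash names and contents are made up for the example.

```perl
use strict;
use warnings;

my %count;

# Assigning to an existing key overwrites the old value.
$count{apple} = 1;
$count{apple} = 5;

# Keys are stored as strings, so the number 1 and the string "1"
# name the same entry.
$count{1}   = 'number one';
$count{"1"} = 'string one';

my $size = keys %count;   # keys in scalar context: number of entries (2)

# exists tests for a key without creating it; delete removes an entry.
my $has_pear = exists $count{pear} ? 'yes' : 'no';
delete $count{1};

# A reference used as a key is flattened to a plain string.
my $aref   = [1, 2, 3];
my %by_ref = ($aref => 'list');
my ($key)  = keys %by_ref;
my $key_is_ref = ref($key) ? 'yes' : 'no';   # 'no'

# Key order is not predictable, so sort when the order matters.
$count{banana} = 2;
foreach my $fruit (sort keys %count) {
    print "$fruit => $count{$fruit}\n";   # apple => 5, then banana => 2
}
```

Sorting the keys is a common way to get repeatable output, since perl makes no promise about the order in which keys come back.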

  • Everything is eventually a Scalar: A general rule in perl is that, eventually, you get access to a single piece of data as a scalar. If you look at the above examples, you will notice that the '$' is used to access an individual piece of data, regardless of whether that data lives in a scalar, an array or a hash. This continues to be true, in somewhat modified form, for references as well. The only variation you might see in the wild is when people use an array or hash slice (to be discussed in more detail later) to access a single element, for example @array[0] instead of $array[0]. This works, but it evaluates the element in list context, which can cause subtle bugs, and perl warns about it when warnings are enabled, so prefer the '$' form.
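The '$' rule above can be seen side by side in this small sketch, which reaches single values from a scalar, an array, a hash and two references, and then contrasts them with slices. All names and data are invented for illustration.

```perl
use strict;
use warnings;

my $name   = 'perl';
my @list   = qw(zero one two three);
my %colour = (sky => 'blue', grass => 'green', sun => 'yellow');
my $aref   = \@list;
my $href   = \%colour;

# A single piece of data is always reached with '$'.
my @singles = (
    $name,             # plain scalar
    $list[1],          # one array element
    $colour{sky},      # one hash value
    $aref->[2],        # element via an array reference
    $href->{grass},    # value via a hash reference
);
print join(' ', @singles), "\n";   # prints "perl one blue two green"

# Several pieces at once use '@': these are slices and return lists.
my @some_items   = @list[0, 3];            # ('zero', 'three')
my @some_colours = @colour{qw(sky sun)};   # ('blue', 'yellow')
print "@some_items @some_colours\n";       # prints "zero three blue yellow"
```

The sigil describes what you are getting back (one value or a list), not the kind of variable the data is stored in.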
