Regular Expressions 101
Page history last edited by dmlond 2 yrs ago
- What Are Regular Expressions: I really can't do any better than Wikipedia on this one.
- What Are Perl Regular Expressions: Wikipedia also has a pretty good discussion of Perl Regular Expressions. Perl created a whole class of regexps which, because of their utility, have since been ported to almost every other programming language. One critical thing to remember about regular expressions: the '*' metacharacter matches zero or more occurrences of whatever precedes it. Because this also matches the empty string, it can be a bit nonintuitive and lead to strange, hard-to-find bugs. You can get around this by first testing length, by using the '+' metacharacter (one or more occurrences) instead, or by not using a quantifier at all (but this only matches one occurrence).
- Matching: Perl uses regular expressions in pattern matching operations, which are useful as tests within flow control statements. This is accomplished using the binding operator '=~' to bind a scalar to a pattern. A match can be as simple as a regular expression between two slashes, /regexp/, but normally it has an m in front of the slashes to denote more explicitly what is going on, m/regexp/. The slashes themselves are not mandatory: when the m is present, you can use other characters to open and close a regexp. This can be handy when slashes are part of the pattern you want to match and you don't want to use escape characters.
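The empty-string pitfall described above can be seen in a minimal sketch. The helper names `looks_numeric_star` and `looks_numeric_plus` are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only.
# '*' allows zero digits, so the empty string passes this check.
sub looks_numeric_star {
    my ($s) = @_;
    return $s =~ /^\d*$/ ? 1 : 0;
}

# '+' requires at least one digit, so the empty string fails.
sub looks_numeric_plus {
    my ($s) = @_;
    return $s =~ /^\d+$/ ? 1 : 0;
}

for my $input ('', '42', 'abc') {
    printf "%-6s star=%d plus=%d\n", "'$input'",
        looks_numeric_star($input), looks_numeric_plus($input);
}
```

Only the empty string gives different answers from the two helpers, which is why this kind of bug tends to show up only on blank input.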
if ($scalar =~ m/\w+\:\w+/) { ... }
if ($string =~ m#/(\w+)/#) { print "Directory $1\n"; }
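Also, a scalar can be matched against a regexp containing groups (parentheses) outside of an if test, and the captured values ($1, $2, ...) can then be used within the scope of the block containing the match. The captures are only meaningful when the match succeeded, so skip any line that does not match before using them.
while (my $line = <IN>) {
    # skip lines that do not match, so $1 and $2 are always from this line
    $line =~ m/(\w*)_(\d+)/ or next;
    print "FIRST $1 SECOND $2\n";
}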
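As a self-contained variation on the loop above, the sketch below captures the groups into named variables by evaluating the match in list context. The sample lines are invented for this example; a real script would read them from a filehandle.

```perl
use strict;
use warnings;

# Sample data invented for this sketch.
my @lines = ('alpha_12', 'no match here', 'beta_gamma_7');
my @found;

for my $line (@lines) {
    # In list context a successful match returns the captured groups,
    # and a failed match returns an empty list, so the test and the
    # capture happen in one step.
    if (my ($first, $second) = $line =~ m/(\w*)_(\d+)/) {
        push @found, "FIRST $first SECOND $second";
    }
}

print "$_\n" for @found;
```

Because `\w` also matches the underscore, the greedy `\w*` in the third line takes everything up to the last underscore that is followed by digits.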
- Substitution: Perl also uses regular expressions to substitute new data for specific pieces of existing data. This can be accomplished using substr as an lvalue, but it is often much more useful to describe the data to be replaced directly, rather than having to code its coordinates. Sometimes the piece of data that you want to replace occurs in varying parts of a line as you iterate through the data to be processed, or, as is more often the case, it occurs multiple times and you want to replace all occurrences. Substitution is accomplished by binding a scalar to a substitution pattern, s/pattern/replacement/. Again, the slashes can be replaced with other characters, including braces. Combined with the x modifier, which makes Perl ignore whitespace and # comments inside the pattern, braces let you spread a pattern over several lines. The x modifier does not apply to the replacement, so any whitespace there is inserted literally.
s{
    part of pattern   # comment
    part of pattern   # comment
}{replacement}x
There are a few modifiers which affect the way the substitution works. By default, a substitution replaces the first occurrence of the pattern with the replacement and stops. The 'g' modifier, s/pattern/replacement/g, makes the substitution occur everywhere the pattern matches. There are others, but this one is the most useful. Also, remember that groups and backreferences work in substitutions as well.
# substitute the first occurrence of FOO with BAR
$string =~ s/FOO/BAR/;
# substitute all occurrences of FOO with BAR
$string =~ s/FOO/BAR/g;
# substitute all occurrences of FOO with barFOO using a group and backreference
$string =~ s/(FOO)/bar$1/g;
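The substitution forms discussed above can be tried side by side in a short, runnable sketch. The sample string and variable names are made up for this illustration; each substitution works on its own copy so the results can be compared.

```perl
use strict;
use warnings;

my $original = 'FOO and FOO';

# Copy first, then substitute on the copy, so $original is untouched.
(my $first_only = $original) =~ s/FOO/BAR/;       # first occurrence only
(my $every      = $original) =~ s/FOO/BAR/g;      # every occurrence
(my $prefixed   = $original) =~ s/(FOO)/bar$1/g;  # reuse the captured group

# Braces as delimiters, with the x modifier so that whitespace and
# comments inside the pattern are ignored. The replacement is literal.
(my $braced = $original) =~ s{
    FOO     # the literal text to find
    \s+     # followed by whitespace
    and     # and then the word and
}{BAR or}x;

print "$_\n" for $first_only, $every, $prefixed, $braced;
```

Without the x modifier, the line breaks and comments in the braced pattern would be treated as part of the text to match, and the substitution would silently do nothing.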
- Splitting: The Perl split function takes a regular expression as its separator. In most cases, you will use very simple regular expressions in split. The return value of split is a LIST, which does not include the separator unless you enclose it in a group, as described above.
# split out all of the individual directories from PATH
my @dirsInPath = split /\:/, $ENV{"PATH"};
Because split returns a LIST, the call can be wrapped in parentheses and sliced directly, without storing the result in an array first.
# a replacement for awk which prints the modification time from ls -l
shell> ls -l | perl -nle 'print join(" ", (split /\s+/)[5,6,7]);'
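The following sketch shows the two split behaviours described above: a group in the separator pattern keeps the separators in the result, and the returned list can be sliced in place. The sample record is invented for this example.

```perl
use strict;
use warnings;

my $record = 'usr:local:bin';

# Plain separator: the colons are dropped from the result.
my @plain = split /:/, $record;

# Separator wrapped in a group: each colon is returned as its own element.
my @with_sep = split /(:)/, $record;

# List slice taken directly from the split result (first and third fields).
my @slice = (split /:/, $record)[0, 2];

print join('|', @plain),    "\n";
print join('|', @with_sep), "\n";
print join('|', @slice),    "\n";
```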