JDT
John Dixon
Using Perl and Regular Expressions to Process ASCII Files - Part 3
In Part 1 we had a quick look at what Perl and regular expressions are, and introduced the idea of using them to process HTML files. In Part 2 we developed a simple Perl script to process a single HTML file. In this part we'll look at how to process multiple files, which is often where this kind of processing comes into its own.

The script we looked at in Part 2 (script1.pl, repeated below for convenience) has one major drawback that makes it virtually unusable in practice: the name of the web page (HTML file) that the script processes is hard-coded into the script itself. For the script to be useful, we need to be able to run it on any web page. Changing the script so that it can do this is fairly straightforward.

Below, I've given two scripts: script1.pl, which was our original script from Part 2, and script2.pl, which is a new script that will process multiple files.

script1.pl
1 open (IN, "file1.htm");

script2.pl
1 foreach $file (@ARGV) {
Before looking at each line of the script in detail, let's quickly establish what script2.pl does. It processes one or more files named at the command line prompt (for example, the MS-DOS prompt). For each file, the script first makes a backup copy and then changes every occurrence of <h1> to <h1 class="big">.

A couple of definitions:

Variable
Array

Let's take a look at each line of script2.pl.

Line 1
Line 2
Line 3
Line 4
Note: file1.htm.bak will contain the contents of the file from before the script is run. file1.htm will contain the updated contents, that is to say, the output from the script.
Line 5
Line 6
See Part 2 for a full description of the actual regular expression.
Line 7
Line 8
Lines 9 and 10
Line 11

Running the Script

To run the script, at the command line type:

C:>perl script2.pl file1.htm

If the script executes successfully, a new file called file1.htm.bak should be created, which is a backup of the original file (i.e. the file as it was before processing). A new version of file1.htm should also have been produced, containing the modified <h1> tag.

Author: John Dixon
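The behaviour described in this part (loop over the files named on the command line, back each one up, then rewrite the <h1> tags) can be sketched roughly as follows. This is not the script2.pl listing and does not match its line numbers. It is a minimal sketch that assumes a simple literal pattern for <h1>, since the actual regular expression is covered in Part 2, and it uses the core File::Copy module for the backup, which may differ from the original approach. The helper names add_h1_class and process_file are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy qw(copy);

# Hypothetical helper: add class="big" to every plain <h1> tag in a string.
# The pattern is an assumption; see Part 2 for the expression actually used.
sub add_h1_class {
    my ($text) = @_;
    $text =~ s/<h1>/<h1 class="big">/g;
    return $text;
}

# Hypothetical helper: back up one file, then rewrite it in place.
sub process_file {
    my ($file) = @_;
    my $backup = "$file.bak";

    # Keep the original contents in <file>.bak before touching the file.
    copy($file, $backup) or die "Cannot back up $file: $!";

    # Read from the backup and write the modified lines to the original name.
    open(my $in,  '<', $backup) or die "Cannot read $backup: $!";
    open(my $out, '>', $file)   or die "Cannot write $file: $!";
    while (my $line = <$in>) {
        print {$out} add_h1_class($line);
    }
    close($in);
    close($out) or die "Cannot close $file: $!";
}

# @ARGV holds the file names typed after the script name.
process_file($_) for @ARGV;
```

Saved under a name of your choosing, say sketch.pl (a hypothetical name), it would be run the same way as script2.pl, with one or more file names after the script name: perl sketch.pl file1.htm file2.htm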
© 2007-2009 - John Dixon Technology Ltd