John Dixon
Using Perl and Regular Expressions to Process ASCII Files - Part 2
In Part 1 we had a quick look at what Perl and regular expressions are, and introduced the idea of using them to process HTML files. In this part we'll develop a Perl script to process a simple HTML file.

Suppose we have the following HTML file, called file1.htm:

<html>

Now imagine that we want to change both occurrences of <h1>heading</h1> to <h1 class="big">heading</h1>. This is not a major change, and it could easily be done manually or with a simple search and replace. But we're just getting started here.

To do this, we could use the following Perl script (script1.pl):

1 open (IN, "file1.htm");

Note: You don't need to enter the line numbers. I've included them simply so that I can reference individual lines in the script.

Let's look at what the script does.

Line 1
Line 2
Line 3
Line 4

Looking at Line 4 in more detail:
Line 5
Line 6
Lines 7 and 8

Running the Script

As the purpose of this article is to explain how to use regular expressions to process HTML files, and not necessarily how to use Perl, I don't want to dwell for too long on how to run Perl scripts. Suffice it to say that you can run them in various ways: from within a text editor such as TextPad, by double-clicking the Perl script (script1.pl), or from an MS-DOS window. (The location of the Perl interpreter needs to be in your PATH statement so that you can run Perl scripts from any location on your computer, and not just from the directory where the interpreter (perl.exe) is installed.)

So, to run our script we could open an MS-DOS window and navigate to the location where the script and the HTML file are located. To keep life simple, I've assumed that these two files are in the same folder (or directory). The command to run the script is:

C:>perl script1.pl

If the script works (and in theory it should), a new file (new_file1.htm) is created in the same folder as file1.htm. If you open the file you'll see that the two lines that contained <h1> tags have been modified so that they now read <h1 class="big">.

Author: John Dixon
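As a recap, the change described in this part can be sketched as follows. This is not the script1.pl listing from the article: it uses lexical filehandles, three-argument open and error checks in place of the bareword IN handle, and the helper names add_big_class and process_file are hypothetical. It assumes that file1.htm sits in the current directory and that every heading to be changed is written exactly as <h1>, with no attributes and in lower case.

```perl
use strict;
use warnings;

# Hypothetical helper: add class="big" to every bare <h1> tag in one line.
# Closing </h1> tags do not match the pattern, so they are left alone.
sub add_big_class {
    my ($line) = @_;
    $line =~ s/<h1>/<h1 class="big">/g;
    return $line;
}

# Hypothetical helper: copy the input file to the output file,
# rewriting each line on the way through.
sub process_file {
    my ($in_name, $out_name) = @_;
    open(my $in,  '<', $in_name)  or die "Cannot read $in_name: $!";
    open(my $out, '>', $out_name) or die "Cannot write $out_name: $!";
    while (my $line = <$in>) {
        print {$out} add_big_class($line);
    }
    close($in);
    close($out) or die "Cannot close $out_name: $!";
    return;
}

# File names as used in the article; skipped if file1.htm is not present.
process_file('file1.htm', 'new_file1.htm') if -e 'file1.htm';
```

Keeping the substitution in its own subroutine makes it easy to try the regular expression on a single string before letting it loose on a whole file.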
© 2007-2009 - John Dixon Technology Ltd