Concatenate HTML Files

See the download page to obtain this program

Description

This script combines a number of HTML files into one. Only the bodies of the files are concatenated, so the beginning of the first file (up to and including <body ...>) serves as the beginning of the combined output. Optionally, a divider followed by the label of the next file is placed between files.
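
The idea can be sketched in a few lines of Python. This is only an illustration of the approach described above, not the Perl code of htmlcat itself. The function names, the use of the file name as the label, the <p> wrapper around the label and the reuse of the first file's closing tags are all assumptions made for the example.

```python
import re

# Opening and closing <body> tags, matched case-insensitively.
BODY_OPEN = re.compile(r"<body\b[^>]*>", re.IGNORECASE)
BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def split_html(text):
    """Return (head, body, tail) split around the <body> element."""
    opening = BODY_OPEN.search(text)
    if opening is None:
        # No <body> tag found: treat the whole text as body content.
        return "", text, ""
    closing = BODY_CLOSE.search(text, opening.end())
    end = closing.start() if closing else len(text)
    return text[:opening.end()], text[opening.end():end], text[end:]


def concatenate(pages, divider=None):
    """Join the bodies of (label, html) pairs under the first page's head.

    Assumption: the label is the file name, and the closing tags of the
    first page are reused to finish the combined document.
    """
    if not pages:
        return ""
    head, _, tail = split_html(pages[0][1])
    parts = []
    for index, (label, html) in enumerate(pages):
        if divider is not None and index > 0:
            # Divider, then the label of the file that follows it.
            parts.append(f"{divider}\n<p>{label}</p>\n")
        parts.append(split_html(html)[1])
    return head + "".join(parts) + tail
```

Calling concatenate([("a.html", text_a), ("b.html", text_b)], divider="<hr>") would keep the head of a.html, then emit both bodies with a rule and the label b.html between them.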

Note the following limitations. Some of these are fixable, but the author has not worked on the code for a long time.

Options

The command line options are:

-d
print divider between concatenated files
-h
print usage as help
-o file
name output file (this file is skipped as an input if it appears in the input list, e.g. due to giving *.html)
-s
sort input files into case-insensitive alphabetical order (putting the index file first if necessary, and removing the file it points to from the inputs if it is a symbolic link)
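
The handling of the input list implied by -o and -s might look roughly like the following Python sketch. It is not taken from the htmlcat source: the function name, the default index file name index.html and the comparison of file names as plain strings are assumptions for illustration.

```python
import os


def order_inputs(files, output=None, index_name="index.html", sort=False):
    """Prepare the input list: drop the output file, optionally sort."""
    # Skip the output file if a wildcard such as *.html swept it in.
    inputs = [name for name in files if output is None or name != output]
    if not sort:
        return inputs
    index = [name for name in inputs if os.path.basename(name) == index_name]
    rest = [name for name in inputs if os.path.basename(name) != index_name]
    # If the index file is a symbolic link, drop the file it points to
    # so that the same page is not included twice.
    for name in index:
        if os.path.islink(name):
            target = os.path.realpath(name)
            rest = [other for other in rest
                    if os.path.realpath(other) != target]
    # Index file first, then the rest in case-insensitive order.
    return index + sorted(rest, key=str.lower)
```

The case-insensitive order comes from sorting on the lower-cased name, so A.html sorts before b.html rather than after it.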

Usage

Run on one or more HTML files. Warning messages are sent to standard error. Examples of usage are:

htmlcat -o some.html def.html res.html
concatenate def.html and res.html to some.html
htmlcat -d -o all.html *.html
concatenate all HTML files to all.html with dividers between them
htmlcat -s -o out.html *.html
sort then concatenate all HTML files to out.html
htmlcat *.html > /tmp/all.html
concatenate all HTML files to standard output (here redirected to /tmp/all.html); with this method, do not write the concatenated file into the same directory, or the script will run indefinitely on its own output!

The only things likely to need changing for installation are the directory index filename and the nature of a file divider (see the customise subroutine in the code). Change the first line of the script according to where Perl is located. Although tested with Perl 5, the script may work with only minor changes under Perl 4.

Licence

htmlcat is free software, distributed under the GNU General Public License Version 2. You may re-distribute this software provided you preserve this README file. The contents of this package may be used freely for non-commercial purposes provided this README file and copyright notices are retained. Copyright remains with the author. No warranties are given as to the accuracy or suitability of this package.

History

First public version: Ken Turner, 21st November 1998



Last Update: 13th May 2010
URL: http://www.cs.stir.ac.uk/~kjt/software/web/htmlcat.html