WHAT'S IT FOR?

HTMLgen is a class library for the generation of HTML documents with Python scripts. It's used when you want to create HTML pages containing information which changes from time to time. For example, you might want to have a page which provides an overall system summary of data collected nightly. Or maybe you have a catalog of data and images that you would like formed into a spiffy set of web pages for the world to browse. Python is a great scripting language for these tasks and with HTMLgen it's very straightforward to construct objects which are rendered out into consistently structured web pages. Of course, CGI scripts written in Python can take advantage of these classes as well.

ABOUT THIS RELEASE

This is the 2.1 release. It's a minor release with a new class called TemplateDocument and several bug fixes. The TemplateDocument class provides a simple way of generating web pages based on a template containing tags which are replaced at run time with text of your choice. It uses a dictionary to provide the substitution mapping. This class is only available with Python 1.5 or newer as it makes use of the re module and the new get method on dictionaries. The rest of HTMLgen should work fine with 1.4.
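The substitution idea behind TemplateDocument can be sketched in a few lines of present-day Python. This is not the HTMLgen source: the {name} tag syntax, the substitute function, and the sample dictionary are all assumptions made for illustration. It does rely on the same two ingredients mentioned above, the re module and the get method on dictionaries.

```python
import re

# Assumed tag syntax for this sketch: a name wrapped in curly braces.
TAG = re.compile(r"\{(\w+)\}")

def substitute(template, mapping):
    """Replace each {name} tag with mapping[name]; leave unknown tags as-is."""
    def lookup(match):
        # dict.get falls back to the original tag text when the name is missing.
        return str(mapping.get(match.group(1), match.group(0)))
    return TAG.sub(lookup, template)

template = "<H1>{title}</H1>\n<P>Last updated {date}. {unknown}</P>"
page = substitute(template, {"title": "System Summary", "date": "Monday"})
print(page)
```

Leaving unknown tags in place, rather than raising an error, is a design choice of this sketch; it makes a missing dictionary entry easy to spot in the generated page.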

A new utility module imgsize provides a function to quickly compute the width and height of image files. It makes use of a small subset of Fredrik Lundh's excellent PIL package to support GIF, JPEG, and PNG file format headers. If you have PIL installed at your site, that copy is used automatically instead of the modules bundled with HTMLgen. The Image object in HTMLgen makes use of this new capability to automatically compute the image's size properties if the files exist when HTMLgen is running. Another utility script, imgfix.py, is bundled with HTMLgen. It uses the imgsize routine to scan HTML files for IMG tags without correct height/width properties and fixes them. See the imgfix.py source for instructions on its use.
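To give a feel for why reading only the file header is quick, here is a minimal sketch that pulls the dimensions out of GIF and PNG headers with the standard struct module. It is not the imgsize or PIL code; the image_size function and the hand-built headers are hypothetical, and JPEG is left out because its size field requires walking the file's segments.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def image_size(header):
    """Return (width, height) from the leading bytes of a GIF or PNG file.

    Returns None when the format is not recognized or the data is too short.
    """
    if header[:6] in (b"GIF87a", b"GIF89a") and len(header) >= 10:
        # GIF: two little-endian 16-bit integers follow the 6-byte signature.
        return struct.unpack("<HH", header[6:10])
    if (header[:8] == PNG_SIGNATURE and header[12:16] == b"IHDR"
            and len(header) >= 24):
        # PNG: the IHDR chunk data starts with two big-endian 32-bit integers.
        return struct.unpack(">II", header[16:24])
    return None

# Hand-built headers stand in for real image files in this sketch.
gif_header = b"GIF89a" + struct.pack("<HH", 120, 60)
png_header = (PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR"
              + struct.pack(">II", 640, 480))

print(image_size(gif_header))
print(image_size(png_header))
```

For a real file, the first 24 bytes read in binary mode are enough input for this function.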

The simple HTML tag classes all now inherit from AbstractTag or AbstractTagSingle. The former base class provides generic support for HTML element attributes and container functions such as append, prepend, copy, and a new method called markup. The markup method scans the text contained in the object with a regular expression you furnish and performs a string substitution on the matching text. The method relies on two external functions: one written to support the syntax of the regex module and the other to support the syntax and objects of the new re module. For example, once you have a Para object "p" you could wrap all bracketed text in a Strong (bold) element as follows:

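from HTMLgen import Para, Strong

p = Para('some text...')
strong = Strong()
# Raw string so the backslashes reach the regular expression engine untouched.
p.markup( r"(\[[^]]*\])", strong, regex_type="re" )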

Note that the regular expression pattern or object given to this method must have a group defined. It treats group 1 as the target of the substitution. The above also demonstrates a new feature supplied by the AbstractTag class. Instances can now be used as functions through the class's __call__ method. So you can configure an HTMLgen class instance with all attributes set and then reuse the instance as a text markup function.
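The interplay between a callable instance and group 1 of the pattern can be illustrated with a self-contained toy. The ToyTag, ToyStrong, and ToyPara classes below are hypothetical stand-ins written for this sketch, not the HTMLgen classes, and their markup method only approximates the behavior described above using re.sub.

```python
import re

class ToyTag:
    """Stand-in for a container tag class; not HTMLgen's AbstractTag."""
    tagname = "SPAN"

    def __init__(self, *contents):
        self.contents = list(contents)

    def __str__(self):
        body = "".join(str(item) for item in self.contents)
        return "<%s>%s</%s>" % (self.tagname, body, self.tagname)

    def __call__(self, text):
        # Calling an instance returns the given text wrapped in this tag.
        return "<%s>%s</%s>" % (self.tagname, text, self.tagname)

    def markup(self, pattern, marker):
        """Replace group 1 of every match in string contents with marker(group 1)."""
        def replace(match):
            whole = match.group(0)
            start = match.start(1) - match.start(0)
            end = match.end(1) - match.start(0)
            # Only the span of group 1 is rewritten; the rest of the match is kept.
            return whole[:start] + str(marker(match.group(1))) + whole[end:]
        self.contents = [
            re.sub(pattern, replace, item) if isinstance(item, str) else item
            for item in self.contents
        ]

class ToyStrong(ToyTag):
    tagname = "STRONG"

class ToyPara(ToyTag):
    tagname = "P"

p = ToyPara("press [Enter] or [Esc] to continue")
strong = ToyStrong()
p.markup(r"(\[[^]]*\])", strong)
print(p)
```

Because marker is simply called with the matched text, any callable works in its place, which is what makes a preconfigured tag instance reusable as a markup function.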

The release can be downloaded from the Starship Web site.

ARCHITECTURE

HTMLgen uses a simple model of having a separate class for each HTML element type. Each class instance supports an __str__ method to emit itself as HTML text markup. A SimpleDocument class provides the general container object which the user populates with objects from the markup classes. The SeriesDocument class also provides a general style template for page generation based on Patrick Lynch's Web Page Style Manual at the Yale Center for Advanced Instructional Media. See that page for more insights into the design of good web pages.
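The model described above, one class per element with each instance rendering itself through __str__, can be sketched as follows. The class names and the page skeleton here are assumptions made for illustration; the real SimpleDocument and SeriesDocument classes do considerably more.

```python
class SketchElement:
    """Hypothetical base class: one subclass per HTML element type."""
    tagname = ""

    def __init__(self, *contents):
        self.contents = list(contents)

    def append(self, item):
        self.contents.append(item)

    def __str__(self):
        # Nested objects render themselves, so str() recurses through the tree.
        body = "".join(str(item) for item in self.contents)
        return "<%s>%s</%s>" % (self.tagname, body, self.tagname)

class SketchHeading(SketchElement):
    tagname = "H1"

class SketchParagraph(SketchElement):
    tagname = "P"

class SketchDocument:
    """Hypothetical container that the user populates with element objects."""

    def __init__(self, title):
        self.title = title
        self.contents = []

    def append(self, item):
        self.contents.append(item)

    def __str__(self):
        body = "\n".join(str(item) for item in self.contents)
        return ("<HTML>\n<HEAD><TITLE>%s</TITLE></HEAD>\n<BODY>\n%s\n</BODY>\n</HTML>"
                % (self.title, body))

doc = SketchDocument("Nightly Summary")
doc.append(SketchHeading("Nightly Summary"))
doc.append(SketchParagraph("All ", "systems", " nominal."))
html = str(doc)
print(html)
```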

NOTES ON NAMING

The names I've selected for the classes in this module may need some explaining. There is a class for each of the common HTML elements, and all class names begin with an upper-case letter. All functions are lower case. I chose not to use all caps for the classes that correspond to elements because A) HTML markup isn't case sensitive anyway, B) I hate typing things in all caps, and C) I tend to use all-caps names to signify constants. I also modified the class names to make them more descriptive than the HTML standard. For example, I use Image rather than IMG in hopes that the longer names will be easier for the user to remember while coding the Python scripts. Also, there is no A class. I use the Href class to specify hyper-references (mostly because it's easier for me to remember). One slightly odd case is the Paragraph vs. P classes. The former is used as a normal paragraph text container and uses the closing /P token, while the latter is simply used to insert a <P> break in the output stream. Even though the class names might be 'Emphasis' rather than 'EM', I have provided aliases for these classes so that, if you insist, you can use the HTML names as the class names.

DISTRIBUTION

This release comes with a mini-web of pages generated with the tool. HTMLtest.py is the script used to build several pages in the html subdirectory. It serves both as a test case and as a set of usage examples. The reference manual output from gendoc-0.6 is there as well. Input data is read from the data directory, and the images used are all in the image directory. After untarring the files, simply run the test function:

>>> import HTMLtest
>>> HTMLtest.test()
wrote: "./html/overview.html"
wrote: "./html/document.html"
wrote: "./html/lists.html"
wrote: "./html/top-frames.html"
wrote: "./html/frames.html"
wrote: "./html/tables.html"
wrote: "./html/forms.html"
wrote: "./html/imagesmaps.html"
wrote: "./html/scripts.html"
wrote: "./html/independence.html"
wrote: "./html/parrot.html"
wrote: "./html/colorcube.html"

This should print the names of the HTML files it generates, as above. Browse them (well, you ARE browsing them now, I guess) to be sure things aren't hosed. The navigation buttons should work, etc. After you are satisfied, just place HTMLgen.py* and HTMLcolors.py* (and any of the other modules you want to use) in a directory on your PYTHONPATH and have fun.

Comments are most welcome.

I've added a simple script called colorcube.py which generates a web-safe color table useful for selecting colors for your pages. Two additional sample pages are available as well. They are accessed by clicking the "Next" button on the last on-line doc page (Scripts). If you're curious about the insect at the bottom of each page, that's my alma mater's mascot.
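For readers unfamiliar with the term, the web-safe palette is conventionally the 216 colors formed by using only the levels 00, 33, 66, 99, CC, and FF for each of the red, green, and blue channels. The sketch below builds such a table as plain strings; it is not the colorcube.py source, and the function names and table layout are assumptions.

```python
# The six channel levels conventionally used by the web-safe palette.
LEVELS = ("00", "33", "66", "99", "CC", "FF")

def web_safe_colors():
    """Return all 6 * 6 * 6 = 216 web-safe colors as '#RRGGBB' strings."""
    return ["#" + r + g + b for r in LEVELS for g in LEVELS for b in LEVELS]

def color_table(colors, per_row=6):
    """Render the colors as an HTML table, one swatch cell per color."""
    rows = []
    for start in range(0, len(colors), per_row):
        cells = "".join(
            '<TD BGCOLOR="%s">%s</TD>' % (color, color)
            for color in colors[start:start + per_row]
        )
        rows.append("<TR>%s</TR>" % cells)
    return "<TABLE>\n%s\n</TABLE>" % "\n".join(rows)

colors = web_safe_colors()
table = color_table(colors)
print(len(colors), "colors,", table.count("<TR>"), "rows")
```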