8. Release notes

- Accessibility checks according to the Open Accessibility Checks
  Project (OAC) by the University of Toronto (http://oac.atrc.utoronto.ca/).
  Supported checks:
    - OAC #1: missing ALT
    - OAC #2: ALT is the same as the file name
    - OAC #3: ALT text is not shorter than 150 characters
    - OAC #7: ALT text can't be empty if image is used as an anchor
    - OAC #37-41: wrong Hx nesting (e.g.: h2 without h1)
    - OAC #48: document language must be identified
    - OAC #50: missing TITLE
    - OAC #51: empty TITLE
    - OAC #52: TITLE is not shorter than 150 characters
    - OAC #58: Images used in INPUT controls must have ALT text
    - OAC #59: Images used in INPUT controls must have valid ALT text
    - OAC #60: Images used in INPUT controls should have short ALT text
    - OAC #61: Image used in INPUT control - ALT text should not be
      the same as the file name
    - OAC #116: deprecated use of the B element
    - OAC #117: deprecated use of the I element
- PHP interface:
  - Added support for searching information regarding accessibility checks,
    thanks to Valentina Del Sapio (Comune di Prato)
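
A few of the checks listed above can be illustrated with a minimal
sketch. It is written in Python with the standard html.parser module and
is not ht://Check's own code: the class name AccessibilitySketch, its
methods and the exact heading rule (a heading may be at most one level
deeper than the previous one) are assumptions made for this example. It
covers only OAC #1, #2, #3 and #37-41.

```python
from html.parser import HTMLParser
from posixpath import basename
from urllib.parse import urlparse

MAX_ALT_LENGTH = 150  # threshold named in OAC #3


class AccessibilitySketch(HTMLParser):
    """Hypothetical checker: records (check, message) pairs while parsing."""

    def __init__(self):
        super().__init__()
        self.problems = []
        self.last_heading = 0  # level of the previous Hx element, 0 = none yet

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.check_alt(dict(attrs))
        elif len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.check_heading(int(tag[1]))

    def check_alt(self, attributes):
        alt = attributes.get("alt")
        file_name = basename(urlparse(attributes.get("src") or "").path)
        if alt is None:
            self.problems.append(("OAC #1", "missing ALT"))
        elif alt and alt == file_name:
            self.problems.append(("OAC #2", "ALT is the same as the file name"))
        elif len(alt) >= MAX_ALT_LENGTH:
            self.problems.append(
                ("OAC #3", "ALT text is not shorter than 150 characters"))

    def check_heading(self, level):
        # Assumed rule: at most one level deeper than the previous heading
        if level > self.last_heading + 1:
            self.problems.append(
                ("OAC #37-41", "h%d without h%d" % (level, level - 1)))
        self.last_heading = level


checker = AccessibilitySketch()
checker.feed('<h2>News</h2><img src="/img/logo.png" alt="logo.png">')
for check, message in checker.problems:
    print(check, message)
```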

Release notes for htcheck-1.2.2 - 13 Jan 2004
- Updated to new autotools (autoconf 2.58, automake 1.7.9, libtool 1.5)
- Standard C++ library automatic detection (removes compilation warnings)
- Database changes:
  - New fields stored:
    - URL's doctype for HTML documents (Url table)
    - HTML documents' description and keywords (Url table)
- PHP interface:
  - Added doctype field for URLs query
  - Added description and keywords fields for URLs query
- Fixed minor bugs including:
  - Correct negotiation of the accepted encodings with the HTTP server
  - Charset recognition when it is given through the Content-Type HTTP header
  - Automatic recovery mechanism when a HEAD call fails with some Web servers (bug #870467)
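
As an illustration of charset recognition from the Content-Type HTTP
header, here is a minimal Python sketch. It uses the standard
email.message module to parse the header value; the function name
charset_from_content_type is hypothetical and this is not how
ht://Check itself does it.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the lower-cased charset parameter of a Content-Type value,
    or None when the header carries no charset."""
    message = Message()
    message["Content-Type"] = header_value
    return message.get_content_charset()


print(charset_from_content_type("text/html; charset=ISO-8859-1"))
print(charset_from_content_type("text/html"))
```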


Release notes for htcheck-1.2.1 - 27 Apr 2003
- Cookies input file management, which allows cookies to be imported into
  ht://Check's jar and preloaded before a crawl starts
- A link's description is now stored in the database, making it possible
  to see which text was used for a link
- Also, it is possible to see which tags are included inside a link: this
  is useful, for instance, to see which images act as buttons.
- added the 'store_link_info' attribute, which controls
  the storing of the link descriptions and linked tags.
- added the 'available_charsets' attribute, which allows URLs to be
  checked against a set of predefined charsets.
- fixed a serious bug which prevented the referring URL from being set correctly
- code updated for new autotools (autoconf 2.57, automake 1.6.3 and libtool 1.4.3).
- minor changes.
- Database changes:
  - New fields stored:
    - URL's Charset (Url)
    - Link's description (HtmlStatement)
    - Link's position of the tag (HtmlStatement)
- PHP interface:
  - Automatically works with 'register_globals' off
  - Charsets management
  - Lighter layout without most of the deprecated HTML elements and attributes
- Successfully compiled and installed on:
  - [x86] Linux 2.4 (Redhat 8.0)
  - [x86] Linux 2.4 (Redhat 7.3)
  - [x86] Linux 2.4 (Debian 2.2)
  - [x86] FreeBSD (4.7-STABLE)
  - [Alpha] Linux 2.4 (Debian 3.0)
  - [PPC - G4] MacOS X 10.1 SERVER Edition (statically linked)
  - [Sparc - Ultra60] Linux 2.4 (Debian 3.0)


Release notes for htcheck-1.2.0 - 16 Sep 2002
- added the 'store_url_contents' attribute for storing the content of an HTML document
- added Proxy Authorization support ('http_proxy_authorization')
- Keeps track of badly encoded URLs through the 'url_reserved_chars' attribute
- Cookies are now handled according to both RFC 2109 and the Netscape specification
- internal URLs are now distinguished from external ones, and this information is stored
- HTML's 'id' attribute is now used for anchors, besides the 'name' attribute
- added the 'db_name_prepend' attribute for setting the string to
  be prepended to every database created by htcheck (also manageable
  through the 'with-db-name-prepend' configure option)
- added the 'remove_default_doc' attribute for removing the default document
  for a directory index
- added the '-k' feature for dropping just the tables, not the whole db
- Database changes:
  - New fields stored:
    - URL's content (Url)
    - HTML statement's row (HtmlStatement)
    - Server's IP address (Server)
    - Cookie version (Cookies)
- PHP Interface:
  - safer against XSS (cross-site scripting) attacks
  - Show the source of an HTML file
  - Filter for anchors now added to the links form
  - Added support for 'tidy' (tidy.sourceforge.net), which shows the
    warnings, errors and suggestions provided by this validator
- fixed some other minor bugs and made the code more robust
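
The use of the 'id' attribute for anchors, besides the 'name' attribute,
can be pictured with the following Python sketch. It is only an
illustration: the names AnchorCollector and anchor_exists are
hypothetical, and restricting 'name' to A elements is an assumption of
this example rather than a statement about ht://Check's parser.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects every value that can be the target of a '#fragment' link."""

    def __init__(self):
        super().__init__()
        self.anchors = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if not value:
                continue
            # 'id' on any element, 'name' only on A elements (assumption)
            if name == "id" or (name == "name" and tag == "a"):
                self.anchors.add(value)


def anchor_exists(html, fragment):
    """Return True when 'fragment' is defined as an anchor in 'html'."""
    collector = AnchorCollector()
    collector.feed(html)
    return fragment in collector.anchors


page = '<a name="top"></a><h2 id="install">Install</h2>'
print(anchor_exists(page, "install"))
```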


Release notes for htcheck-1.1 - 18 Feb 2002
- HTTP code now handles the language negotiation, through the
   'accept-language' attribute of the configuration file
- More robust support of cookies with the management of the domain attribute
- Cookies are now stored in the database (Cookies table)
- builds under GCC3
- fixed a bug regarding the BASE tag handling
- fixed some other minor bugs
- PHP Interface:
   - German language file added (thanks to Michael Stenitzer <stenitzer@eva.ac.at>)
   - some Web structure mining indexes have been added
   - display of the content language of a URL as given by the server
   - cookies simple report in the database home page
   - some cosmetic changes
   - code now has only the 'php' extension and works without the ASP tags setting


Release notes for htcheck-1.1.0b9-klunk - 25 Jun 2001
- Database structure now improved and compressed; less storage
  space and more speed in queries.
- Indexes of the Link table are created at the end of the crawl,
  improving performance, and controlled by the 'url_index_length' parameter
- 'url_index_length' configuration attribute has been added: this
  attribute allows the user to control the length of the index
  for the Url field in the Schedule and Url tables. This attribute
  may affect the performance of the crawls, since the length
  of an index can either slow down or speed up the spidering process.
- Cookies summary (with -s option)
- POSIX standard: --version and --help compatible (with getopt_long)
- libtool 1.4 support
- fixed many bugs regarding the parser of the spider, which is now more robust
- cleaned code inside the 'core' source files
- PHP Interface:
   - Automatic and manual choosing of ht://Check databases
   - Javascript URLs query support
   - Description of a connection trouble when a URL is not retrieved
   - Fixed minor bugs and made cosmetic changes


Release notes for htcheck-1.1.0b8-muttley - 27 Apr 2001
- Finally runs on Solaris
- MySQL 3.23.xx users: now datetime fields are stored properly
- Links to e-mail addresses are now stored and can be viewed
- Links with a 'file:/' call are now considered errors
- User Agent now shows the version and the platform
- Fixed a bug regarding the HTML parser with (very) malformed tags
- Fixed many minor bugs
- PHP Interface:
   - Enhancements: retrieve e-mail links
   - Fixed some bugs


Release notes for htcheck-1.1.0b7-anaconda - 28 Mar 2001
- Fixed library versioning
- Man page now provided (thanks to Marco Nenciarini <mnencia@prato.linux.it>)
- Static linking now works fine
- New library architecture to avoid conflicts with ht://Dig: the libraries
  are now all 'package' libs instead of global libs.
- 'optimize_db' has now been set to false by default
- PHP Interface:
   - PHP3 compatibility added
   - removed the .inc extension for PHP source files


Release notes for htcheck-1.1.0b6-zizou - 12 Mar 2001
- HTTP Cookies support now enabled
- New type of link result: 'Not authorized'
- Fixed a configuration error for the load_mysql_defaults function,
  reported by FreeBSD users.
- disable_cookies attribute added in the configuration
- Update of the HtDateTime class according to ht://Dig's one
- PHP interface:
  - better output
  - added images for link results
  - bug in qryurls.php and listlinks.php has been fixed
  - css file added for content visualization
  - dynamic language detection (English or Italian for now)
- small bug fixes

  
Release notes for htcheck-1.1.0b5-flukekelso - 24 Jan 2001
- Fixed a bug in the database initialization
- Default MySQL authentication (through /etc/my.cnf or ~/.my.cnf file)
- 'OBJECT' HTML tag now correctly parsed
- Basic HTTP Authentication enabled
- PHP interface improvements:
  - English and Italian languages available
  - Get info regarding URLs by choosing many parameters through a form
    (e.g. URL, status code values, content-type, size and title if present)
  - Other small enhancements
- Documentation started
- Fixed other minor bugs


Release notes for htcheck-1.1.0b4-utero - 07 Sep 2000
- Now ht://Check uses MySQL's option file in order to get connection
  information such as host, user, password, port and socket.
- HTTP Proxy support (to be tested more thoroughly)
- PHP interface's improvements:
  - It's now possible to look for broken links and anchors not found by
    using the form in listlinks.php. Filtering is now possible by
    LinkResult as well as by LinkType (and by the referencing and
    referenced URLs, as before).
- Fixed a bug regarding SGML entities with anchors; the "#top" anchor
  is now considered valid.
- Sources have now been cleaned from most of the compilation warnings.


Release notes for htcheck-1.1.0b3-utero - 22 Aug 2000
- Better summary of the broken links (more complete and reliable).
- HTML anchors check is now performed and a field (LinkResult) has
  been added. It records whether the link is ok, broken or redirected,
  and warns when a link points to an anchor that is not found.
- Summary of anchors not found, enabled or disabled through the
  configuration attribute 'summary_anchor_not_found'.
- The table 'htCheck' has been added to the database: its purpose
  is to store the general info of the crawl (user, start time, end
  time, etc ...).
- Added 'optimize_db' configuration parameter for optimizing the tables
  of the database. Default is true.
- Added 'sql_big_table_option' configuration parameter for performing
  huge queries. Default is true.
- Fixed the bug regarding HTTP persistent connections with a preemptive
  HEAD call before the GET.
- HTTP redirections are now treated as special links and stored into
  the link table with a 'Redirection' LinkResult flag.
- Referer management is now handled correctly.
- Hop count management and storing added.
- Added 'max_hop_count' configuration parameter for limiting the crawl
  to a certain distance from the starting URL.
- PHP Interface:
  - The configure and make system has been modified in order to manage the
    php scripts. A new configuration option has been added (--with-php-dir=DIR)
    and the make install procedure now looks after the scripts too.
  - Page for querying the links retrieved, with a form for setting
    filters on both the source and the destination URLs
    (with like and not like SQL statements);
  - Page for dropping a database.
  - Italian language added (include/italian.inc - See the INSTALL file)
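
The idea behind 'max_hop_count', limiting the crawl to a certain
distance from the starting URL, can be sketched as a breadth-first
visit. This Python example is an illustration under stated assumptions,
not ht://Check's scheduler: the function crawl_order is hypothetical,
the 'links' dictionary stands in for actually retrieving documents, and
the limit is assumed to be inclusive (URLs exactly max_hop_count hops
away are still visited).

```python
from collections import deque


def crawl_order(links, start_url, max_hop_count):
    """Return a dict mapping each reachable URL to its hop count.

    'links' maps a URL to the list of URLs it references; it replaces
    real HTTP retrieval in this sketch.
    """
    hop_counts = {start_url: 0}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if hop_counts[url] >= max_hop_count:
            continue  # at the limit: do not follow this URL's links
        for target in links.get(url, []):
            if target not in hop_counts:
                hop_counts[target] = hop_counts[url] + 1
                queue.append(target)
    return hop_counts


site = {"/": ["/a", "/b"], "/a": ["/a/deep"], "/a/deep": ["/a/deep/er"]}
print(crawl_order(site, "/", 2))
```

Because the visit is breadth-first, the hop count stored for a URL is
the shortest distance from the starting URL.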


Release notes for htcheck-1.1.0b2-utero - 08 Aug 2000
- A simple PHP interface has been added. You need PHP (either as a
  standalone CGI interpreter or - if you have Apache - as an Apache
  module) compiled with the mysql add-on module. For its installation
  look at the INSTALL file.
- The 'Link' table contains another field, the 'Anchor': its purpose
  is to store the 'token' after the '#' char in a link (for example in
  <A href="URL#anchorname">, it contains 'anchorname').
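
As a small illustration of the 'token' after the '#' char, this Python
sketch splits a link with the standard urllib.parse.urldefrag function.
The function name split_anchor and the example URL are hypothetical;
ht://Check performs this step in its own code.

```python
from urllib.parse import urldefrag


def split_anchor(href):
    """Split a link into (URL without fragment, anchor name).

    The anchor is an empty string when the link has no '#' part.
    """
    url, anchor = urldefrag(href)
    return url, anchor


print(split_anchor("http://www.example.com/doc.html#anchorname"))
```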

Release notes for htcheck-1.1.0b1-utero - 12 May 2000
A more stable version, but tested only on a RedHat 6.x system (see README file).
These new features have been added:
- Now it's possible to determine if a link is normal (like A href ones), meaning
  the user has to click in order to follow it, or direct (like IMG src),
  meaning it is (potentially) loaded automatically by the user's browser.
- Added a field to the Url table which contains the size to be added at load
  time in order to obtain the total weight of the document: it contains the sum


Release notes for htcheck-1.1.0b-utero - 5 May 2000
This is the very first release. It can be used for checking broken links.
Here are the main features:
- Access to a MySQL database (in this form: user@localhost, where user
  is the PID owner).
- HTTP 1.1 connections working with persistent connections choose
- At the end, a summary of broken links, servers seen and content-types
  encountered is shown.
- Creation of these tables in the database: Url, Server, Link, Schedule,
  HtmlStatement, HtmlAttribute.

