org.openxml.parser
Class XMLParser

java.lang.Object
  |
  +--org.openxml.parser.BaseParser
        |
        +--org.openxml.parser.ContentParser
              |
              +--org.openxml.parser.XMLParser

public class XMLParser
extends org.openxml.parser.ContentParser

Implements a parser for XML documents, document fragments, nodes and external entities. The XML document is created with DOMFactory, but a specific class type may be requested with parseDocument(DTDDocument,Class).

Version:
$Revision: 1.10 $ $Date: 1999/04/18 01:52:13 $
Author:
Assaf Arkin
See Also:
ContentParser, SAXException

Fields inherited from class org.openxml.parser.ContentParser
_currentNode, _docType
 
Fields inherited from class org.openxml.parser.BaseParser
_curChar, _document, _tokenText, CR, EOF, LF, SPACE, TOKEN_CDATA, TOKEN_CLOSE_TAG, TOKEN_COMMENT, TOKEN_DTD, TOKEN_ENTITY_REF, TOKEN_EOF, TOKEN_OPEN_TAG, TOKEN_PE_REF, TOKEN_PI, TOKEN_SECTION, TOKEN_SECTION_END, TOKEN_TEXT
 
Constructor Summary
XMLParser(BaseParser owner, java.io.Reader reader, java.lang.String sourceURI)
          Constructor for entity parser.
XMLParser(java.io.Reader reader, java.lang.String sourceURI)
          Parser constructor.
XMLParser(java.io.Reader reader, java.lang.String sourceURI, short mode, short stopAtSeverity)
          Parser constructor.
 
Method Summary
protected  boolean closingTag(java.lang.String name, boolean keepQuite)
          Called to process the closing tag.
 Document parseDocument()
           
 Document parseDocument(DTDDocument dtd)
          Parses and returns a new XML document.
 Document parseDocument(DTDDocument dtd, java.lang.Class docClass)
          Parses and returns a new XML document.
protected  void parseDTDSubset(boolean standalone)
          Parses the internal and external DTD subsets.
 Entity parseEntity(org.openxml.dom.EntityImpl entity, boolean internal)
          Parses the external/internal entity and places its contents underneath the entity node.
protected  boolean parseNextNode(int token)
          Parses the next node based on the supplied token.
 Node parseNode(Node node)
           
 
Methods inherited from class org.openxml.parser.ContentParser
getEntityContents, parseAttrEntity, parseAttributes, parseContentEntity, readTokenContent
 
Methods inherited from class org.openxml.parser.BaseParser
advanceLineNumber, canReadName, close, error, fatalError, getColumnNumber, getErrorHandler, getErrorReport, getLastException, getLineNumber, getLocator, getMode, getPublicId, getReader, getSourcePosition, getSourceURI, getSystemId, isClosed, isMode, isNamePart, isSpace, isTokenAllSpace, parseDocumentDecl, parseGeneralEntity, pushBack, pushBack, readChar, readTokenEntity, readTokenMarkup, readTokenName, readTokenPERef, readTokenQuoted, setEncoding, setErrorHandler, setErrorSink, slicePITokenText, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

XMLParser

public XMLParser(java.io.Reader reader,
                 java.lang.String sourceURI,
                 short mode,
                 short stopAtSeverity)
Parser constructor. Requires the source text as a Reader object and a URI that identifies the source. The parsing mode is a combination of MODE_.. flags. The stopAtSeverity argument specifies the error severity level at which to stop parsing: Parser.STOP_SEVERITY_FATAL, Parser.STOP_SEVERITY_VALIDITY or Parser.STOP_SEVERITY_WELL_FORMED.
Parameters:
reader - Any Reader from which entity text can be read
sourceURI - URI of entity source
mode - The parsing mode in effect
stopAtSeverity - Severity level at which to stop parsing
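
A parsing mode built from a combination of flags can be sketched as a bit mask. The example below is only an illustration: MODE_PARSE_ENTITY is named on this page but its value here is assumed, MODE_EXAMPLE_A and MODE_EXAMPLE_B are hypothetical flags, and the isMode helper is a stand-in whose signature is not taken from BaseParser.isMode. Note the cast back to short, which Java requires because the | operator promotes its operands to int.

```java
public class ModeFlagsSketch {
    // Assumed values: the real Parser.MODE_ constants are not listed on this page.
    public static final short MODE_EXAMPLE_A = 0x01;    // hypothetical flag
    public static final short MODE_EXAMPLE_B = 0x02;    // hypothetical flag
    public static final short MODE_PARSE_ENTITY = 0x04; // name from this page, value assumed

    /** Combines any number of flags into a single short mode value. */
    public static short combine(short... flags) {
        int mode = 0;
        for (short flag : flags) {
            mode |= flag;
        }
        return (short) mode; // bitwise OR yields int, so narrow explicitly
    }

    /** Returns true if the given flag is set in the mode. */
    public static boolean isMode(short mode, short flag) {
        return (mode & flag) != 0;
    }

    public static void main(String[] args) {
        short mode = combine(MODE_EXAMPLE_A, MODE_PARSE_ENTITY);
        System.out.println(isMode(mode, MODE_PARSE_ENTITY)); // prints "true"
        System.out.println(isMode(mode, MODE_EXAMPLE_B));    // prints "false"
    }
}
```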

XMLParser

public XMLParser(BaseParser owner,
                 java.io.Reader reader,
                 java.lang.String sourceURI)
Constructor for entity parser. Requires a parent parser to be specified and uses that parser's document, DTD, error sink and mode (assuming that Parser.MODE_PARSE_ENTITY is in effect). The severity level is set to Parser.STOP_SEVERITY_FATAL.
Parameters:
owner - The parser which invoked this parser
reader - Any Reader from which entity text can be read
sourceURI - URI of entity source

XMLParser

public XMLParser(java.io.Reader reader,
                 java.lang.String sourceURI)
Parser constructor. The parser operates in the default mode, Parser.MODE_XML_PARSER, with Parser.STOP_SEVERITY_FATAL.
Parameters:
reader - Any Reader from which entity text can be read
sourceURI - URI of entity source
Method Detail

parseDocument

public Document parseDocument()
                       throws SAXException

parseDocument

public Document parseDocument(DTDDocument dtd)
                       throws SAXException
Parses and returns a new XML document. The input stream is assumed to contain a valid document. The parsing behavior depends largely on the mode selected in the constructor. A default DTD may be specified; if the document specifies no internal or external DTD, this DTD is used for entity resolution and for validation.

Depending on the parsing modes, some parsing errors cause an exception to be thrown, while others are stored and can later be retrieved with the BaseParser.getLastException() method. I/O exceptions and runtime exceptions terminate parsing immediately by throwing a FatalSAXException.

Parameters:
dtd - The default DTD to use if one is not specified in the document, or null
Returns:
The parsed XML document
Throws:
SAXException - A parsing error has been encountered and, based on its severity, an exception is thrown to terminate parsing
See Also:
DTDDocument
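
The split between errors that are stored and errors that terminate parsing can be illustrated with the JDK's own parser. This is a sketch of the pattern only, not of XMLParser: it swaps in javax.xml.parsers.DocumentBuilder, and CollectingHandler, its getLastException() helper and the source URI file:///example.xml are all assumptions made for the example. Validity errors are remembered and parsing continues; well-formedness (fatal) errors are rethrown.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class SeveritySketch {
    /** Hypothetical handler: stores recoverable problems, rethrows fatal ones. */
    public static final class CollectingHandler implements ErrorHandler {
        private SAXParseException last;

        @Override public void warning(SAXParseException e) { last = e; }
        @Override public void error(SAXParseException e) { last = e; }
        @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }

        /** Last stored non-fatal problem, or null if none was reported. */
        public SAXParseException getLastException() { return last; }
    }

    /** Parses XML text with DTD validation on, using the JDK parser as a stand-in. */
    public static Document parse(String xml, String sourceURI, CollectingHandler handler)
            throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // report validity errors to the handler
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(handler);
        InputSource source = new InputSource(new StringReader(xml));
        source.setSystemId(sourceURI); // identifies the source in error reports
        return builder.parse(source);
    }

    public static void main(String[] args) throws Exception {
        CollectingHandler handler = new CollectingHandler();
        // Well-formed but invalid: element a is declared EMPTY yet has content.
        String xml = "<!DOCTYPE a [<!ELEMENT a EMPTY>]><a>text</a>";
        Document doc = parse(xml, "file:///example.xml", handler);
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(handler.getLastException() != null);
    }
}
```

Passing a document that is not well-formed, such as `<a><b></a>`, to the same method raises a SAXException instead of returning a document.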

parseDocument

public Document parseDocument(DTDDocument dtd,
                              java.lang.Class docClass)
                       throws SAXException
Parses and returns a new XML document. The input stream is assumed to contain a valid document. The parsing behavior depends largely on the mode selected in the constructor. A default DTD may be specified; if the document specifies no internal or external DTD, this DTD is used for entity resolution and for validation.

Depending on the parsing modes, some parsing errors cause an exception to be thrown, while others are stored and can later be retrieved with the BaseParser.getLastException() method. I/O exceptions and runtime exceptions terminate parsing immediately by throwing a FatalSAXException.

If not null, docClass specifies the class for the created XML document. That class must extend XMLDocument.

Parameters:
dtd - The default DTD to use if one is not specified in the document, or null
docClass - The class for the document object, or null
Returns:
The parsed XML document
Throws:
SAXException - A parsing error has been encountered and, based on its severity, an exception is thrown to terminate parsing
See Also:
DTDDocument, XMLDocument
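
How a requested document class might be checked and instantiated can be sketched with plain reflection. This is an assumption about the general technique, not the library's actual implementation: BaseDocument is a hypothetical stand-in for XMLDocument, InvoiceDocument is a hypothetical subclass, and create is a helper invented for the example.

```java
public class DocClassSketch {
    /** Hypothetical stand-in for the library's XMLDocument base class. */
    public static class BaseDocument { }

    /** Hypothetical application-specific document type. */
    public static class InvoiceDocument extends BaseDocument { }

    /**
     * Creates a document of the requested class, or a plain BaseDocument
     * when docClass is null. Rejects classes that do not extend BaseDocument.
     */
    public static BaseDocument create(Class<?> docClass) {
        if (docClass == null) {
            return new BaseDocument();
        }
        if (!BaseDocument.class.isAssignableFrom(docClass)) {
            throw new IllegalArgumentException(
                docClass.getName() + " does not extend BaseDocument");
        }
        try {
            // Requires an accessible no-argument constructor.
            return (BaseDocument) docClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + docClass.getName(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(create(null).getClass().getSimpleName());                  // prints "BaseDocument"
        System.out.println(create(InvoiceDocument.class).getClass().getSimpleName()); // prints "InvoiceDocument"
    }
}
```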

parseNode

public final Node parseNode(Node node)
                     throws SAXException

parseEntity

public final Entity parseEntity(org.openxml.dom.EntityImpl entity,
                                boolean internal)
                         throws SAXException
Parses the external/internal entity and places its contents underneath the entity node. The entity may be internal or external, as indicated by internal. After the entity has been parsed, the parser is closed and the entity is returned as read-only.

Depending on the parsing modes, some parsing errors cause an exception to be thrown, while others are stored and can later be retrieved with the BaseParser.getLastException() method. I/O exceptions and runtime exceptions terminate parsing immediately by throwing a FatalSAXException.

Parameters:
entity - The entity to parse
internal - True if an internal entity
Returns:
The entity node
Throws:
SAXException - A parsing error has been encountered and, based on its severity, an exception is thrown to terminate parsing

parseNextNode

protected final boolean parseNextNode(int token)
                               throws SAXException,
                                      java.io.IOException
Parses the next node based on the supplied token. This method is called with a token that has already been read, parses a node and appends it to ContentParser._currentNode. If plain text is read, it is accumulated and later converted into a Text node. If the node is an element, the element is created and its full contents are read (recursively).

The return value indicates whether the current element (in ContentParser._currentNode) has been closed with a closing tag (false), or whether parsing should continue at the same level (true). False is also returned if the end of file has been reached.

The following rules govern how tokens are translated into nodes:

The proper way to use this method is:
 _currentNode = ...;
 token = readTokenContent();
 while ( parseNextNode( token ) )
     token = readTokenContent();
 
Parameters:
token - The last token read with ContentParser.readTokenContent()
Returns:
True if continue parsing, false if current element has been closed or reached end of file
Throws:
SAXException - A parsing error has been encountered and, based on its severity, an exception is thrown to terminate parsing
java.io.IOException - An I/O exception has been encountered when reading from the input stream
See Also:
ContentParser.parseAttributes(org.w3c.dom.Element, boolean), ContentParser.readTokenContent(), ContentParser._currentNode, #_orphanClosingTag
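
The return-value contract of parseNextNode can be illustrated with a toy parser over pre-split tokens. This is a minimal sketch under stated assumptions, not the library's code: tokens are plain strings, Elem is a hypothetical node type, a null token stands for end of file, and closing tags are not checked against the name of the open element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class TokenLoopSketch {
    /** Hypothetical element node holding text (String) and child elements. */
    static final class Elem {
        final String name;
        final List<Object> children = new ArrayList<>();
        Elem(String name) { this.name = name; }
        @Override public String toString() {
            StringBuilder sb = new StringBuilder("<" + name + ">");
            for (Object child : children) sb.append(child);
            return sb.append("</").append(name).append(">").toString();
        }
    }

    private final Iterator<String> tokens;
    private Elem currentNode;

    TokenLoopSketch(List<String> tokens, Elem root) {
        this.tokens = tokens.iterator();
        this.currentNode = root;
    }

    /** Returns the next token, or null at end of input. */
    private String readTokenContent() {
        return tokens.hasNext() ? tokens.next() : null;
    }

    /** True to keep parsing at this level; false on a closing tag or end of input. */
    private boolean parseNextNode(String token) {
        if (token == null || token.startsWith("</")) {
            return false;
        }
        if (token.startsWith("<")) {
            Elem child = new Elem(token.substring(1, token.length() - 1));
            currentNode.children.add(child);
            Elem parent = currentNode;
            currentNode = child;                 // descend into the new element
            String t = readTokenContent();
            while (parseNextNode(t)) {           // read the element's full contents
                t = readTokenContent();
            }
            currentNode = parent;                // element closed, resume in parent
            return true;
        }
        currentNode.children.add(token);         // plain text
        return true;
    }

    /** Runs the documented loop over the tokens and returns the resulting tree as text. */
    public static String parse(String... tokens) {
        Elem root = new Elem("root");
        TokenLoopSketch parser = new TokenLoopSketch(Arrays.asList(tokens), root);
        String token = parser.readTokenContent();
        while (parser.parseNextNode(token)) {
            token = parser.readTokenContent();
        }
        return root.toString();
    }

    public static void main(String[] args) {
        // prints "<root><a>hi<b></b></a>tail</root>"
        System.out.println(parse("<a>", "hi", "<b>", "</b>", "</a>", "tail"));
    }
}
```

In this toy version a closing tag at the top level simply ends the loop, which is the situation the real parser treats as an orphan closing tag.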

closingTag

protected final boolean closingTag(java.lang.String name,
                                   boolean keepQuite)
                            throws SAXException
Called to process the closing tag. This method is isolated from parseNextNode(int) and called on two occasions: when the closing tag is met, and when an orphan closing tag has been identified. In the first instance, it is called with keepQuite false, issuing an error if an orphan closing tag is met. In the second instance, the orphan closing tag is processed and so keepQuite is true.
Parameters:
name - The tag name
keepQuite - True if orphan closing tag should not issue an error
Returns:
True if closing tag matches current element in ContentParser._currentNode
Throws:
SAXException - A parsing error has been encountered and, based on its severity, an exception is thrown to terminate parsing

parseDTDSubset

protected final void parseDTDSubset(boolean standalone)
                             throws SAXException,
                                    java.io.IOException
Parses the internal and external DTD subsets. A new DTDDocument is created, the internal subset is parsed into it, followed by the external subset. Errors produced when parsing the DTD are directed to this parser. If there is no internal subset, the external subset is optionally cached in memory. Public identifiers may be converted to URIs, as per the installed HolderFinder.

This method is called after '<!DOCTYPE' has been consumed and returns after the terminating '>' has been read. The standalone flag is passed from the '<?xml' PI processing.

Parameters:
standalone - The standalone flag