Q: | How do I include an external document fragment in my XML document? |
---|---|
A: | The proper way to do this is with an external entity. First, declare the entity in the DTD and point it at the external document fragment. Then reference the entity wherever the external fragment is to be included. <?xml version="1.0"?> <!DOCTYPE root [ <!ENTITY include SYSTEM "entity-uri.xml"> ]> <root> <sub> . . . </sub> &include; </root> Note that the included document fragment is an external entity and must conform to specific rules to be parsed properly.
|
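As a rough illustration of how a parser resolves such a reference, the Python sketch below uses the standard library's `xml.parsers.expat` module. The function name `parse_with_fragments`, the in-memory `fragments` dictionary (standing in for files fetched by system identifier), and the sample content are all assumptions made for this example; a real application would load the fragment from the URI named in the DTD.

```python
from xml.parsers import expat

def parse_with_fragments(document, fragments):
    """Parse `document`, resolving external entities from the `fragments` dict.

    `fragments` maps a system identifier to the entity text. It is a
    hypothetical stand-in for reading the entity from disk or the network.
    """
    events = []
    parser = expat.ParserCreate()
    parser.buffer_text = True  # deliver contiguous text in one callback

    def on_start(name, attrs):
        events.append(("start", name))

    def on_end(name):
        events.append(("end", name))

    def on_text(data):
        if data.strip():  # ignore whitespace-only text
            events.append(("text", data.strip()))

    def on_external_entity(context, base, system_id, public_id):
        # A child parser handles the fragment and inherits the handlers above.
        child = parser.ExternalEntityParserCreate(context)
        child.Parse(fragments[system_id], True)
        return 1  # non-zero tells expat the entity was handled

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.ExternalEntityRefHandler = on_external_entity
    parser.Parse(document, True)
    return events

DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE root [
  <!ENTITY include SYSTEM "entity-uri.xml">
]>
<root>
  <sub>first</sub>
  &include;
</root>
"""

# Sample fragment content, invented for this sketch.
FRAGMENTS = {
    "entity-uri.xml": '<?xml version="1.0" encoding="UTF-8"?>'
                      'Included text<break type="para"/>',
}

events = parse_with_fragments(DOCUMENT, FRAGMENTS)
for event in events:
    print(event)
```

The events from the fragment appear between the end of `sub` and the end of `root`, which is where `&include;` was referenced.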
Q: | Is there a difference between a standard XML document and an external entity? |
A: | An external entity contains a document fragment, not a full document. That is the main difference between it and a proper XML document. Because the external entity is only a document fragment, it cannot have an external DTD reference or an internal DTD subset, and it is not required to have a root element. It is validated under the root element of the larger XML document in which it is included. The external entity is parsed when it is referenced in a proper XML document. For that reason, it must state its character encoding in its own declaration at the top of the entity (the XML specification requires this for any encoding other than UTF-8 or UTF-16). <?xml version="1.0" encoding="ascii"?> This is an external entity <break type="para"/> No DTD, no root element, specific document encoding.
|
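To make the distinction concrete, the following Python sketch checks whether a piece of text is well formed as a standalone document. The helper name `is_well_formed_document` and the variable names are assumptions for this example; the fragment text is the sample from the answer above.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError

def is_well_formed_document(text):
    """Return True if `text` parses as a complete XML document."""
    try:
        xml.dom.minidom.parseString(text)
    except ExpatError:
        return False
    return True

# The external entity from the answer: a declaration, text and an element,
# but no single root element.
FRAGMENT_BODY = 'This is an external entity <break type="para"/>'
ENTITY_TEXT = '<?xml version="1.0" encoding="ascii"?>' + FRAGMENT_BODY

# The same body becomes a document once a root element encloses it,
# which is effectively what happens when the entity is referenced.
WRAPPED = "<root>" + FRAGMENT_BODY + "</root>"

entity_is_document = is_well_formed_document(ENTITY_TEXT)
wrapped_is_document = is_well_formed_document(WRAPPED)
print(entity_is_document, wrapped_is_document)
```

On its own, the entity is rejected because text appears outside any root element; inside the including document the same content is acceptable.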
Q: | What are the family relationships between XML, HTML and XHTML? |
A: | XML and HTML are both descendants and subsets of SGML, the Standard Generalized Markup Language. HTML was born to serve the World Wide Web, employing a small and fairly simple-to-implement subset of the larger SGML. XML followed HTML but was generalized to convey information of any kind. As such, its structure is simpler, more generic and more rigid than that of SGML. XHTML is an expression of HTML as an XML document. An XHTML document can be read by a Web browser and parsed by an XML parser.
|
Q: | What are the distinguishing differences between XML, HTML and XHTML? |
A: | XML is a very generic format, used to express various types of information in varying forms. For that reason it must be very strict in its structure, must support a DTD that changes with each document type, and cannot assume anything about its contents. HTML is defined for a specific purpose and as such can make certain assumptions about its contents. For example, the IMG tag is known to be empty, and a P tag may follow a paragraph instead of enclosing it. Because all HTML documents share a single set of elements with well-known uses, the browser can read badly formatted documents and still display them correctly. HTML: <IMG src="...">Text and image in one paragraph<P> XML: <P><IMG src="..."/>Text and image in one paragraph</P> Most HTML documents are poorly formatted, and although a Web browser is capable of displaying them properly, an XML parser cannot even read them. XHTML, also known as HTML-in-XML, is a cross format that allows documents to be parsed and processed by existing XML applications and displayed properly by Web browsers. HTML: <PRE>Formatted text</PRE><BR> XHTML: <pre xml:space="preserve">Formatted text</pre><br /> Note that an XML document is not necessarily an XHTML document. An XML serializer would generally print the BR element as <br/>, while an XHTML serializer would print it as <br />. The extra space prevents some HTML editors and browsers from rejecting the element.
|
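The difference in strictness can be sketched with Python's standard library, using `html.parser` as a stand-in for a tolerant HTML reader and `xml.etree.ElementTree` as the strict XML parser. The helper names `html_tags` and `parses_as_xml` are assumptions for this example, and `html.parser` is only a lenient tokenizer, not a model of any particular browser.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

class TagCollector(HTMLParser):
    """Record the start tags seen in a piece of HTML markup."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)  # html.parser reports tag names in lowercase

def html_tags(markup):
    """Read `markup` leniently and return the start tags found."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags

def parses_as_xml(markup):
    """Return True if `markup` is well-formed XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True

# The two samples from the answer above.
HTML_SAMPLE = '<IMG src="...">Text and image in one paragraph<P>'
XML_SAMPLE = '<P><IMG src="..."/>Text and image in one paragraph</P>'

print(html_tags(HTML_SAMPLE), parses_as_xml(HTML_SAMPLE))
print(html_tags(XML_SAMPLE), parses_as_xml(XML_SAMPLE))
```

The lenient reader accepts both samples, while the XML parser accepts only the one in which every element is closed.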
Q: | Why can't I see the XML document in the Web browser? |
A: | Most Web browsers are not capable of displaying XML documents at all. They can only display HTML documents, and have no notion of how to display the different elements that make up an XML document. The XML document must be transformed into a format that they are familiar with, such as HTML. If the XML document is already formatted using HTML elements and HTML structure, it must specifically be printed in either HTML or XHTML format. Most Web browsers will fail to recognize the document if it begins with the typical XML declaration.
|
Q: | How do I best print XML documents for viewing in a Web browser? |
A: | The answer depends largely on what the XML document is required to convey. If the XML document is to be downloaded by the browser and read by an applet or browser component (e.g. an RDF resource descriptor), it should be published in the XML format. If the XML document is to be displayed as a Web page, it must be transformed into the language of Web pages, HTML. The browser will not understand your XML information unless it comes inside HTML elements, such as P, H1 and TR. The transformation from XML to HTML is done using XSL stylesheets. It can be done directly in the Web browser, for Web browsers that support XML and XSL. Since most Web browsers do not offer this level of support, it is recommended to perform the transformation on the server side, converting the XML information document into an HTML presentation document before delivering it to the browser. If the XML document already contains HTML elements and structure, or has been converted to such through an XSL transformation, it can be published in one of two formats. Generally XHTML is the recommended format, as it assures portability between Web browsers and XML parsers. However, not all Web browsers and HTML editors will properly digest XHTML documents, in which case HTML is the safer choice.
|
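The server-side step can be sketched as follows. Python's standard library has no XSL processor, so this example swaps in a hand-written transformation with `xml.etree.ElementTree` in place of a stylesheet; it only illustrates the idea of turning an information document into an HTML presentation document before delivery. The `catalog`, `item`, `name` and `price` elements, the `title` attribute, and the function name `catalog_to_html` are all hypothetical.

```python
import xml.etree.ElementTree as ET

def catalog_to_html(xml_text):
    """Convert a (hypothetical) catalog document into an HTML page."""
    source = ET.fromstring(xml_text)

    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = source.get("title", "Catalog")

    # One table row (TR) per item, as a browser expects for tabular data.
    table = ET.SubElement(body, "table")
    for item in source.findall("item"):
        row = ET.SubElement(table, "tr")
        ET.SubElement(row, "td").text = item.findtext("name", default="")
        ET.SubElement(row, "td").text = item.findtext("price", default="")

    # method="html" serializes with HTML rules and omits the XML declaration.
    return ET.tostring(html, encoding="unicode", method="html")

SOURCE = (
    '<catalog title="Books">'
    "<item><name>XML Basics</name><price>10</price></item>"
    "</catalog>"
)

page = catalog_to_html(SOURCE)
print(page)
```

Because the result is serialized as HTML, it does not begin with an XML declaration and can be sent to the browser as an ordinary Web page.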