- From: Jeremy Carroll <jjc@hpl.hp.com>
- Date: Fri, 30 Mar 2007 13:26:05 +0100
- To: GRDDL Working Group <public-grddl-wg@w3.org>
First attempt at more text on validity. In some sense, more is less! The current text says little, but says what is formally required. My text expands on that, in the hope of being more useful, but perhaps drifts into being more confused. I'll follow up with a shorter version (less explanation) and see what people think.

Jeremy

After the following text in http://www.w3.org/TR/2007/WD-grddl-20070302/#txforms

[[
Therefore, it is suggested that GRDDL transformations be written so that they perform all expected pre-processing, including processing of related DTDs, Schemas and namespaces. Such measure can be avoided for documents which do not require such pre-processing to yield an infoset that is faithful. That is, for documents which do not reference XInclude, DTDs, XML Schemas and so on.</p>
]]

I suggest the following:

[[
<p>
To be more specific concerning XML validation: GRDDL-aware agents may use either validating or non-validating XML processors (see section 5.1 of [<a class="norm" href="#XML">XML</a>]), or even some mix of validating and non-validating XML processors. Thus, document authors should avoid reliance on an external <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#dt-doctype">DTD subset</a>. This can be achieved by, for example, following the rules specified in the <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-check-rmd">standalone document</a> validity constraint. If all these rules are followed, then adding
<pre>
<code>standalone="yes"</code>
</pre>
to the XML declaration of such documents may reduce the cost of processing with some GRDDL-aware agents. In practice, for GRDDL, these rules can be applied only in part, depending on knowledge of the licensed GRDDL transforms. The two issues most likely to cause problems are:
</p>
<ul>
<li>
The use of a default attribute to set the namespace. This occurs, for example, with the XHTML DTDs.
It is usually ambiguous to use GRDDL with a document which references an XHTML DTD and does not include an explicit namespace declaration, for example:
<pre>
<html>
</pre>
rather than
<pre>
<html xmlns="http://www.w3.org/1999/xhtml">
</pre>
See tests (@@@TODO, TODO, TODO) which explore this case.
</li>
<li>
The XPath node set is not well-defined for documents including references to external entities, except via the external DTD, and so neither are the rules for GRDDL. In particular, a non-validating XML processor that does not read the external DTD subset, if any, cannot reliably compute GRDDL results when an external entity reference occurs in some part of an XML document relevant to GRDDL processing, e.g. within the value of a <code>rel</code> attribute within an XHTML family document, or within element content corresponding to an XPath text node that is processed as part of a GRDDL transform of the document. Even when a reference to an external entity occurs elsewhere in an XML document, document authors should have no expectation of interoperable GRDDL processing. Permitted behaviour for an XSLT engine, using a non-validating XML processor, is to raise an unrecoverable error in such a situation.
</li>
</ul>
<p>
In summary, document authors, particularly XHTML document authors, wishing their documents to be used with GRDDL, are encouraged:
</p>
<ul>
<li>
To always explicitly include the XHTML namespace in an XHTML document, or an appropriate namespace in an XML document.
</li>
<li>
To avoid use of entity references, except those listed in <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#sec-predefined-ent">section 4.6</a> of [<a class="norm" href="#XML">XML</a>].
</li>
</ul>
]]

--
Hewlett-Packard Limited
registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
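
The two problem cases in the proposed text can be illustrated with a minimal sketch. It assumes Python's standard xml.etree.ElementTree, used here only as one example of a non-validating XML processor that does not fetch the external DTD subset; other processors may behave differently. The sample documents and variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
)

# Case 1: the namespace is left to the DTD's default attribute.
# This parser never reads the external DTD, so the default is not applied.
implicit = DOCTYPE + "<html><head><title>t</title></head></html>"
explicit = (
    DOCTYPE
    + '<html xmlns="%s"><head><title>t</title></head></html>' % XHTML_NS
)
implicit_tag = ET.fromstring(implicit).tag  # no namespace on the root
explicit_tag = ET.fromstring(explicit).tag  # root is in the XHTML namespace

# Case 2: an entity declared only in the external DTD subset.
with_entity = (
    DOCTYPE
    + '<html xmlns="%s"><body><p>a&nbsp;b</p></body></html>' % XHTML_NS
)
try:
    ET.fromstring(with_entity)
    entity_parsed = True
except ET.ParseError:
    # This processor treats the unresolved reference as a fatal error.
    entity_parsed = False

# Predefined entities (section 4.6 of XML) need no DTD at all.
predefined_text = ET.fromstring("<p>a &amp; b</p>").text

print(implicit_tag)     # html
print(explicit_tag)     # {http://www.w3.org/1999/xhtml}html
print(entity_parsed)    # False
print(predefined_text)  # a & b
```

A transform matching on elements in the XHTML namespace would see the second document but not the first, which is the ambiguity described above.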
Received on Friday, 30 March 2007 12:26:24 UTC