- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Fri, 05 Dec 2008 11:28:40 +0000
- To: Ian Hickson <ian@hixie.ch>
- Cc: noah_mendelsohn@us.ibm.com, Boris Zbarsky <bzbarsky@MIT.EDU>, public-html <public-html@w3.org>, www-tag@w3.org
Ian Hickson writes:
> Which document describes the required processing of the document that 
> Boris mentioned to obtain interoperability on the issue he mentioned?
>
> Specifically, what the rendering of this URL should be, and whether there 
> should be any bold text:
>
> data:text/xml,%3C?xml-stylesheet%20href=%22data:text/css,*{font-weight:bold}%22?%3E%3Croot%3Etext%20%3Couter%3Eouter%20%3Cinner%3Einner%3C/outer%3E
So, let's look at this step by step.
I take it the above data: URI corresponds to the following character
stream, to be interpreted as text/xml (whitespace added for readability):
  <?xml-stylesheet href="data:text/css,*{font-weight:bold}"?>
  <root>text <outer>outer <inner>inner</outer>
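As a check on that reading, here is a minimal sketch in Python that decodes the data: URI by hand. The helper name decode_data_uri is hypothetical, and it assumes a non-base64 payload, which is the case here.

```python
from urllib.parse import unquote

# The URI from the question, split across lines only for readability.
DATA_URI = (
    "data:text/xml,%3C?xml-stylesheet%20href=%22data:text/css,"
    "*{font-weight:bold}%22?%3E%3Croot%3Etext%20%3Couter%3Eouter%20"
    "%3Cinner%3Einner%3C/outer%3E"
)

def decode_data_uri(uri):
    """Split a non-base64 data: URI into (media_type, decoded_text).

    Hypothetical helper for illustration; it does not handle ';base64'
    or charset parameters.
    """
    # The first comma ends the header; later commas belong to the payload
    # (here, the one inside the nested data:text/css URI).
    header, _, payload = uri.partition(",")
    media_type = header[len("data:"):]
    return media_type, unquote(payload)

media_type, text = decode_data_uri(DATA_URI)
print(media_type)
print(text)
```

The decoded text is the character stream shown above, without the added whitespace.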
The XML spec. says
 a) Well-formedness Constraint: Start tags must match end tags;
 b) Violations of well-formedness constraints are fatal errors;
 c) Fatal errors and their handling are defined as follows:
   "[Definition: An error which a conforming XML processor MUST
    detect and report to the application. After encountering a fatal
    error, the processor MAY continue processing the data to search
    for further errors and MAY report such errors to the
    application. In order to support correction of errors, the
    processor MAY make unprocessed data from the document (with
    intermingled character data and markup) available to the
    application. Once a fatal error is detected, however, the
    processor MUST NOT continue normal processing (i.e., it MUST NOT
    continue to pass character data and information about the
    document's logical structure to the application in the normal
    way).]"
 d) 'XML processor' and 'application' are terms of art, defined as
    follows:
   "[Definition: A software module called an XML processor is used to
    read XML documents and provide access to their content and
    structure.]
   
   "[Definition: It is assumed that an XML processor is doing its
    work on behalf of another module, called the application.]
   "This specification describes the required behavior of an XML
    processor in terms of how it must read XML data and the
    information it must provide to the application."
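To make the processor/application split concrete, here is a rough sketch using Python's xml.parsers.expat as the XML processor and a plain list of events as a stand-in for the application. The function name run_processor and the event tuples are assumptions made for illustration, not anything the XML spec. defines.

```python
import xml.parsers.expat

DOC = ('<?xml-stylesheet href="data:text/css,*{font-weight:bold}"?>'
       '<root>text <outer>outer <inner>inner</outer>')

def run_processor(text):
    """Feed text to expat and return (events, error).

    'events' is what the processor passed to the application before any
    fatal error; 'error' is the ExpatError that was raised, or None.
    """
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    parser.CharacterDataHandler = lambda data: events.append(("text", data))
    try:
        parser.Parse(text, True)  # True: this is the final chunk of input
    except xml.parsers.expat.ExpatError as err:
        return events, err        # fatal error detected and reported
    return events, None

events, err = run_processor(DOC)
for event in events:
    print(event)
print("fatal error:", err)
```

With this input the application sees the start of root, outer and inner and their character data, and then the error report; nothing is passed on "in the normal way" after the mismatched end tag.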
In the case at hand, I take it from the form of your question that we
are to understand the application to be a browser or other rendering
engine which is conformant to the XML stylesheet and CSS specs.
Both of these specifications reference the XML spec., and use the
phrase "XML document" to describe what they apply to/what their
semantics is.
Fortunately, the XML spec. defines this term as well:
   "[Definition: A data object is an *XML document* if it is
    well-formed, as defined in this specification. In addition, the
    XML document is valid if it meets certain further constraints.]"
Net-net: In the example you offer, we have a data object which is not
well-formed, in terms of the XML specification.  This is a 'fatal
error', and MUST be signalled as such to the application.
Furthermore, as the above makes clear, the data object you offer in
your example is _not_ an XML document.  Neither the CSS spec. nor the
XML stylesheet spec. specifies the behaviour of conforming processors
when given data objects to process which are not XML (or HTML)
documents.
So it appears to me that if you (or Boris) have a grievance, it is
with the CSS and/or the XML Stylesheet specs, not the XML spec.  But I
have to say I don't think you have _much_ of a grievance -- it's not
unusual, or particularly unreasonable, for specs to express
conformance in positive terms: that is, to say that conformance means
"if you get [some kind of input], you behave in [some kind of way]."
What you do in other circumstances is not constrained _for conformance
to the spec. in question_.
That doesn't mean that guidelines for helpful behaviour by _consumers_
of XML, i.e. applications, probably sub-classed by what in another
thread you referred to as conformance classes, might not be a
good idea.  So for sure, as a specification for consumers of XML, the
HTML5 spec. can and probably should say what happens if an XML
processor signals a fatal error and stops "pass[ing] character data
and information about the document's logical structure to the
application in the normal way."
The XML spec. also leaves it open to application specs. to go further.
Because the XML spec. says that after a fatal error
  "the [XML] processor MAY continue processing the data to search for
   further errors and MAY report such errors to the application. In
   order to support correction of errors, the processor MAY make
   unprocessed data from the document (with intermingled character
   data and markup) available to the application",
an application spec. may _require_ conformant implementations to use
only XML processors which _do_ all the things listed above which a
processor MAY do, and go on to specify what kind of behaviour then
ensues.
Other applications may prefer to say "Well-formed XML documents are
handled [like this], anything else is an error and will not be
processed."
Surely both these approaches are reasonable, and therefore the XML
spec. is correct to leave both options open to application designers.
Given its nature as a meta-language, and its consequent place in
implementation stacks, surely to do anything else would have been a
mistake.
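The two application policies described above could be sketched roughly as follows, with xml.etree.ElementTree standing in for the XML processor. Both function names and the returned strings are hypothetical; real application specs. would define their own behaviour.

```python
import xml.etree.ElementTree as ET

DOC = ('<?xml-stylesheet href="data:text/css,*{font-weight:bold}"?>'
       '<root>text <outer>outer <inner>inner</outer>')

def strict_application(text):
    """Policy: well-formed XML documents are handled, anything else is an error."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        return f"error: {err}"
    # Stand-in for normal handling: the document's character data.
    return "".join(root.itertext())

def recovering_application(text):
    """Policy: after a fatal error, fall back to the unprocessed data."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        line, column = err.position
        # The application, not the XML processor, decides what to do next.
        return f"fatal error at line {line}, column {column}; raw source: {text}"
    return "".join(root.itertext())

print(strict_application(DOC))
print(recovering_application(DOC))
```

Either policy still depends on the processor detecting and reporting the fatal error; the two differ only in what the application does afterwards.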
ht
-- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
                         Half-time member of W3C Team
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 651-1426, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
Received on Friday, 5 December 2008 11:30:10 UTC