
[Bug 11057] doctype about:legacy-compat, t

From: <bugzilla@jessica.w3.org>
Date: Thu, 04 Nov 2010 17:53:10 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1PE3zq-0004KL-T0@jessica.w3.org>

--- Comment #7 from David Carlisle <davidc@nag.co.uk> 2010-11-04 17:53:09 UTC ---
(In reply to comment #6)
> I agree with Julian. If this is incompatible with XML,

well it's not incompatible with XML, just needs to be used with care.

specifically I think if the XML parser tries to dereference it, the document
will fail with a fatal error as not well formed.

however an XML parser is allowed not to fetch external entities (and the ones
in browsers typically don't), and if it doesn't fetch them it does not have to
report errors in what it has not seen.
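
As a rough illustration of a parser that never fetches the external subset, here is a minimal sketch using Python's standard-library xml.etree.ElementTree. The choice of Python and the sample document are assumptions made for illustration, not something taken from the bug discussion; the point is only that the SYSTEM id is never dereferenced, so no error is reported.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document using the about:legacy-compat doctype.
DOC = (
    '<!DOCTYPE html SYSTEM "about:legacy-compat">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml"><body/></html>'
)

# ElementTree does not load the external DTD subset,
# so the about: URI is never fetched and parsing succeeds.
root = ET.fromstring(DOC)
print(root.tag)
```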

So it depends why the user is choosing to use the xml syntax:

If the file is just the end result of an xml pipeline then using
about:legacy-compat is OK, but so is more or less any doctype which produces
standards mode in html (even if it's an sgml not an xml dtd).

If on the other hand the user is using the xml syntax because they want
(someone) to be able to use the file as -input- to an xml pipeline then
probably some words of advice ought to be given as most xml parsers (rxp,
xerces, msxml?) would fail to parse such a file out of the box and would need
to be configured (eg with a catalog) to do something safe with the dtd, or not
to resolve it. 

> we should raise a bug
> against the HTML5 spec.

the html5 spec merely says that the about: URI is there to make it easier for
xslt to generate it, and for that limited use it's OK. By implication it is also
saying that xml parsers within browsers will not dereference this SYSTEM id.

> Of course it was also mentioned that this is arguably not a problem, but is
> inconsistent with the text for HTML3.2 and HTML4 doctypes. Would expanding the
> definition of allowed doctypes to include those as well be a reasonable
> resolution to this bug?

there is a kind of logic to that which appeals to me as a mathematician, but
practically speaking I don't really think that we/you should be advising people
to do that.

possibly just add, in the note at the end of section 4 something like...

Also note that when using an XML parser to parse a document that uses the
about:legacy-compat doctype, the parser must be configured not to dereference
this URI (as that will fail and cause a parse error). The XML parsers used by
web browsers are usually configured this way by default, but other XML
processing pipelines may not be.

except that wording (which I just made up now) is probably too long, and also
not technically accurate: for example, in java your parser may dereference the
uri if you have installed a URIresolver that special-cases this and returns
something safe (eg an empty string rather than an error). Basically you need
_something_ to avoid trying to fetch about:legacy-compat, but there are
various layers where that redirection can occur. I think any kind of note that
hints that about: URIs need to be used with care in xml processing pipelines
would be sufficient; it's probably best to avoid the details of exactly what
care is needed, since it's rather dependent on the processing framework being
used.
Received on Thursday, 4 November 2010 17:53:12 UTC
