- From: David Carlisle <davidc@nag.co.uk>
- Date: Thu, 14 Jul 2011 21:08:34 +0100
- To: www-tag@w3.org
> In practice, it leaves open a number of questions, which I think need
> to be addressed:
>
>  1) Why 'should' and not 'must'?
>
>     If ensuring interop is the goal here, surely we want user agents
>     all to just _do_ this. . .

I wasn't involved in this bit, but if you make it a must then an
off-the-shelf conformant xml parser wouldn't be able to parse xhtml in a
conformant way, which might be a bit odd.

>  2) Why not a number of other public identifiers?
>
>     For example, -//W3C//DTD XHTML Basic 1.0//EN
>                  -//W3C//DTD SVG 1.0//EN
>                  -//W3C//DTD SVG 1.1//EN
>                  -//W3C//MathML 1.0//EN

Personally I think that an xhtml related xml parser ought to use the same
entity set for _all_ xml _all_ the time, so that you could finally have a
spec for passing around fragments of xml like

  <span>&nbsp;</span>

without it being not well formed. Failing that, including at least the
standard html declaration

  <!DOCTYPE html>
  <html>...

would be useful.

>  3) What exactly is that list of entities?

The list from the w3c entities spec [1], specifically the htmlmathml
list [2].

>     How would I know if there
>     was a mistake of omission?

(I must compare that with [2] as part of my last call review of html5.)

>  4) What about the _internal_ subset?  Should it be processed
>     (consistent with the catalog story) or not (consistent with what
>     the XML spec. says processors may do, since the external subset is
>     "a special kind of external entity", and non-validating XML
>     processor may stop 'processing' the internal subset once they
>     choose not to read an external entity)?

I think internal subsets should be parsed and the entities within them
defined, and authors encouraged not to use them, for compatibility with
html.

>  5) What if the XML declaration for the document at hand includes
>     "standalone='no'" (or no standalone, which the XML spec. requires
>     to be interpreted as 'no')?

I've managed to ignore standalone for over a decade, so ignoring it
forever wouldn't trouble me :-)

> (Note that as it stands Polyglot [2] does not allow either an XML
> declaration or an internal subset).

It also doesn't allow entity references apart from the predefined xml
ones. This restriction could be dropped if the full entity set were
implied by <!DOCTYPE html>.

> It seems to me the interoperability of existing XHTML toolchains and
> HTML5 user agents is implicated by one or more of the above -- what
> should the TAG say, and to whom?  Should the TAG and the XML
> Processing Model WG work together to define a Processor Profile [3]
> which could be referenced normatively in section 9.2 of the HTML5
> spec.?

> In a follow-up message:

> For sure, I should have included that as well -- either of the fixed
> lists in this section may need updating quite regularly. . .

As Anne commented, we have traditionally strongly resisted adding or
removing any entities, because the story (on the XML side) is so drastic
if your catalog switches in a dtd with a different entity set: a single
undefined character entity renders the entire document not well formed.

The xml entities spec covers all the entities that have been published by
W3C or ISO, and as far as I recall only one new name has been added since
MathML 1 in 1998, so adding names has not traditionally been a regular
occurrence.

The other "fixed list" that you refer to, the list of URIs that trigger
entities, I think should be infinitely extended, as I note above.

David

[1] http://www.w3.org/TR/2010/REC-xml-entity-names-20100401/
[2] http://www.w3.org/2003/entities/2007/htmlmathml-f.ent
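
To illustrate the well-formedness point made in this message, here is a minimal sketch, assuming Python's standard xml.etree.ElementTree stands in for an off-the-shelf XML parser (the thread names no particular parser, and the helper name `parses` is hypothetical). It shows a fragment using a non-predefined entity being rejected, and the same fragment being accepted once an internal subset defines the entity.

```python
import xml.etree.ElementTree as ET


def parses(doc: str) -> bool:
    """Return True if the stock parser accepts doc as well-formed XML."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# A fragment using an entity that is not one of the five predefined XML ones.
fragment = "<span>&nbsp;</span>"

# The same fragment, with the entity declared in an internal subset.
with_subset = '<!DOCTYPE span [<!ENTITY nbsp "&#160;">]>' + fragment

# A fragment using only a predefined XML entity.
predefined_only = "<span>&amp;</span>"

print(parses(fragment))         # undefined entity: rejected
print(parses(with_subset))      # entity declared locally: accepted
print(parses(predefined_only))  # predefined entity: accepted
```

A parser that implied the full entity set for every document would accept the first fragment as well; this sketch only shows the behaviour of a parser with no such built-in set.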
Received on Thursday, 14 July 2011 20:09:09 UTC