- From: <bugzilla@wiggum.w3.org>
- Date: Thu, 15 Mar 2007 18:20:09 +0000
- To: public-qt-comments@w3.org
- CC:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=4372

------- Comment #2 from mike@saxonica.com 2007-03-15 18:20 -------

The relevant rules for XML appear to be:

[12] PubidLiteral ::= '"' PubidChar* '"' | "'" (PubidChar - "'")* "'"
[13] PubidChar ::= #x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]

and I think it's fairly straightforward for us to add a rule to the serialization spec that says it's an error if doctype-public doesn't conform to this syntax.

The more difficult question is what to do about HTML. In principle we could require that the doctype-public is one of the official FPIs appearing in the HTML recommendation, for example "-//W3C//DTD HTML 4.01//EN". However, that would almost certainly break a lot of existing stylesheets, since there's almost certainly a lot of code getting away with undetected typos in such a string. Arguably XSLT processors should tell people when they are generating bad HTML, but I personally don't want to be the one in the firing line on this: although we could have done it earlier, it's a bad candidate for an erratum. Also, it's not future-proof: we don't know what FPIs will be allowed in future versions of HTML.

I think my preference would be that we impose the same rules for HTML as we do for XML - that is, a simple restriction on the permitted character set.
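
To illustrate how small the proposed check is, here is a rough sketch of a character-set test based on production [13]. It is only an illustration, not text from any spec or processor: the function name is_valid_doctype_public is hypothetical, and the sketch assumes the serializer always writes the value in double quotes, so the single-quote exclusion in production [12] never comes into play.

```python
import re

# Character class transcribed from XML 1.0 production [13] (PubidChar):
# space, CR, LF, ASCII letters and digits, and the listed punctuation.
_PUBID_LITERAL_BODY = re.compile(r"[\x20\x0D\x0Aa-zA-Z0-9\-'()+,./:=?;!*#@$_%]*")


def is_valid_doctype_public(value: str) -> bool:
    """Return True if every character of value is a PubidChar.

    Hypothetical helper: assumes the value will be serialized inside
    double quotes, which is always possible because '"' is not a PubidChar.
    """
    return _PUBID_LITERAL_BODY.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("-//W3C//DTD HTML 4.01//EN", '-//W3C//DTD "HTML"//EN'):
        print(repr(candidate), is_valid_doctype_public(candidate))
```

Under this rule a misspelled but well-formed FPI still passes, which is the intended trade-off: the check rejects only characters that cannot legally appear in a public identifier.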
Received on Thursday, 15 March 2007 18:20:28 UTC