[Bug 4372] [Serialization] Lexical checking of doctype-public

http://www.w3.org/Bugs/Public/show_bug.cgi?id=4372

           Summary: [Serialization] Lexical checking of doctype-public
           Product: XPath / XQuery / XSLT
           Version: Recommendation
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Serialization
        AssignedTo: scott_boag@us.ibm.com
        ReportedBy: mike@saxonica.com
         QAContact: public-qt-comments@w3.org


Bjoern Hoehrmann [derhoermi@gmx.net] raised the following point today on
public-qt-comments. I am transferring it here for tracking purposes. Please
ensure that any decisions are relayed to Bjoern!

Dear XSL Working Group,

  In http://www.w3.org/1999/11/REC-xslt-19991116-errata/ E4 XSLT 1.0 processors
are required to generate well-formed XML documents. I think this erratum is
incomplete (the last sentence of the first paragraph in
3.1 would also need to be changed, and arguably also the first one in
16.1) and I do not think processors can implement the requirement. A similar
issue exists in XSLT 2.0 and in "XSLT 2.0 and XQuery 1.0 Serialization".

The reason is that neither version of XSLT requires lexical checking of the
doctype-public parameter: both specify the content model as just "string", but
XML 1.0 places additional restrictions on it. For example,

  <xsl:output
    method="xml"
    version="1.0"
    doctype-system="x"
    doctype-public="-//W3C//DTD&#x9;XHTML 1.0 Transitional//EN"
  />

or

  <xsl:output
    method="xml"
    version="1.0"
    doctype-system="x"
    doctype-public="x&#xf6;y"
  />

would result in ill-formed XML, as neither U+0009 nor U+00F6 is allowed in a
public identifier. In the case of XSLT 1.0 it seems processors are not allowed to
signal an error here, and in the case of XSLT 2.0 it can be argued that
this should result in the generic err:SERE0003 error, but e.g. Saxon 8.7.1J
emits ill-formed XML instead. I think both XSLT 1.0 and XSLT 2.0 should require
doctype-public to be syntactically correct, or failing that, XSLT 1.0's E4
should be modified to allow the processor to signal an error in the cases
above.
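
To illustrate the kind of lexical check being requested, here is a minimal
sketch in Python, assuming the PubidChar production of XML 1.0
(#x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]). The function names
invalid_pubid_chars and check_doctype_public are hypothetical and are not taken
from any XSLT processor; raising an exception merely stands in for whatever
error (for example err:SERE0003) a serializer would choose to signal.

```python
import string

# Characters permitted by the XML 1.0 PubidChar production:
#   #x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]
PUBID_CHARS = frozenset(
    " \r\n" + string.ascii_letters + string.digits + "-'()+,./:=?;!*#@$_%"
)


def invalid_pubid_chars(value):
    """Return, in order, the characters of value that PubidChar does not allow."""
    return [ch for ch in value if ch not in PUBID_CHARS]


def check_doctype_public(value):
    """Return value unchanged if it is a lexically valid public identifier.

    Hypothetical helper: raises ValueError otherwise, listing the offending
    code points so the stylesheet author can find them.
    """
    bad = invalid_pubid_chars(value)
    if bad:
        codes = ", ".join(f"U+{ord(ch):04X}" for ch in bad)
        raise ValueError(
            f"doctype-public contains characters not allowed in a "
            f"public identifier: {codes}"
        )
    return value
```

Applied to the two examples above, invalid_pubid_chars reports the tab (U+0009)
in the first value and the o-umlaut (U+00F6) in the second. Because PubidChar
excludes the double quote, any value that passes the check can be written out
as a double-quoted PubidLiteral.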

regards,
--
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/

Received on Wednesday, 7 March 2007 09:02:29 UTC