- From: <bugzilla@jessica.w3.org>
- Date: Thu, 18 Feb 2016 12:27:05 +0000
- To: public-qt-comments@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=29479

Bug ID: 29479
Summary: [XSLT30] Streaming and non-well-formed documents
Product: XPath / XQuery / XSLT
Version: Candidate Recommendation
Hardware: PC
OS: Windows NT
Status: NEW
Severity: normal
Priority: P2
Component: XSLT 3.0
Assignee: mike@saxonica.com
Reporter: abel.braaksma@xs4all.nl
QA Contact: public-qt-comments@w3.org
Target Milestone: ---

Martin Honnen brought this to my attention in a bug report on Exselt (ECS-12). He quoted a part of the spec:

"A streamed transformation that only accesses part of the input document (for example, a header at the start of a document) is not required to continue reading once the data it needs has been read. This means that XML well-formedness or validity errors occurring in the unread part of the input stream may go undetected."

and gave this example:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        version="3.0"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="xs">

      <xsl:param name="input-uri" as="xs:string" select="'test201602170101.xml'"/>
      <xsl:param name="items-to-copy" as="xs:integer" select="4"/>
      <xsl:variable name="children-to-copy" as="xs:integer" select="$items-to-copy + 1"/>

      <xsl:mode streamable="yes"/>
      <xsl:output indent="yes"/>

      <xsl:template name="xsl:initial-template">
        <xsl:stream href="{$input-uri}">
          <xsl:apply-templates/>
        </xsl:stream>
      </xsl:template>

      <xsl:template match="/*">
        <xsl:copy>
          <xsl:iterate select="*">
            <xsl:copy-of select="."/>
            <xsl:if test="position() eq $children-to-copy">
              <xsl:break/>
            </xsl:if>
          </xsl:iterate>
        </xsl:copy>
      </xsl:template>

    </xsl:stylesheet>

with the following input:

    <root>
      <header>...</header>
      <item name="1">...</item>
      <item name="2">...</item>
      <item name="3">...</item>
      <item name="4">...</item>
      <item>
    </root>

This input is deliberately not well-formed: the last <item> is never closed. He ran the example with Saxon as well, which threw no error.
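The effect described in the quoted spec text is not specific to XSLT. As a rough illustration only, the sketch below uses Python's standard `xml.etree.ElementTree` (an assumption made for this example; it says nothing about how Saxon or Exselt work internally). The helper names `copy_first_children` and `read_to_end` are hypothetical. A consumer that stops pulling parse events after the fifth child of the root never sees the well-formedness error, while a consumer that reads to the end does.

```python
import io
import xml.etree.ElementTree as ET

# The deliberately non-well-formed input from the report:
# the last <item> is never closed.
BROKEN = (
    '<root> <header>...</header> '
    '<item name="1">...</item> <item name="2">...</item> '
    '<item name="3">...</item> <item name="4">...</item> '
    '<item> </root>'
)


def copy_first_children(xml_text, limit):
    """Collect the names of the first `limit` children of the root,
    then stop consuming parse events (hypothetical early exit)."""
    names = []
    depth = 0
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            depth += 1
        else:
            depth -= 1
            if depth == 1:  # a child of the root element has just closed
                names.append(elem.tag)
                if len(names) == limit:
                    # Stop pulling events: an error located after this
                    # point is never reported to the caller.
                    break
    return names


def read_to_end(xml_text):
    """Parse the whole document; return True only if it is well-formed."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # Header plus four items, matching $children-to-copy in the stylesheet.
    print(copy_first_children(BROKEN, 5))
    print(read_to_end(BROKEN))
```

Whether the error is reported depends entirely on how far the consumer chooses to read, which is the interoperability concern raised in this report.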
My product threw a rather unclear internal error, which is clearly a bug. However, this shows a peculiar situation that may arise with non-well-formed documents. I would challenge that in this case the error can be ignored, because the xsl:copy is shallow-copying the <root> element. To complete that copy it needs to read through to the end. If the template were written differently, this error might not need to arise:

    <xsl:template match="/*">
      <xsl:element name="{name()}">
        <xsl:iterate....>
      </xsl:element>
    </xsl:template>

But even then, whether or not an error is raised will be entirely implementation-dependent.

I am wondering if we can make this more interoperable, for instance by requiring an option to at least read through to the end. This will not always be feasible, hence it must be a user option, but one that a processor *must* support.

Conversely, how much a processor looks ahead before it "breaks" further processing is implementation-defined (recall that <xsl:break> is not a real break: it just skips over the next items, which does not mean that these items should not be processed). I wonder if we could be more prescriptive about where and when a processor is really allowed to skip further processing of a document.

The main use case for adding the spec text quoted above is when a user is interested only in a certain leaf node, or in the existence of one, and further processing is not needed. The problem is: can we define when "further processing is not needed"?

--
You are receiving this mail because:
You are the QA Contact for the bug.
Received on Thursday, 18 February 2016 12:27:08 UTC