- From: Grosso, Paul <pgrosso@ptc.com>
- Date: Thu, 11 Jun 2009 10:50:12 -0400
- To: <public-xml-core-wg@w3.org>
I see there are some things I missed, so it's good that we're having
some more discussion. Comments below.

> -----Original Message-----
> From: Simon Pieters [mailto:simonp@opera.com]
> Sent: Thursday, 2009 June 11 3:27
> To: Grosso, Paul; public-xml-core-wg@w3.org
> Subject: Re: xml-stylesheet issues--suggested resolutions
>
> On Wed, 10 Jun 2009 16:28:57 +0200, Grosso, Paul
> <pgrosso@ptc.com> wrote:
>
> >> > * What happens when the PI is XML 1.0-well-formed but doesn't
> >> > follow the xml-stylesheet syntax?
> >>
> >> > * What happens when there are duplicate pseudo-attributes?
> >> > (This seems to actually be allowed in the syntax.)
> >
> > I suggest:
> >
> > This is an error; the xml-stylesheet processor MAY ignore the
> > entire PI; if it tries to recover, it SHOULD ignore all but the
> > last assignment to a given pseudo-attribute.
> >
> > This is what Arbortext currently does, and if we change the spec
> > to say "MUST ignore the entire PI", and we change our code to be
> > compliant, some user documents will suddenly stop working. If we
> > don't change our code, then we would be non-compliant which looks
> > bad both for Arbortext and the AssocSS spec (because it generally
> > looks bad for a spec when implementors ignore it).
> >
> > In fact, from an XML Core point of view, I'm less worried about
> > what Arbortext does than what the "major browser vendors" do.
> > If we start changing the AssocSS spec to make current behavior
> > completely non-compliant, I'm quite sure there will be cases
> > (such as this one in Arbortext's case) where they will decide
> > they just can't invalidate existing documents, so they will
> > ignore the spec. I'd rather not be in the position of setting
> > ourselves up to be ignored.
>
> I doubt there is enough legacy content with invalid
> xml-stylesheet PIs to make browser vendors ignore the spec.
> I say this because there are surprisingly few bugs reported on
> Opera for our Draconian handling of invalid xml-stylesheet PIs.
>
> There's one bug that cites this test case:
>
> http://home.arcor.de/martin.honnen/operaBugs/op9/XML/ampersandInPI2.xml
>
> The bug says that Opera is wrong in aborting parsing. I think
> Firefox, Safari and IE ignore the PI here.

This is a reasonable discussion to have, and I'd like to hear
what others think. My basic concern remains--if we are too strict,
we risk being ignored. If we can convince ourselves--or get
assurances from implementors--that we won't get ignored, then we
can perhaps be stricter.

> We have a much bigger problem with draconian error handling in XML
> proper in general than with xml-stylesheet. So from our perspective,
> defining error recovery for XML 1.0 and Namespaces in XML 1.0 is a
> higher priority.
>
> > In this duplicate pseudo-attribute case, I could live with
> > tightening my above suggestion to "...processor SHOULD ignore..."
> > because at least that way Arbortext could say "yes, we should,
> > but due to legacy issues, we decided instead to recover" and
> > still not be non-compliant with the spec.
>
> Is the legacy situation for Arbortext so bad that people rely
> on its error recovery behavior?

Probably not. I was mostly using this as an example.

> >> > * What happens when a CharRef hits the [WFC: Legal Character]
> >> > constraint in XML 1.0? (Unclear to me whether this is allowed
> >> > in the syntax.)
> >>
> >> Syntax error: must ignore the entire PI. We should tighten up
> >> the syntax so that duplicate pseudo-attributes and NCRs that are
> >> syntax errors in XML 1.0 are also syntax errors in xml-stylesheet.
> >
> > As I said before:
> >
> > As far as I can tell, the XML Rec says:
> >
> > [16] PI ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'
> >
> > and Char is "any unicode character..." so I don't see how there
> > could be a CharRef in a PI.
>
> Not on the XML 1.0 layer, but on the xml-stylesheet layer.
>
> [3] PseudoAttValue ::= ('"' ([^"<&] | CharRef | PredefEntityRef)* '"'
>                        | "'" ([^'<&] | CharRef | PredefEntityRef)* "'")
>                        - (Char* '?>' Char*)
>
> http://www.w3.org/TR/xml-stylesheet/#NT-PseudoAttValue

Interesting--I missed that. So the PI is valid, but when it is
parsed as an xml-stylesheet PI, the value of the pseudoattribute
is discovered to have what is considered a charref to an illegal
character. I'd probably say we should treat that as an invalid
value for the pseudoattribute--however we decide to handle that
(more below on that).

> >> > * What happens when there are unknown values?
> >
> > In general, I don't see why we have to say anything about the
> > values of attributes (except for 'alternate'). The original
> > idea behind the Assoc SS spec was to define how to map the
> > xml-stylesheet PI into the equivalent HTML 4.0 constructs, and
> > then let the semantics be driven by HTML 4.0, and I see no
> > reason to change that.

Another concern I have is that, regardless of the details of
what we say about invalid values, I don't want to require any
implementation to verify values for attributes it's going to
ignore. So an invalid/unknown value for a pseudoattribute
should never require that the entire PI be ignored, because a
given implementation might not even be looking at the value of
that pseudoattribute.

> The reason is that the HTML4 spec does a poor job at specifying the
> semantics and requirements.
>
> Maybe we could cite HTML5 instead, though?

We could, or we could live with HTML4's poor semantics.
Most of the world does.

> >> Unexpected value for 'type': must either abort processing the PI or
> >> continue as if type was absent.
> >
> > The Assoc SS spec currently requires the type attribute (so
> > continuing as if it were absent is equivalent to aborting, since
> > it isn't allowed to be absent).
>
> http://www.w3.org/1999/06/REC-xml-stylesheet-19990629/errata

Oops, I missed that. So it looks like nine years ago we'd already
come to the conclusion that I came to once again. (Clearly, my
memory doesn't extend that far back.)

> >> We should probably say that 'text/xsl' is to be treated the same
> >> as 'text/xml' for the purposes of 'type' (for compat with
> >> existing content).
> >
> > I don't see why we have to say anything about the values of
> > the type attribute.
>
> I guess it could be argued that this is something for the
> XSLT spec to worry about.
>
> >> Invalid IRI in 'href': interpret the value using the rules in "Web
> >> Addresses" (currently called "URLs" and specified here:
> >> http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#parsing-urls
> >> ). If that returns an error: must ignore the entire PI.
> >
> > Likewise per my previous paragraph. We don't have to say anything
> > about how to handle the value of the href attribute.
> >
> >> 'media': refer to the Media Queries spec. If it's an invalid
> >> media query, must ignore the entire PI.
> >
> > Again, I don't think we should say anything about how to handle
> > the value of the media attribute.
>
> If we refer to the HTML4 spec here, then we're not being
> helpful. If we refer to the HTML5 spec, then it's fine.

We could refer to HTML5, or we could decide not to "be helpful".
I don't say that facetiously. We could decide that the AssocSS
spec merely defines how to map the xml-stylesheet PI into values
for certain things like href, type, media, etc., and say that
the interpretation of those values is left to other specs.

On the other hand, I have no objections if we decide we want
to specify which spec to reference for the interpretation of
the various values.

I need to hear from the rest of the WG about this.

> >> > * Is it conforming for a document to have an xml-stylesheet PI
> >> > anywhere other than in the prologue?
> >> > Is it used or ignored?
> >>
> >> Misplaced xml-stylesheet PI: must ignore the entire PI.
> >> Documents must not use misplaced xml-stylesheet PIs.
> >
> > I agree, but there is more to this.
> >
> > The Assoc SS spec says:
> >
> > The xml-stylesheet processing instruction is allowed only in
> > the prolog of an XML document.
> >
> > So an xml-stylesheet processor should ignore any PI not in
> > the prolog.
>
> That doesn't follow. The spec needs to require that separately.

I would read the spec as saying that any PI that looks like
an xml-stylesheet PI but can't be (because it isn't in the
prolog), isn't an xml-stylesheet PI, so it gets ignored by
the xml-stylesheet PI processor.

But I don't mind saying that explicitly.

> > I would add that an xml-stylesheet processor should ignore any PI
> > that is not physically in the document entity.
>
> Yes.
>
> > I would also add that, in the case of multiple xml-stylesheet PIs
> > for the same media, the xml-stylesheet processor should ignore all
> > but the last in document order.
>
> Why? Browsers support multiple for CSS. XSLT requires multiple to be
> supported, too, although browsers generally use either the
> first or last for XSLT.

I guess I'm not sure what it means to have multiple xml-stylesheet
PIs with the same media value, but if it does make sense, I'm okay
with that at the xml-stylesheet processor level, as long as we don't
require an editor/browser/composition application to necessarily
know how to handle multiple ones.

paul
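
For illustration only, the recovery rule suggested earlier in this
message (the processor MAY ignore the entire PI on a syntax error and,
if it recovers from duplicates, SHOULD keep only the last assignment to
a given pseudo-attribute) might be sketched as below. This is a minimal
sketch under stated assumptions, not what any implementation mentioned
here actually does: the function name parse_pseudo_attributes is
hypothetical, pseudo-attribute names are restricted to ASCII, and
CharRef / PredefEntityRef handling is left out entirely.

```python
import re

# Simplified pseudo-attribute pattern: name="value" or name='value'.
# Assumptions (not from the spec text quoted above): ASCII-only names,
# no CharRef/PredefEntityRef expansion or validation, and whitespace
# between consecutive pseudo-attributes is not enforced.
_PSEUDO_ATT = re.compile(r"""\s*([A-Za-z_][\w.-]*)\s*=\s*("[^"<]*"|'[^'<]*')""")


def parse_pseudo_attributes(pi_data):
    """Return a dict of pseudo-attributes, or None if the data is malformed.

    None models "ignore the entire PI". Duplicate pseudo-attributes are
    recovered from by letting the last assignment win.
    """
    attrs = {}
    pos = 0
    while pos < len(pi_data):
        # Only trailing whitespace left: parsing is complete.
        if pi_data[pos:].strip() == "":
            break
        m = _PSEUDO_ATT.match(pi_data, pos)
        if m is None:
            return None  # syntax error: caller ignores the whole PI
        # Strip the surrounding quotes; a later duplicate overwrites.
        attrs[m.group(1)] = m.group(2)[1:-1]
        pos = m.end()
    return attrs
```

With this sketch, the data `href="a.css" type="text/css" href="b.css"`
yields href set to b.css, while unquoted data such as `href=a.css`
returns None and the PI would be ignored.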
Received on Thursday, 11 June 2009 14:55:13 UTC