
Re: xml-stylesheet issues--suggested resolutions

From: Simon Pieters <simonp@opera.com>
Date: Thu, 11 Jun 2009 10:27:03 +0200
To: "Grosso, Paul" <pgrosso@ptc.com>, public-xml-core-wg@w3.org
Message-ID: <op.uvcprdqoidj3kv@simon-pieterss-macbook.local>
On Wed, 10 Jun 2009 16:28:57 +0200, Grosso, Paul <pgrosso@ptc.com> wrote:

>> > * What happens when the PI is XML 1.0-well-formed but doesn't follow the
>> > xml-stylesheet syntax?
>> > * What happens when there are duplicate pseudo-attributes? (This seems
>> > to actually be allowed in the syntax.)
> I suggest:
>  This is an error; the xml-stylesheet processor MAY ignore the
>  entire PI; if it tries to recover, it SHOULD ignore all but the
>  last assignment to a given pseudo-attribute.
> This is what Arbortext currently does, and if we change the spec
> to say "MUST ignore the entire PI", and we change our code to be
> compliant, some user documents will suddenly stop working.  If we
> don't change our code, then we would be non-compliant which looks
> bad both for Arbortext and the AssocSS spec (because it generally
> looks bad for a spec when implementors ignore it).
> In fact, from an XML Core point of view, I'm less worried about
> what Arbortext does than what the "major browser vendors" do.
> If we start changing the AssocSS spec to make current behavior
> completely non-compliant, I'm quite sure there will be cases
> (such as this one in Arbortext's case) where they will decide
> they just can't invalidate existing documents, so they will
> ignore the spec.  I'd rather not be in the position of setting
> ourselves up to be ignored.

I doubt there is enough legacy content with invalid xml-stylesheet PIs to
make browser vendors ignore the spec. I say this because surprisingly few
bugs have been reported against Opera for our draconian handling of
invalid xml-stylesheet PIs.

There's one bug that cites this test case:


The bug says that Opera is wrong in aborting parsing. I think Firefox,  
Safari and IE ignore the PI here.

Draconian error handling causes us much bigger problems in XML proper
than in xml-stylesheet. So from our perspective, defining error recovery
for XML 1.0 and Namespaces in XML 1.0 is a higher priority.

> In this duplicate pseudo-attribute case, I could live with
> tightening my above suggestion to "...processor SHOULD ignore..."
> because at least that way Arbortext could say "yes, we should,
> but due to legacy issues, we decided instead to recover" and
> still not be non-compliant with the spec.

Is the legacy situation for Arbortext so bad that people rely on its error
recovery behavior?
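
To make the two options concrete, here is a rough sketch of how a processor might implement either policy for duplicate pseudo-attributes. The function name, the `recover` flag and the regular expression are hypothetical and deliberately simplified: values containing references are rejected, and whitespace between pseudo-attributes is not enforced. It is not what Opera or Arbortext actually does.

```python
import re

# Hypothetical, simplified pattern: name="value" or name='value'.
# Values containing '<' or '&' are rejected outright in this sketch.
_PSEUDO_ATT = re.compile(r"""\s*([A-Za-z_][\w.-]*)\s*=\s*("[^"<&]*"|'[^'<&]*')""")


def parse_pseudo_attrs(data, recover=False):
    """Return a dict of pseudo-attributes, or None if the PI is to be ignored.

    recover=False models "MUST ignore the entire PI" on a duplicate.
    recover=True models "ignore all but the last assignment".
    """
    attrs = {}
    pos = 0
    while pos < len(data):
        if data[pos:].strip() == "":
            break  # only trailing whitespace is left
        m = _PSEUDO_ATT.match(data, pos)
        if m is None:
            return None  # does not follow the syntax: ignore the entire PI
        name, value = m.group(1), m.group(2)[1:-1]
        if name in attrs and not recover:
            return None  # strict policy: a duplicate invalidates the PI
        attrs[name] = value  # recovery policy: the last assignment wins
        pos = m.end()
    return attrs
```

With `recover=False` a PI such as `href="a.css" href="b.css"` yields `None`; with `recover=True` it yields a single `href` whose value is `b.css`.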

>> > * What happens when a CharRef hits the [WFC: Legal Character] constraint
>> > in XML 1.0? (Unclear to me whether this is allowed in the syntax.)
>> Syntax error: must ignore the entire PI. We should tighten up the syntax
>> so that duplicate pseudo-attributes and NCRs that are syntax errors in XML
>> 1.0 are also syntax errors in xml-stylesheet.
> As I said before:
> As far as I can tell, the XML Rec says:
> [16] PI ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'
> and Char is "any unicode character..." so I don't see how there
> could be a CharRef in a PI.

Not on the XML 1.0 layer, but on the xml-stylesheet layer.

[3]   PseudoAttValue ::= ('"' ([^"<&] | CharRef | PredefEntityRef)* '"'
                        | "'" ([^'<&] | CharRef | PredefEntityRef)* "'")
                        - (Char* '?>' Char*)
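
As an illustration of where the Legal Character question arises, the following sketch expands CharRef and PredefEntityRef in the text between the quotes of a pseudo-attribute value. It assumes the "syntax error: ignore the entire PI" reading suggested above, which is still an open question, and signals that case by returning `None`. All names are hypothetical.

```python
import re

# CharRef (decimal or hex) or one of the five predefined entity references.
_REF = re.compile(r"&(?:#([0-9]+)|#x([0-9a-fA-F]+)|(amp|lt|gt|quot|apos));")
_PREDEFINED = {"amp": "&", "lt": "<", "gt": ">", "quot": '"', "apos": "'"}


def is_xml_char(cp):
    """True if the code point matches the XML 1.0 Char production."""
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def decode_pseudo_att_value(raw):
    """Expand references in a pseudo-attribute value (quotes already removed).

    Returns the decoded string, or None on a syntax error. Treating an
    illegal CharRef such as &#0; as a syntax error is an assumption here.
    """
    out = []
    pos = 0
    while pos < len(raw):
        ch = raw[pos]
        if ch == "&":
            m = _REF.match(raw, pos)
            if m is None:
                return None  # bare '&' or unknown entity name
            dec, hexa, name = m.groups()
            if name:
                out.append(_PREDEFINED[name])
            else:
                cp = int(dec, 10) if dec else int(hexa, 16)
                if not is_xml_char(cp):
                    return None  # violates the Legal Character constraint
                out.append(chr(cp))
            pos = m.end()
        elif ch == "<":
            return None  # '<' is excluded by the production
        else:
            out.append(ch)
            pos += 1
    return "".join(out)
```

The point is that `&#0;` passes the XML 1.0 PI production untouched, since at that layer it is just five ordinary characters, and only becomes a problem once the xml-stylesheet layer tries to expand it.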

>> > * What happens when there are unknown values?
> In general, I don't see why we have to say anything about the
> values of attributes (except for 'alternate').  The original
> idea behind the Assoc SS spec was to define how to map the
> xml-stylesheet PI into the equivalent HTML 4.0 constructs, and
> then let the semantics be driven by HTML 4.0, and I see no
> reason to change that.

The reason is that the HTML4 spec does a poor job of specifying the
semantics and requirements.

Maybe we could cite HTML5 instead, though?

>> Unexpected value for 'type': must either abort processing the PI or
>> continue as if type was absent.
> The Assoc SS spec currently requires the type attribute (so continuing
> as if it were absent is equivalent to aborting, since it isn't allowed
> to be absent).


>> We should probably say that 'text/xsl' is
>> to be treated the same as 'text/xml' for the purposes of 'type' (for
>> compat with existing content).
> I don't see why we have to say anything about the values of
> the type attribute.

I guess it could be argued that this is something for the XSLT spec to  
worry about.

>> Invalid IRI in 'href': interpret the value using the rules in "Web
>> Addresses" (currently called "URLs" and specified here:
>> http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#parsing-urls
>> ). If that returns an error: must ignore the entire PI.
> Likewise per my previous paragraph.  We don't have to say anything
> about how to handle the value of the href attribute.
>> 'media': refer to the Media Queries spec. If it's an invalid
>> media query, must ignore the entire PI.
> Again, I don't think we should say anything about how to handle
> the value of the media attribute.

If we refer to the HTML4 spec here, then we're not being helpful. If we  
refer to the HTML5 spec, then it's fine.

>> >  * Is it conforming for a document to have an xml-stylesheet PI anywhere
>> > other than in the prologue? Is it used or ignored?
>> Misplaced xml-stylesheet PI: must ignore the entire PI.
>> Documents must not use misplaced xml-stylesheet PIs.
> I agree, but there is more to this.
> The Assoc SS spec says:
>  The xml-stylesheet processing instruction is allowed only in
>  the prolog of an XML document.
> So an xml-stylesheet processor should ignore any PI not in the prolog.

That doesn't follow. The spec needs to require that separately.
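
For illustration, a processor-side rule of "only look at PIs in the prolog" could be sketched as below, using Python's `xml.dom.minidom`. The function name is hypothetical. Note that this is a separate, explicit check in the processor: nothing in the parser rejects or drops a misplaced PI on its own. The sketch also cannot tell whether a PI came from an external entity.

```python
from xml.dom import minidom, Node


def prolog_stylesheet_pis(xml_text):
    """Return the data of each xml-stylesheet PI in the prolog, in document order.

    PIs inside or after the document element are skipped.
    """
    doc = minidom.parseString(xml_text)
    found = []
    for node in doc.childNodes:
        if node is doc.documentElement:
            break  # the prolog ends where the document element starts
        if (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            found.append(node.data)
    return found
```

Given a document with one xml-stylesheet PI before the root element, one inside it and one after it, only the first is returned.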

> I would add that an xml-stylesheet processor should ignore any PI
> that is not physically in the document entity.


> I would also add that, in the case of multiple xml-stylesheet PIs
> for the same media, the xml-stylesheet processor should ignore all
> but the last in document order.

Why? Browsers support multiple xml-stylesheet PIs for CSS. XSLT requires
multiple PIs to be supported, too, although browsers generally use either
the first or the last one for XSLT.

Simon Pieters
Opera Software
Received on Thursday, 11 June 2009 08:27:46 UTC
