RE: xml-stylesheet issues--suggested resolutions from Grosso, Paul on 2009-06-10 (public-xml-core-wg@w3.org from June 2009)

From: Grosso, Paul <pgrosso@ptc.com>
Date: Wed, 10 Jun 2009 10:28:57 -0400
To: <public-xml-core-wg@w3.org>
Message-ID: <CF83BAA719FD2C439D25CBB1C9D1D3020FEA420E@HQ-MAIL4.ptcnet.ptc.com>
Simon,

Thanks for restarting this discussion.  I've embedded some
comments (repeating some I've made before, but I thought
it would help the discussion).

> -----Original Message-----
> From: Simon Pieters [mailto:simonp@opera.com] 
> Sent: Wednesday, 2009 June 10 7:00
> To: Grosso, Paul; public-xml-core-wg@w3.org
> Subject: Re: xml-stylesheet issues--suggested resolutions
> 
> On Sat, 18 Apr 2009 12:48:35 +0200, Simon Pieters 
> <simonp@opera.com> wrote:
> 
> >> In most cases, I'm tempted to say "is an error; the
> >> xml-stylesheet processor MAY ignore the entire PI; if
> >> it tries to recover, it SHOULD xxxx."  Thoughts?
> >
> > I would prefer if for different errors it was either "is an 
> error: MUST  
> > ignore the entire PI" or "is an error: MUST recover as 
> follows: xxxx".
> 
> My preferences for which of those to follow for different 
> errors are as follows:
> 
> 
> On Tue, 17 Feb 2009 17:38:22 +0100, Simon Pieters 
> <simonp@opera.com> wrote:
> 
> > * What happens when the PI is XML 1.0-well-formed but 
> doesn't follow the  
> > xml-stylesheet syntax?
> 
> > * What happens when there are duplicate pseudo-attributes? 
> (This seems  
> > to actually be allowed in the syntax.)

I suggest:

 This is an error; the xml-stylesheet processor MAY ignore the
 entire PI; if it tries to recover, it SHOULD ignore all but the
 last assignment to a given pseudo-attribute.

This is what Arbortext currently does, and if we change the spec
to say "MUST ignore the entire PI", and we change our code to be
compliant, some user documents will suddenly stop working.  If we
don't change our code, then we would be non-compliant which looks
bad both for Arbortext and the AssocSS spec (because it generally
looks bad for a spec when implementors ignore it).

In fact, from an XML Core point of view, I'm less worried about 
what Arbortext does than what the "major browser vendors" do.
If we start changing the AssocSS spec to make current behavior
completely non-compliant, I'm quite sure there will be cases
(such as this one in Arbortext's case) where they will decide
they just can't invalidate existing documents, so they will 
ignore the spec.  I'd rather not be in the position of setting
ourselves up to be ignored.

In this duplicate pseudo-attribute case, I could live with 
tightening my above suggestion to "...processor SHOULD ignore..."
because at least that way Arbortext could say "yes, we should,
but due to legacy issues, we decided instead to recover" and
still not be non-compliant with the spec.
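
As a rough illustration of the "ignore all but the last assignment" recovery
discussed above, here is a minimal Python sketch. The function name, the
regular expression, and the ASCII-only name pattern are assumptions made for
this example; none of them comes from the AssocSS spec or from any
implementation mentioned in this thread.

```python
import re

# Hypothetical, simplified pattern: name="value" or name='value'.
# Pseudo-attribute names are restricted to ASCII here for brevity.
PSEUDO_ATTR = re.compile(r"""([A-Za-z_][\w.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def parse_pseudo_attrs(pi_data):
    """Return (attrs, duplicates); for a repeated name the last assignment wins."""
    attrs = {}
    duplicates = []
    for match in PSEUDO_ATTR.finditer(pi_data):
        name = match.group(1)
        # Exactly one of the two value groups is set, depending on the quote used.
        value = match.group(2) if match.group(2) is not None else match.group(3)
        if name in attrs and name not in duplicates:
            duplicates.append(name)
        attrs[name] = value  # a later assignment overwrites an earlier one
    return attrs, duplicates


attrs, dups = parse_pseudo_attrs('href="a.css" type="text/css" href="b.css"')
```

A processor that prefers to ignore the entire PI could instead check whether
the returned list of duplicates is non-empty and drop the PI in that case.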

> >
> > * What happens when a CharRef hits the [WFC: Legal 
> Character] constraint  
> > in XML 1.0? (Unclear to me whether this is allowed in the syntax.)
> 
> Syntax error: must ignore the entire PI. We should tighten up 
> the syntax  
> so that duplicate pseudo-attributes and NCRs that are syntax 
> errors in XML  
> 1.0 are also syntax errors in xml-stylesheet.
> 

As I said before:

As far as I can tell, the XML Rec says:

[16] PI ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'

and Char is "any Unicode character..." so I don't see how there
could be a CharRef in a PI.
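
One way to check this reading of production [16] is to look at what an XML
parser reports for a PI whose data contains something that looks like a
character reference. The sketch below uses Python's standard
xml.dom.minidom; the document string is made up for the example. It only
shows what the XML parser passes along, not what an xml-stylesheet processor
might later do with the pseudo-attribute value.

```python
from xml.dom import minidom

# Made-up test document: the PI data contains the text "&#x62;".
doc = minidom.parseString('<?xml-stylesheet href="a&#x62;.css"?><doc/>')

# The PI precedes the root element, so it is the first child of the Document.
pi = doc.firstChild

# The parser reports the PI data as plain characters; "&#x62;" is not
# expanded to "b" at the XML level.
print(pi.target)
print(pi.data)
```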

> 
> > * What happens when there are unknown pseudo-attributes?
> 
> Must recover by ignoring unknown pseudo-attributes.
> 
> 
> > * What happens when there are unknown values?
> 

In general, I don't see why we have to say anything about the 
values of attributes (except for 'alternate').  The original
idea behind the Assoc SS spec was to define how to map the 
xml-stylesheet PI into the equivalent HTML 4.0 constructs, and 
then let the semantics be driven by HTML 4.0, and I see no 
reason to change that.  Once the value of a given pseudo-attribute
has been determined, I suggest that the Assoc SS spec need say 
nothing further (except for 'alternate').

> > * Browsers support type="text/xsl" but text/xsl is not a 
> registered  
> > media type and is not an XML media type per RFC 3023.
> 
> > * media='' references HTML4 which is outdated; browsers use 
> the Media  
> > Queries spec here.
> 
> Invalid value for 'alternate': must recover by acting as if 
> the value was 'no'.

Agreed.
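
The agreed recovery for 'alternate' can be sketched in a few lines of
Python. The function name is hypothetical, and the case-sensitive comparison
is an assumption of this sketch, not something settled in this thread.

```python
def effective_alternate(value):
    """Map a raw 'alternate' value to 'yes' or 'no'.

    None stands for an absent pseudo-attribute. Anything other than the
    exact string 'yes' is treated as 'no', which covers both the default
    and the recovery for invalid values.
    """
    return "yes" if value == "yes" else "no"


print(effective_alternate("yes"))
print(effective_alternate("maybe"))
```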

> 
> Unexpected value for 'type': must either abort processing the PI or  
> continue as if type was absent.

The Assoc SS spec currently requires the type attribute (so continuing
as if it were absent is equivalent to aborting, since it isn't allowed
to be absent).

In my previous email, I said I'm not sure why the spec required 
the type pseudo-attribute.  It doesn't seem to be required by 
the HTML spec, and Arbortext doesn't make use of it.  Perhaps
we should not require it--or at least, not require that
a processor ignore the whole PI if it is missing.  After
all, the processor can use the href value to find the
resource whose type might be obvious by inspection.  Should
we change things so that it isn't required?

> We should probably say that 'text/xsl' is  
> to be treated the same as 'text/xml' for the purposes of 'type' (for  
> compat with existing content).

I don't see why we have to say anything about the values of
the type attribute. 

> 
> Invalid IRI in 'href': interpret the value using the rules in "Web  
> Addresses" (currently called "URLs" and specified here:  
> http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#parsing-urls  
> ). If that returns an error: must ignore the entire PI.

Likewise per my previous paragraph.  We don't have to say anything
about how to handle the value of the href attribute.

> 
> 'media': refer to the Media Queries spec. If it's an invalid 
> media query, must ignore the entire PI.

Again, I don't think we should say anything about how to handle 
the value of the media attribute.

> 
> 
> > * When is the processing of the PI invoked?
> >   - What happens if you change the PI's 'data'?
> >   - What happens if you change the PI's 'target'?
> >   - What happens if you remove the PI from the DOM?
> >   - What happens if you add the PI to the DOM (with scripting)?
> >   - What happens if you insert the PI somewhere other than in the  
> > prologue?
> >   - What happens if the PI is a child of Document but after 
> the root  
> > element and you then move the root element so that the PI 
> becomes part  
> > of the prologue?
> 
> In browsers, for CSS things are updated on the fly, but for 
> XSLT changes  
> are ignored. We could leave this up to the CSS and XSLT (and 
> other) specs.

I agree that we should leave this up to other specs.

> 
> 
> >  * Is it conforming for a document to have an 
> xml-stylesheet PI anywhere  
> > other than in the prologue? Is it used or ignored?
> 
> Misplaced xml-stylesheet PI: must ignore the entire PI. 
> Documents must not use misplaced xml-stylesheet PIs.

I agree, but there is more to this.

The Assoc SS spec says:

 The xml-stylesheet processing instruction is allowed only in
 the prolog of an XML document. 

So an xml-stylesheet processor should ignore any PI not in the prolog.

I would add that an xml-stylesheet processor should ignore any PI
that is not physically in the document entity.

I would also add that, in the case of multiple xml-stylesheet PIs 
for the same media, the xml-stylesheet processor should ignore all
but the last in document order.
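
The two rules above that can be checked from a parsed document (ignore
xml-stylesheet PIs outside the prolog, and keep only the last PI for a given
media) might look roughly like this in Python. The function name, the
simplified pseudo-attribute pattern, and the choice to compare 'media'
values as exact strings are all assumptions of this sketch. It does not
attempt the "physically in the document entity" check, which minidom does
not expose.

```python
import re
from xml.dom import minidom

# Hypothetical, simplified pattern: name="value" or name='value'.
PSEUDO_ATTR = re.compile(r"""([A-Za-z_][\w.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def prolog_stylesheets(xml_text):
    """Return {media: href} for xml-stylesheet PIs in the prolog, last one wins."""
    doc = minidom.parseString(xml_text)
    chosen = {}
    for node in doc.childNodes:
        if node.nodeType == node.ELEMENT_NODE:
            break  # the prolog ends at the root element; later PIs are ignored
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            attrs = {}
            for m in PSEUDO_ATTR.finditer(node.data):
                attrs[m.group(1)] = m.group(2) if m.group(2) is not None else m.group(3)
            # A missing 'media' is keyed as the empty string in this sketch.
            chosen[attrs.get("media", "")] = attrs.get("href")
    return chosen


sample = (
    '<?xml-stylesheet href="a.css" media="screen"?>'
    '<?xml-stylesheet href="b.css" media="print"?>'
    '<?xml-stylesheet href="c.css" media="screen"?>'
    '<doc/>'
    '<?xml-stylesheet href="late.css" media="screen"?>'
)
result = prolog_stylesheets(sample)
```

Here the PI after the root element is skipped, and c.css replaces a.css for
the "screen" media because it comes later in document order.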

> 
> > * If charset is specified and the PI points to an XSLT 
> transformation,  
> > should the charset='' information be used?
> 
> No, RFC 3023 and XML 1.0 say which encoding to use.

And, in fact, I don't think the AssocSS spec needs to
say anything about this.

> 
> 
> > * CSSOM integration:  
> > http://dev.w3.org/cvsweb/~checkout~/csswg/cssom/Overview.html?content-type=text/html;%20charset=utf-8#the-linkstyle  
> > defines the LinkStyle interface that HTML <link> and 
> <?xml-stylesheet?>  
> > implement -- we should coordinate with Anne here.
> >
> > * CSS issues: it's unclear whether referencing an element 
> should work if  
> > type="text/css" -- the type of the document would be an XML 
> type which  
> > is not a CSS type, and browsers largely don't support this anyway.

Simon didn't comment further on this topic in his latest email,
so I assume we are agreed that the AssocSS spec doesn't need
to say anything here.  For more on this, see the last part of
http://lists.w3.org/Archives/Public/public-xml-core-wg/2009Apr/0029

paul
Received on Wednesday, 10 June 2009 14:34:20 UTC