& in PIs (was Re: xml-stylesheet issues--suggested resolutions) from Henry S. Thompson on 2009-06-16 (public-xml-core-wg@w3.org from June 2009)

From: Henry S. Thompson <ht@inf.ed.ac.uk>
Date: Tue, 16 Jun 2009 15:48:52 +0100
To: "Simon Pieters" <simonp@opera.com>
Cc: "Grosso, Paul" <pgrosso@ptc.com>, public-xml-core-wg@w3.org
Message-ID: <f5bfxe0p73f.fsf_-_@hildegard.inf.ed.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Simon Pieters writes:

> . . .
> There's one bug that cites this test case:
>
>    http://home.arcor.de/martin.honnen/operaBugs/op9/XML/ampersandInPI2.xml
>
> The bug says that Opera is wrong in aborting parsing. I think Firefox,
> Safari and IE ignore the PI here.
>
> We have a much bigger problem with draconian error handling in XML
> proper in general than with xml-stylesheet. So from our perspective,
> defining error recovery for XML 1.0 and Namespaces in XML 1.0 is a
> higher priority.
>
>
>> In this duplicate pseudo-attribute case, I could live with
>> tightening my above suggestion to "...processor SHOULD ignore..."
>> because at least that way Arbortext could say "yes, we should,
>> but due to legacy issues, we decided instead to recover" and
>> still not be non-compliant with the spec.
>
> Is the legacy situation for Arbortext so bad that people rely on its
> error recovery behavior?
>
>
> On Wed, 10 Jun 2009 16:28:57 +0200, Grosso, Paul <pgrosso@ptc.com> wrote:
>
>> > * What happens when a CharRef hits the [WFC: Legal Character] constraint
>> > in XML 1.0? (Unclear to me whether this is allowed in the syntax.)
>> >
>> > Syntax error: must ignore the entire PI. We should tighten up the
>> > syntax so that duplicate pseudo-attributes and NCRs that are syntax
>> > errors in XML 1.0 are also syntax errors in xml-stylesheet.
>>
>> As I said before:
>>
>> As far as I can tell, the XML Rec says:
>>
>> [16] PI ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'
>>
>> and Char is "any unicode character..." so I don't see how there
>> could be a CharRef in a PI.
>
> Not on the XML 1.0 layer, but on the xml-stylesheet layer.
>
> [3] PseudoAttValue ::= ('"' ([^"<&] | CharRef | PredefEntityRef)* '"'
>                      | "'" ([^'<&] | CharRef | PredefEntityRef)* "'")
>                      - (Char* '?>' Char*)
>
> http://www.w3.org/TR/xml-stylesheet/#NT-PseudoAttValue

Oh B*G.  _Why_ did they/we do that?  That's just . . . perverse.

So, either we junk that and go with what Paul (and I) _thought_ was
the case, and _nobody_ is conformant, or we go with Simon's suggestion
- -- it's a PIE* if you hit a PAV** that doesn't match production [3]
above.

Unfortunately, I guess on balance we're stuck with this nonsense, so
we go with Simon.

ht

* Pseudo-attribute ignore error
** Pseudo-attribute value
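
For concreteness, here is a rough sketch of what "a PAV that doesn't match
production [3]" could mean in practice. It is only an illustration, not
proposed spec text. The function names are made up, the CharRef and
PredefEntityRef patterns are my reading of the XML 1.0 and xml-stylesheet
productions, and folding the [WFC: Legal Character] check into the same
test is an assumption about how the two layers would be combined.

```python
import re

# CharRef (XML 1.0) or PredefEntityRef (xml-stylesheet), as assumed here.
_REF = r"&#[0-9]+;|&#x[0-9a-fA-F]+;|&(?:amp|lt|gt|quot|apos);"

# Production [3]: a double- or single-quoted value in which '<' is
# forbidden and '&' may only start one of the references above.
_DQ = r'"(?:[^"<&]|%s)*"' % _REF
_SQ = r"'(?:[^'<&]|%s)*'" % _REF
_PAV = re.compile(_DQ + "|" + _SQ)

_CHARREF = re.compile(r"&#(?:x([0-9a-fA-F]+)|([0-9]+));")


def is_xml_char(cp):
    """True if the code point matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def pav_is_valid(quoted):
    """Check a quoted pseudo-attribute value (hypothetical helper).

    False means the whole PI would be ignored under Simon's suggestion.
    """
    if "?>" in quoted:  # the "- (Char* '?>' Char*)" exclusion
        return False
    if _PAV.fullmatch(quoted) is None:
        return False
    # Assumed extra step: every CharRef must name a legal XML character.
    for m in _CHARREF.finditer(quoted):
        cp = int(m.group(1), 16) if m.group(1) else int(m.group(2))
        if not is_xml_char(cp):
            return False
    return True
```

Under this reading a bare ampersand, as in the test case Simon cites, fails
the match, so the PI is ignored rather than treated as a fatal error.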
- -- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
                         Half-time member of W3C Team
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 651-1426, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)

iD8DBQFKN7DUkjnJixAXWBoRAvWOAJ9jSmwrzntUuAj+0+szdKDFZJyuqwCfQzBY
N32upuC6AnLpBN20Ub5HEbY=
=ppTk
-----END PGP SIGNATURE-----

Received on Tuesday, 16 June 2009 14:49:58 UTC