[Bug 2441] xqx: character references

http://www.w3.org/Bugs/Public/show_bug.cgi?id=2441





------- Comment #18 from maxim.orgiyan@oracle.com  2006-09-29 10:02 -------
(In reply to comment #17)

Michael,

Section 3.7.1.3 states:

"Predefined entity references and character references are expanded into their
referenced strings, as described in 3.1.1 Literals."

And section 3.1.1 states:

"Each predefined entity reference is replaced by the character it represents
when the string literal is processed."

It doesn't say anything about how character refs are processed (as
far as I can see), but it does give some examples of string values containing
character refs.

Given these descriptions, one possible algorithm, for example, is to process a
string by first applying all the entity reference replacements, and then all
the character reference replacements on the resulting string. That is what at
least one processor I tried appears to do.
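
To make the difference between the two readings concrete, here is a rough
Python sketch. It is not taken from the spec or from any processor; the
function names and regular expressions are made up for illustration, and
error handling for unknown or malformed references is left out.

```python
import re

# The five predefined entity references in XQuery.
PREDEFINED = {"lt": "<", "gt": ">", "amp": "&", "quot": '"', "apos": "'"}

# Hypothetical patterns for this sketch only.
ENTITY_REF = re.compile(r"&([A-Za-z]+);")
CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")
ANY_REF = re.compile(r"&(?:#x([0-9A-Fa-f]+)|#([0-9]+)|([A-Za-z]+));")


def _char_from(hex_digits, dec_digits):
    """Turn the digits of a character reference into the character."""
    if hex_digits is not None:
        return chr(int(hex_digits, 16))
    return chr(int(dec_digits))


def expand_single_pass(text):
    """Expand every reference once, left to right; output is never rescanned."""
    def repl(m):
        hex_digits, dec_digits, name = m.groups()
        if name is not None:
            # Unknown names are left untouched here; a real processor
            # would raise an error instead.
            return PREDEFINED.get(name, m.group(0))
        return _char_from(hex_digits, dec_digits)
    return ANY_REF.sub(repl, text)


def expand_two_pass(text):
    """Expand entity refs first, then character refs on the RESULT."""
    after_entities = ENTITY_REF.sub(
        lambda m: PREDEFINED.get(m.group(1), m.group(0)), text)
    return CHAR_REF.sub(
        lambda m: _char_from(m.group(1), m.group(2)), after_entities)


if __name__ == "__main__":
    sample = "&amp;#60;"
    print(expand_single_pass(sample))  # the escaped ampersand stays literal
    print(expand_two_pass(sample))     # the second pass re-reads pass one's output
```

With the input &amp;#60; the single-pass version yields the five characters
&#60; while the two-pass version yields a less-than sign, which is the
double expansion under discussion.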

But yes, I agree with the common-sense interpretation David gives.

> I'm puzzled that you don't find the XQuery spec clear on the subject of how
> predefined entity references are handled. It seems eminently clear to me. 
> 
> There are three places they can occur: in string literals, in attribute
> content, and in element content.
> 
> For string literals, section 3.1.1 spells out the rules and seems entirely
> clear.
> 
> For attribute content, rule 1 says "Attribute value normalization is then
> applied to normalize whitespace and expand character references and predefined
> entity references. " This spells out the rules by reference to the XML
> specification (which describes the interaction of entity expansion and
> whitespace normalization): the rules are complicated, but I think they are
> unambiguous.
> 
> For element content, section 3.7.1.3 rule 1b gives the rules by reference to
> the rules in 3.1.1 for string literals.
> 
> So what exactly is it that you think isn't stated clearly in the XQuery
> specification?
> 
> (You alleged that one implementation did double-expansion of entity references,
> turning &amp;lt; into a less-than-sign. I think it's quite clear in the XQuery
> spec that processors mustn't do that. If you're in element content, for
> example, no possible reading of section 3.7.1.3 would allow that
> interpretation. In any case, as David Carlisle points out, common sense should
> give you the same answer: if an ampersand written as &amp; were treated in the
> same way as one written as &, why would the specification bother to provide a
> way of escaping the character in the first place?)
> 
> Michael Kay
> 

Received on Friday, 29 September 2006 10:02:36 UTC