content of PIs and comments


In the data model spec, two points surprise me: the third item of the last
list in 4.6.1 and the second item of the last list in 4.7.1.

They say that a PI's data cannot contain "?>" and that a comment's content 
cannot contain "--". That's true at the lexical (XML 1.x) level, but I believe 
it is wrong at the Infoset level. The following document appears to me to be 
well-formed, and indeed several parsers seem happy about it:

     <?pi char?&gt;char?>
     <!-- comment &#45;- foo -->

I believe this generates a PI whose content is "char?>char" and a comment
whose content is " comment -- foo ", even though naturally they must be
escaped when serialised.
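
For anyone who wants to check what a given parser actually reports for such a
document, here is a minimal sketch in Python using the standard library's
xml.dom.minidom. The wrapping <root> element, the SAMPLE constant and the
pi_and_comment helper are my own additions for illustration, and I use &#45;
(the decimal character reference for the hyphen) in the comment. One caveat:
as far as I can tell, expat-based parsers such as the one behind minidom hand
back the text of PIs and comments verbatim, without expanding references, so
the printed values are worth looking at before drawing conclusions.

```python
from xml.dom import minidom

# Hypothetical test document: the two constructs from this message,
# wrapped in a root element so that the whole thing is a complete document.
SAMPLE = "<root><?pi char?&gt;char?><!-- comment &#45;- foo --></root>"


def pi_and_comment(xml_text):
    """Return (PI data, comment data) as reported by the parser."""
    doc = minidom.parseString(xml_text)
    pi_data = None
    comment_data = None
    for node in doc.documentElement.childNodes:
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE:
            pi_data = node.data
        elif node.nodeType == node.COMMENT_NODE:
            comment_data = node.data
    return pi_data, comment_data


if __name__ == "__main__":
    pi_data, comment_data = pi_and_comment(SAMPLE)
    # repr() makes leading and trailing whitespace visible.
    print("PI data:     ", repr(pi_data))
    print("comment data:", repr(comment_data))
```

Running the same input through other parsers would show whether they agree on
the values they expose.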

Is there a reason for these limitations?

PS: there's also a small cut-and-paste typo in 4.6.1 where it says "Namespace 
nodes" when it means "Processing instruction nodes".

Robin Berjon
Research Scientist, Expway
7FC0 6F5F D864 EFB8 08CE  8E74 58E6 D5DB 4889 2488

Received on Monday, 15 September 2003 13:07:40 UTC