RE: Closing XML Protocol Last Call issue 395

Gudge writes:

> I'm not sure why this issue revolves around the
> internal subset. We explicitly prohibit the Document
> Type Declaration Information Item from appearing.

So far, so good.  We agree.

>> If there is no DTD then there is no internal 
>> or external subset. 

Let's be a little careful.  Our infosets are synthetic.  They come before 
the lexical form is even considered.  Clearly we disallow the info item. 
What this means for any possible serialization in any possible binding is 
unclear.

> Lexically one cannot have <!DOCTYPE ... in a SOAP message. 

Now we're talking about something binding specific.  Assume we're talking 
about >the< SOAP HTTP binding. 

>> The only parts of the DTD that are reflected 
>> in the infoset are unparsed entities, notations 
>> and PIs appearing in the DTD.

Right, so if I had a lexical form with an internal subset declaring a 
parsed entity, then that would not show up in the Infoset when I parsed 
the document.  I couldn't tell that there had been an internal or external 
subset.
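
To make the point above concrete, here is a minimal sketch using Python's
standard xml.etree.ElementTree. Note the assumptions: ElementTree's tree is
only a rough stand-in for an Infoset (it happens to discard the document
type declaration entirely), and the Envelope element and greeting entity
are made-up samples, not anything taken from the SOAP spec.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample messages: one declares a parsed entity in an
# internal subset, the other carries the same content with no DOCTYPE.
WITH_SUBSET = (
    '<!DOCTYPE Envelope [<!ENTITY greeting "hello">]>'
    '<Envelope>&greeting;</Envelope>'
)
WITHOUT_SUBSET = '<Envelope>hello</Envelope>'


def canonical(text):
    """Parse the text and re-serialize the resulting tree."""
    return ET.tostring(ET.fromstring(text))


# The entity reference is expanded during parsing and the DOCTYPE is not
# kept in the tree, so the two documents look the same afterwards.
same_tree = canonical(WITH_SUBSET) == canonical(WITHOUT_SUBSET)
print(same_tree)
```

In this tree model, at least, the receiver cannot tell which of the two
lexical forms was sent.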

Now, go the other way.  We say in the HTTP binding that we want 
(indirectly through RFC 3023) the XML 1.x serialization of the infoset. 
But if what I say in the para above is right (and I'm not sure about it), 
that's ambiguous.  There are lexical forms with an internal subset that 
correspond to the Infoset that has no DTD information item.  That is the 
source of my concern.  If there is even a hint of this ambiguity, I think 
our binding (or the RFC if appropriate) needs to say explicitly: 
"<!DOCTYPE ... > MUST NOT appear."

I feel like I may be confused, but in the meantime, I remain concerned 
that there is an ambiguity.  If someone sent an instance with an internal 
subset that parsed into an Infoset with no Doctype Info Item, I'm not 
sure where I'd point in the spec to say "you broke the rules."  What am I 
missing?  Thanks.
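
If the binding did say explicitly that a DOCTYPE must not appear, a
receiver could check for it at the lexical level before building anything.
The following is only a sketch under that assumption, using the
StartDoctypeDeclHandler callback of Python's standard xml.parsers.expat;
the has_doctype and DoctypeFound names are hypothetical, and nothing here
is taken from the SOAP spec or from any existing implementation.

```python
import xml.parsers.expat


class DoctypeFound(Exception):
    """Raised from the handler to stop parsing at the first DOCTYPE."""


def has_doctype(message):
    """Return True if the serialized message contains a <!DOCTYPE ...>."""
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called by expat when it sees the start of a DOCTYPE declaration,
        # whether or not an internal subset follows.
        raise DoctypeFound(name)

    parser.StartDoctypeDeclHandler = on_doctype
    try:
        parser.Parse(message, True)
    except DoctypeFound:
        return True
    return False


# Hypothetical sample messages.
print(has_doctype(b'<!DOCTYPE Envelope [<!ENTITY e "x">]><Envelope/>'))
print(has_doctype(b'<Envelope/>'))
```

A message that is not well-formed would raise expat's own error here; the
sketch does not try to handle that case.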

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------

Received on Thursday, 5 December 2002 14:45:34 UTC