Re: On subsetting XML...

At 10:46 PM -0500 1/16/03, noah_mendelsohn@us.ibm.com wrote:

>To pick one concrete example of possible complication:
>what does this do to digital signatures?  You propose
>that PIs be allowed but "ignored" by SOAP receivers.
>The current W3C canonicalizations retain PIs for
>signing, but the signature rec acknowledges that other
>canonicalizations may not.[1] So now we have to start
>telling stories about which signatures hold and which
>are broken when a SOAP intermediary node chooses to drop
>the insignificant PI.  Implementations come under pressure
>to store and retain the meaningless PIs after all, just
>so that signatures won't break, or we have to go to the
>trouble of promoting yet another canonicalization
>(which we might do anyway, but this shouldn't be the
>trigger.)

That's not hard to fix: don't drop them, just ignore them. In most 
cases it's easier to preserve them than to get rid of them.
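
A minimal sketch of "ignore, don't drop", in Python purely for 
illustration. The envelope, the "audit" PI, and the read_price helper 
are all hypothetical, not anything taken from SOAP. The receiver only 
ever looks at the element it understands, yet the PI is still in the 
tree when the message is serialized again for the next hop:

```python
from xml.dom import minidom

# Hypothetical message: one PI sits between the envelope and the body.
MESSAGE = '<Envelope><?audit trace="on"?><Body><price>42</price></Body></Envelope>'


def read_price(doc):
    """Read the one element this receiver understands; PIs are never consulted."""
    node = doc.getElementsByTagName("price")[0]
    return int(node.firstChild.data)


doc = minidom.parseString(MESSAGE)
price = read_price(doc)

# Re-serialize the same tree, as an intermediary forwarding the message might.
# Nothing above removed the PI, so it is still part of the output.
forwarded = doc.documentElement.toxml()
print(price)
print(forwarded)
```

Preserving the PI took no code at all here; dropping it would have 
taken an extra tree walk.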


>>  That is, how can SOAP require an XML subset that
>>  forbids (i.e., does not include) PIs?

...
>  Must we say that all
>possible attributes MUST be allowed, but that they are
>to be ignored if senseless?  Obviously not.  That's
>what we're being asked to say about PIs in SOAP.

There's a clear contrast between the cases here, though admittedly 
PIs are right on the line. Restricting elements and attributes to 
those with particular names and namespaces in particular locations in 
the document is a semantic restriction. This is a very different beast 
from the syntactic restriction you're proposing. The SOAP elimination 
of DOCTYPE is completely syntactic, and thus wrong. A process should 
not care whether an attribute was defaulted from the DTD or included 
literally in the document. (For maximum interoperability, a 
document should not rely on default attribute values, but for the same 
reason a parser should always apply them.)
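
To make the defaulting point concrete, here is a small sketch using 
Python's expat binding. The "price" element, the "currency" attribute, 
and the read_attributes helper are invented for the example. It relies 
on the parser's specified_attributes flag being left at its default 
(false), so attributes defaulted from the internal DTD subset are 
reported alongside literal ones:

```python
import xml.parsers.expat

# Same information two ways: defaulted from the internal DTD subset,
# or written literally on the start tag.
DEFAULTED = '<!DOCTYPE price [<!ATTLIST price currency CDATA "USD">]><price>42</price>'
LITERAL = '<price currency="USD">42</price>'


def read_attributes(xml_text):
    """Return the attributes reported for the first start tag as a dict."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    # specified_attributes is left at its default, so defaulted
    # attributes are passed to the handler as well.
    parser.StartElementHandler = lambda name, attrs: seen.append(dict(attrs))
    parser.Parse(xml_text, True)
    return seen[0]


print(read_attributes(DEFAULTED))
print(read_attributes(LITERAL))
```

Code downstream of read_attributes cannot tell which form the sender 
used, which is the property a process should be able to count on.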

PIs are a syntactic mechanism intended to enable an escape beyond the 
usual semantic restrictions of which elements and attributes appear 
where. This is an important part of XML's extensibility. By 
eliminating them, you're restricting the syntactic expressiveness of 
XML.

>So I claim that almost all XML applications use
>a subset of XML.  Editors, databases, parsers and
>such are the exceptions.  SOAP is an application
>that uses the appropriate features of XML, and
>prohibits the use of other constructions.

Again, you're confusing the level of syntax with the level of 
semantics. Many applications restrict the semantics they'll 
understand. No good ones restrict the syntax they'll understand.
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|           Processing XML with Java (Addison-Wesley, 2002)          |
|              http://www.cafeconleche.org/books/xmljava             |
|  http://www.amazon.com/exec/obidos/ISBN%3D0201771861/cafeaulaitA   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
+----------------------------------+---------------------------------+

Received on Friday, 17 January 2003 09:26:52 UTC