Re: Options for dealing with IDs

On Thursday, January 9, 2003, 9:16:49 PM, noah wrote:

nuic> I'm in complete agreement with Tim. SOAP is an example of a
nuic> system that uses xsd:ID [1] typed attributes and which goes to
nuic> some lengths to NOT *require* validation of any sort, though
nuic> partial or full validation of the message is permitted if useful
nuic> to the consuming application. [2,3]. SOAP's ID attributes are in
nuic> its own namespace.

Okay, so that is a strike against xml:id, because SOAP could not use it
unless SOAP was changed. It's also a strike against mandatory DTD or
Schema validation. And the whole reason that SOAP does this is that
it needs to know about, and make use of, the concept of IDness in a
well-formed instance.

I would be interested in hearing from the SOAP folks whether they
would prefer to switch their ids (and the ids of any other namespaces
in the payload) to being xml:id, or whether they would prefer to have

xmlns:s="http://www.w3.org/2002/12/soap-encoding" xml:idAttr="s:id"

as the price of wider interoperability. Or whether they would rather
do nothing, which seems risky given that some people (I read Tim Bray's
arguments this way, at least some of them) are saying that well-formed
instances *do not have* IDs.

nuic> IMO, DTD or schema retrieval, as well as validation, is often
nuic> impractical for performance and security reasons, among others.
nuic> Interestingly, among the many performance risks we analysed in
nuic> the schema WG were reports from implementors that failures to
nuic> retrieve (timeouts) had bigger performance impact than
nuic> successes. Furthermore, if validation is required, then the
nuic> document becomes useless in the case where an external DTD or
nuic> schema is for some reason unavailable. I think this is
nuic> unacceptable.

I agree it is unacceptable, certainly in a performance-critical
industrial-strength environment.

nuic> I think I agree with Tim's other conclusion: do nothing is
nuic> probably the least risky solution. We've got too many typing
nuic> mechanisms already.

Including one more, non-machine-readable one in the SOAP spec. Once
you start copying and pasting that wording into other specs (if it's in a
W3C REC then it must be the right thing to do), how is an XML
processor supposed to know that

<foo xmlns:x="http://www.example.org/foogle" x:blort="i3"/>

is an ID just because the documentation for that namespace (handily
pointed to by indirection through some document at the namespace URI)
says (in human-readable prose) that the blort attribute has the
Infoset properties of

- a local name of blort
- a namespace name of http://www.example.org/foogle
- a type of ID in the http://www.w3.org/2001/XMLSchema namespace

whereas a machine-readable single attribute value could convey that
information to a parser straight away and allow a suitable infoset to
be built very quickly.
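
To make that concrete, here is a rough Python sketch of how a
non-validating processor might build an ID index from such a
declaration, with no DTD or schema retrieval at all. It assumes the
xml:idAttr attribute floated earlier in this message (a proposal only,
not part of any spec), treats its value as a single QName, and for
simplicity applies the declaration to everything that follows in
document order rather than modelling scope properly. The name
build_id_index is made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
# Hypothetical: xml:idAttr is only a proposal in this thread, not a standard.
ID_ATTR_DECL = "{%s}idAttr" % XML_NS


def build_id_index(xml_text):
    """Map ID value -> element using well-formedness parsing only."""
    prefixes = {}          # prefix -> namespace URI (simplified: no scoping)
    id_attr_names = set()  # ID attribute names in ElementTree's {uri}local form
    index = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            # Namespace declarations arrive before the element's start event.
            prefix, uri = payload
            prefixes[prefix] = uri
            continue
        elem = payload
        declared = elem.get(ID_ATTR_DECL)
        if declared is not None:
            # Resolve the QName in the attribute value, e.g. "s:id".
            prefix, _, local = declared.rpartition(":")
            if not prefix:
                id_attr_names.add(local)
            elif prefix in prefixes:
                id_attr_names.add("{%s}%s" % (prefixes[prefix], local))
        # Record any attribute on this element that was declared to be an ID.
        for name in id_attr_names:
            value = elem.get(name)
            if value is not None:
                index[value] = elem
    return index


doc = (
    '<env xmlns:s="http://www.w3.org/2002/12/soap-encoding" xml:idAttr="s:id">'
    '<item s:id="i3"/><item id="plain"/>'
    "</env>"
)
print(sorted(build_id_index(doc)))  # prints ['i3']
```

The unqualified id="plain" attribute is not indexed, because nothing in
the instance declares it to be an ID. That is the whole point: the
parser learns IDness from the instance itself in a single pass.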

I'm surprised. Given the chance to establish a high-performance
mechanism for interoperable IDness in well-formed instances, and the
risk of low-performance processing such as "have a human read the
spec for this namespace", I would expect people who want to make money
from high-performance XML processing to be running towards these
potential solutions. Instead they are not running, not even walking,
but sitting down and saying it's too complex.

-- 
 Chris                            mailto:chris@w3.org

Received on Friday, 10 January 2003 11:14:52 UTC