RE: Options for dealing with IDs from noah_mendelsohn@us.ibm.com on 2003-01-09 (www-tag@w3.org from January 2003)

From: <noah_mendelsohn@us.ibm.com>
Date: Wed, 8 Jan 2003 20:40:27 -0500
To: "Bullard, Claude L (Len)" <clbullar@ingr.com>
Cc: "'Elliotte Rusty Harold'" <elharo@metalab.unc.edu>, www-tag@w3.org
Message-ID: <OF227F0C0F.45078CE4-ON85256CA9.00070AFF@lotus.com>

I'm somewhat nervous about option 6, and perhaps some of the others, as 
they relate to XML Schema. 

I think we have roughly the following situation.  XML Schema takes an 
Infoset as its input.  That Infoset may be produced by a non-validating 
XML processor, a DTD-validating XML processor, or might be a synthetic 
Infoset.  XML Schema provides a built-in datatype xsd:ID [1].  When it is 
used in a schema as the type of an attribute, XML Schema Structures 
provides constraint checking analogous to what would happen with DTDs [2]. 
As far as I know, this schema-level checking is independent of any that 
might have been done per a DTD.  As a result of such a validation episode, 
the processor can report in the PSVI that a given attribute has been 
determined to be of type xsd:ID or xsd:IDREF.  Furthermore, Schema 
introduces the so-called identity constraint mechanism [3] (key/keyref) 
which is more general than ID/IDREF;  I think it's fair to say that 
xsd:ID/xsd:IDREF is provided primarily for backwards compatibility, in the 
sense of allowing reasonably straightforward conversion of DTDs to 
schemas.
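
To make the comparison concrete, here is a rough sketch of the two 
mechanisms side by side.  The element and attribute names (catalog, part, 
order, code, partRef, partCode) are invented for illustration and come 
from no real vocabulary.  The id/partRef pair uses xsd:ID/xsd:IDREF, while 
the key/keyref pair expresses a similar constraint over an ordinary string 
attribute, scoped to the catalog element:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="catalog">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="part" maxOccurs="unbounded">
          <xsd:complexType>
            <!-- DTD-style identifier: unique across the whole document -->
            <xsd:attribute name="id" type="xsd:ID"/>
            <!-- Plain string, made unique by the key below -->
            <xsd:attribute name="code" type="xsd:string"/>
          </xsd:complexType>
        </xsd:element>
        <xsd:element name="order" minOccurs="0" maxOccurs="unbounded">
          <xsd:complexType>
            <!-- DTD-style reference to some ID in the document -->
            <xsd:attribute name="partRef" type="xsd:IDREF"/>
            <!-- Plain string, tied to partKey by the keyref below -->
            <xsd:attribute name="partCode" type="xsd:string"/>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
    <!-- Identity constraints: scoped to each catalog element -->
    <xsd:key name="partKey">
      <xsd:selector xpath="part"/>
      <xsd:field xpath="@code"/>
    </xsd:key>
    <xsd:keyref name="orderPartRef" refer="partKey">
      <xsd:selector xpath="order"/>
      <xsd:field xpath="@partCode"/>
    </xsd:keyref>
  </xsd:element>
</xsd:schema>
```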

So, schema largely reproduces XML 1.0 ID/IDREF, but it does so using an 
Infoset (which may have already been the result of DTD-validation) as 
input.  Thus the notions of ID in XML 1.0 and ID in Schema are 
intentionally similar, but in some sense duplicate each other. 

Option 6 introduces: 

        xml:idAttr="name"

(where name is a sample value.)  Question: how should this interact with 
the mechanisms of XML Schema?  What if a name attribute is declared as 
being of type xsd:integer?  Keep in mind that schema takes an Infoset as 
input.  Does the assigned ID type now show up in that input infoset?  What 
are the right rules for schema processing?  Should the type reported in 
the PSVI be xsd:ID (or is there a separate xml:IDtype?) for the name 
attribute?
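
A small sketch of the kind of conflict I have in mind.  The element name 
record and the value 42 are made up, and the xml:idAttr syntax is only 
what option 6 proposes, not anything that exists today.  First a 
hypothetical instance:

```xml
<record xml:idAttr="name" name="42"/>
```

and a schema that, quite legitimately, types the same attribute as an 
integer.  The anyAttribute wildcard is just my assumption about how a 
schema author might let the xml:idAttr attribute through:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="record">
    <xsd:complexType>
      <!-- The schema author's declared type for name -->
      <xsd:attribute name="name" type="xsd:integer"/>
      <!-- Assumption: admit attributes from the xml: namespace -->
      <xsd:anyAttribute
          namespace="http://www.w3.org/XML/1998/namespace"
          processContents="lax"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Note that 42 is a perfectly good xsd:integer but could never validate as 
an xsd:ID, since xsd:ID is derived from NCName and a name cannot begin 
with a digit.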

This all strikes me as a mess.  Whatever the other pros and cons of option 
6, I think these anomalies result in part from the fact that XML does not 
really offer the pluggability that would allow schema to participate as a 
first class replacement for DTDs.  Accordingly, schema does the best it 
can running at a separate layer, but when we try to re-introduce typing at 
the XML level as well we run the risk of complexity creeping in.  I could 
be wrong, but if we go with option 6 I suspect we would probably want to 
rev XML Schema to take account of it (and there are all sorts of 
deployment issues in revving XML Schema, I would think).

[1] http://www.w3.org/TR/xmlschema-2/#ID
[2] http://www.w3.org/TR/xmlschema-1/#cvc-id
[3] http://www.w3.org/TR/xmlschema-1/#cIdentity-constraint_Definitions

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------

Received on Thursday, 9 January 2003 12:31:21 UTC