Re: xml11Names-46 from noah_mendelsohn@us.ibm.com on 2004-06-14 (www-tag@w3.org from June 2004)

From: <noah_mendelsohn@us.ibm.com>
Date: Mon, 14 Jun 2004 17:10:38 -0400
To: Dan Connolly <connolly@w3.org>
Cc: Elliotte Rusty Harold <elharo@metalab.unc.edu>, Jacek Kopecky <jacek.kopecky@deri.at>, www-tag@w3.org, "David Ezell" <David_E3@VERIFONE.com>
Message-ID: <OF85111AB0.00719D2E-ON85256EB3.0072D419@lotus.com>

First of all, I share Elliotte's concerns about rewriting history.  Readers 
of the current recommendation have good reason to expect that our explicit 
references to XML 1.0 Version 2 and the 1999 version of Namespaces mean 
what they say. 

One challenge I have not seen discussed in this thread relates to Schema 
datatypes.  I claim that providing an architecture that validates 
documents in isolation as XML 1.0 or XML 1.1 is only part of the problem. 
Datatypes such as xsd:string and xsd:QName have implications far beyond 
their use in XML validation.   Most of the proposals I've seen to support 
XML 1.1 essentially muddy the definitions of those types:  sometimes they 
define XML 1.0 constructs and sometimes XML 1.1.  That strikes me as at 
best unfortunate and at worst intractable.

Those of us who build systems that create, consume and/or manage XML often 
use these types to drive our data mappings.  Strings, for example, are 
often mapped to C-language null-terminated strings.  I know that's a safe 
thing to do because the xsd:string type is limited to the XML 1.0 chars 
(hence no nulls).  While XML 1.1 does not introduce nulls, the proposal to 
forward reference as yet unwritten versions of XML provides no such 
guarantee, and in that sense renders much of the Schema type system 
toothless.   Similarly, one of the challenges in mapping XML to databases 
and programming systems is to decide how element names are to be tracked. 
Though I cannot point to any deployed systems that have problems with XML 
1.1 names in particular,  it would be reasonable for such a system to 
assume that anything validated by xsd:Name or xsd:QName obeyed the 
existing constraints in the recommendations.  Finally, systems such as XML 
Query and its Functions and Operators operate on values typed by the 
Schema type system (e.g. functions that return an element name).  Such 
values can wind up in databases for extended periods, and may be combined 
with values from other documents during a query.   I think that any 
proposal to handle XML 1.0 and XML 1.1 needs to deal with realistic use 
cases involving combinations of XML 1.0 and XML 1.1 vocabularies, both 
within a single document and when joining multiple documents in a 
database.  Having two definitions for the same type (e.g. xsd:Name) 
strikes me as dangerous and unduly tricky. 
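
To make the string-mapping concern concrete, here is a rough sketch (in 
Python, purely for illustration) of the kind of check a data-mapping layer 
might rely on.  The ranges are the Char productions from the XML 1.0 and 
XML 1.1 Recommendations; the function names are hypothetical and not taken 
from any Schema processor or library.

```python
# Illustrative sketch only: function names are hypothetical.
# Ranges follow the Char productions of XML 1.0 and XML 1.1.

def is_xml10_char(cp: int) -> bool:
    """True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_xml11_char(cp: int) -> bool:
    """True if code point cp matches the XML 1.1 Char production."""
    return (
        0x1 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def fits_xml10_string(value: str) -> bool:
    """True if every character of value is an XML 1.0 Char."""
    return all(is_xml10_char(ord(ch)) for ch in value)


def fits_xml11_string(value: str) -> bool:
    """True if every character of value is an XML 1.1 Char."""
    return all(is_xml11_char(ord(ch)) for ch in value)


if __name__ == "__main__":
    sample = "a\x01b"  # contains U+0001, a control character
    print(fits_xml10_string(sample))  # rejected under the XML 1.0 ranges
    print(fits_xml11_string(sample))  # accepted under the XML 1.1 ranges
```

Neither production admits U+0000, so the null-terminated mapping survives 
under both.  The point is that the two checks already disagree on a value 
such as "a" followed by U+0001 followed by "b", and a reference to an 
unwritten future version of XML would leave the check undefined.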

So, I suggest that serious consideration be given not just to individual 
validation episodes, but also to the integrity of the type system.

Dan Connolly writes:

>> I'm quite sympathetic to this point... I'd 
>> like to know where it should be discussed, 
>> because www-tag is not the place.

Discussion, at least among members of the Schema WG, has started on the 
schema-IG list.

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Received on Monday, 14 June 2004 17:14:19 UTC