Re: Tight or Loose Data Constraints? from Michael Kay on 2011-01-23 (xmlschema-dev@w3.org from January 2011)

From: Michael Kay <mike@saxonica.com>
Date: Sun, 23 Jan 2011 00:35:16 +0000
To: "Costello, Roger L." <costello@mitre.org>
CC: "xmlschema-dev@w3.org" <xmlschema-dev@w3.org>
Message-ID: <4D3B77C4.4000902@saxonica.com>

>> In this paper I argue that XML Schemas should implement tight data constraints.
>>
>> More ... http://www.xfront.com/Tight-or-Loose-Data-Contraints.pdf

The line of reasoning appears to be something like:

* In my sample of N names, none of them contains non-ASCII characters.

* I can't be bothered to test whether the system works with non-ASCII
characters.

* Therefore, I'll stop non-ASCII characters getting into the system.

Now, I can understand that there might be projects where cost and
timescale considerations force this kind of approach. But presenting that
as a model for how schemas "should" be designed seems to be turning an
implementation short-cut into a virtue. The ideal to which one should
aspire, surely, is to discover what the actual set of allowed real-world
names is, and to make sure the IT system can handle them all, so that
you don't have to turn users away or force them to enter incorrect names
into the system.
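
As a rough illustration of what is at stake, here is a minimal Python
sketch. It is not taken from the paper or from any real schema: both
patterns, the helper accepts(), and the sample names are assumptions made
up for this example. It compares a hypothetical ASCII-only name rule with
a looser rule that admits (roughly) any Unicode letter.

```python
import re
import unicodedata

# Hypothetical "tight" rule: ASCII letters plus space, apostrophe, hyphen.
ASCII_NAME = re.compile(r"[A-Za-z][A-Za-z' -]*")

# Hypothetical "loose" rule: the same separators, but any word character
# that is not a digit or underscore (approximately: any Unicode letter).
UNICODE_NAME = re.compile(r"[^\W\d_](?:[^\W\d_]|[' -])*")


def accepts(pattern, name):
    """Return True if the whole name matches the given pattern."""
    # Normalise to NFC so that accented letters are single code points.
    normalised = unicodedata.normalize("NFC", name)
    return pattern.fullmatch(normalised) is not None


# Illustrative sample only; not data from the paper under discussion.
names = ["Michael Kay", "Søren Kierkegaard", "Dvořák", "José O'Neill", "李小龍"]

rejected_by_ascii = [n for n in names if not accepts(ASCII_NAME, n)]
rejected_by_unicode = [n for n in names if not accepts(UNICODE_NAME, n)]

print("Rejected by ASCII-only rule:", rejected_by_ascii)
print("Rejected by Unicode rule:   ", rejected_by_unicode)
```

Under these assumed patterns the ASCII-only rule turns away four of the
five sample names, while the looser rule accepts all five. The looser
rule is still a constraint invented by the IT system (it would reject a
name containing a full stop, for instance), so it only reduces the
problem rather than removing it.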

In an ideal world, the constraints in the schema would only reflect 
external constraints that exist in the real world, and not constraints 
imposed by the IT system.

I used an Australian web site last week that wouldn't allow me in unless
I entered a valid Australian postcode. I managed to invent one. Any
system whose validation rules force users to enter false, invented
data is badly designed, in my book. And that includes forcing them to
give an ASCII approximation of their name.

Michael Kay
Saxonica

Received on Sunday, 23 January 2011 00:35:43 UTC