RE: Potential new issue: PSVI considered harmful from Dare Obasanjo on 2002-06-12 (www-tag@w3.org from June 2002)

From: Dare Obasanjo <dareo@microsoft.com>
Date: Wed, 12 Jun 2002 12:09:40 -0700
To: "Tim Bray" <tbray@textuality.com>, <www-tag@w3.org>
Message-ID: <8BD7226E07DDFF49AF5EF4030ACE0B7E06621CD7@red-msg-06.redmond.corp.microsoft.com>
-----Original Message----- 
From: Tim Bray [mailto:tbray@textuality.com] 
Sent: Wed 6/12/2002 11:13 AM 
To: www-tag@w3.org 
Cc: 
Subject: Potential new issue: PSVI considered harmful


<Tim Bray> [Introductory note: I am not a W3C XML Schema expert, a PSVI expert, or
an XQuery expert, so it's perfectly possible that I'm way off-base here.
My feelings won't be hurt in the slightest if someone points this out.]

[Dare Obasanjo] OK, no harm meant by the following comments on my part. 

<Tim Bray> The notion of a PSVI (Post-Schema Validation Infoset) has arisen out of
the W3C XML schema work, and is finding use in XPath2 and XQuery.  The
PSVI is distinguished from the normal XML infoset as follows:

  - the addition of default element/attribute values provided in the schema
  - addition of type information declared in the schema, i.e. you can
tell that the content of this attribute is supposed to be a date, and of
that element to be a floating-point number in the range -1.0..+1.0.


[Dare Obasanjo] I'd also add 

- addition of information regarding the success, or lack thereof, of validation of the element and attribute information items in the infoset. 

- addition of tables that show the relationship bindings for identity constraints (key/keyref, ID/IDREF) for the element information items in the XML document (the W3C XML Schema REC suggests that these probably shouldn't be surfaced to applications) 
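
As a rough illustration of the additions listed above, here is a minimal sketch of what a PSVI-style view adds on top of the plain infoset for one element: defaulted values, type names, and a validity flag. The SCHEMA dictionary, the type names and the augment() function are all hypothetical stand-ins invented for this example; none of this is W3C XML Schema syntax or any real validator's API, and identity-constraint tables are left out.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical stand-in for a schema: per-attribute type name and default.
SCHEMA = {
    "reading": {
        "when": {"type": "date", "default": None},
        "unit": {"type": "string", "default": "celsius"},
        "bias": {"type": "unit-float", "default": "0.0"},  # range -1.0..+1.0
    }
}

def check(type_name, text):
    """Return True if text is an acceptable value for the named type."""
    try:
        if type_name == "date":
            date.fromisoformat(text)
        elif type_name == "unit-float":
            return -1.0 <= float(text) <= 1.0
        return True
    except ValueError:
        return False

def augment(element):
    """Return the extra per-attribute information a PSVI-like view would carry."""
    result = {}
    for name, decl in SCHEMA.get(element.tag, {}).items():
        defaulted = name not in element.attrib
        value = element.attrib.get(name, decl["default"])
        if value is None:
            continue  # attribute absent and no default declared
        result[name] = {
            "value": value,
            "type": decl["type"],
            "valid": check(decl["type"], value),
            "defaulted": defaulted,
        }
    return result

info = augment(ET.fromstring('<reading when="2002-06-12" bias="1.5"/>'))
for name, facts in info.items():
    print(name, facts)
```

In this sketch the plain infoset only ever contains "when" and "bias"; "unit" appears, and "bias" is flagged as out of range, only once the schema stand-in has been consulted.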

<Tim Bray> The problem is that we are making the old SGML error all over again.  An
SGML document can't be parsed at all without reading the schema (DTD),
and the DTD conflated primitive typing, parsing support, entities,
default values, and other stuff in a really messy way.


[Dare Obasanjo] I'm not sure what this has to do with the PSVI. There is nothing in W3C XML Schema, XPath 2.0 or XQuery that mandates that an XML document must have a schema in order to be consumed. In fact, as Noah Mendelsohn pointed out earlier on this list, even the xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes are optional; they neither mandate validation of the document nor dictate which W3C XML Schema schema to use. 
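
To make the point about xsi:schemaLocation concrete, here is a small sketch using Python's standard xml.etree.ElementTree. The namespace urn:example:orders and the file name orders.xsd are made up for the example. The parser never fetches or reads the schema; the hint is just an ordinary attribute that a consumer may inspect or ignore.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical instance document; orders.xsd does not need to exist.
DOC = """<order xmlns="urn:example:orders"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="urn:example:orders orders.xsd">
  <item>widget</item>
</order>"""

# Parsing succeeds with no schema available anywhere.
root = ET.fromstring(DOC)

# The hint is a whitespace-separated list of (namespace, location) pairs.
pairs = root.get("{%s}schemaLocation" % XSI).split()
locations = dict(zip(pairs[0::2], pairs[1::2]))

print(locations)
print(root.find("{urn:example:orders}item").text)
```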

On the other hand, XML 1.0 itself is tied very closely to DTDs and the attendant conflation of entities, default values, notations and weird primitive typing (NMTOKEN, CDATA, IDREF???).
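
The DTD-level default values mentioned here can be observed directly with Python's standard xml.parsers.expat module. The sketch below is only an illustration; the document and the attributes_seen() helper are invented for it. The same element reports different attributes depending on whether the parser is asked to include values derived from the ATTLIST declaration.

```python
import xml.parsers.expat

# Hypothetical document: the internal DTD subset declares a default for lang.
DOC = '<!DOCTYPE note [<!ATTLIST note lang CDATA "en">]><note/>'

def attributes_seen(doc, only_specified):
    """Collect the attributes expat reports for the elements in doc."""
    seen = {}
    parser = xml.parsers.expat.ParserCreate()
    # When true, expat reports only attributes written in the instance,
    # not those derived from attribute declarations in the DTD.
    parser.specified_attributes = only_specified

    def start(name, attrs):
        seen.update(attrs)

    parser.StartElementHandler = start
    parser.Parse(doc, True)
    return seen

print(attributes_seen(DOC, only_specified=False))
print(attributes_seen(DOC, only_specified=True))
```

The first call includes the defaulted lang attribute and the second does not, even though the element itself is written identically in both cases.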

<Tim Bray>
There is nothing wrong whatsoever in annotating XML with type
information, but the PSVI suffers from the following flaws:

1. the inclusion of default values.  These are sufficiently problematic
that the IETF is about to recommend they not be used at all, and I for
one think there is a good case that they should be deprecated for
architectural reasons.

[Dare Obasanjo] Can you expand on what problems default values in the PSVI cause, and also on how they differ from those caused by default values in an XML document's DTD? 


<Tim Bray>
2. the notion that annotation is necessarily linked to validation.  The
problem here is with the "PSV" part of the name: there's nothing wrong
with a Type-Augmented Infoset (TAI), but why link it to validation?

It may be the case that the TAI is based on the simple data types from
XSD (although I would argue for cutting back to a more tractable and
less bloated subset), but the connection to schema if any should not
have anything to do with schema *processing*.
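
One way to picture type annotation without schema processing is to read type names that are already present in the instance. The sketch below uses the existing xsi:type attribute for that purpose; whether this is what a Type-Augmented Infoset would look like is purely an assumption made for illustration, and the sample document and type_annotations() helper are hypothetical.

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Hypothetical instance carrying its own type annotations.
DOC = """<sample xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <taken xsi:type="xs:date">2002-06-12</taken>
  <bias xsi:type="xs:float">0.25</bias>
  <note>no annotation</note>
</sample>"""

def type_annotations(root):
    """Map each child tag to its xsi:type string, or None if unannotated.

    ElementTree does not resolve the prefix inside the attribute value,
    so the result is the literal text, e.g. "xs:date".
    """
    return {child.tag: child.get(XSI_TYPE) for child in root}

print(type_annotations(ET.fromstring(DOC)))
```

No schema is loaded and nothing is validated here; the consumer simply learns which type each element claims to have.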


[Dare Obasanjo] I can't tell what you are objecting to here. Are you arguing against the name "PSVI" and proposing an alternative mechanism for augmenting an XML infoset with type information to claim this name or...? 


<Tim Bray>

So I recommend a TAG finding along the following lines:

1. Type-augmented XML is a good thing and a recommendation should be
prepared describing it both at the infoset and syntax level. (I gather
there is already some work along these lines in XML Schema?).  Serious
consideration should be given to 80/20 points rather than simply
re-using the plethora of primitive types from XML Schema.

[Dare Obasanjo] Proposing a new working group to duplicate the work of the W3C XML Schema working group needs more justification than you've given. In fact I really haven't seen any.  

<Tim Bray> 2. Type-augmented XML has nothing to say about default values created in
any schema.

[Dare Obasanjo] I agree that default values and type information are orthogonal, but they both exist in the PSVI because it is the Post-Schema-Validation Infoset, which means it includes everything that can be obtained from validating an XML document. 


<Tim Bray> 3. Any software can create and/or use type-augmented XML, whether or not
any validation is being performed.

[Dare Obasanjo] How exactly will this new system you propose avoid conflicting with W3C XML Schema? 


<Tim Bray> 4. Work on XQuery and other things that require a Type-Augmented Infoset
must not depend on schema processing, and should not have normative
linkages to any schema language specifications.


[Dare Obasanjo] I suggest discussing this with the XML Query Working Group before drawing up such proclamations.
Received on Wednesday, 12 June 2002 15:10:19 UTC