Re: Streamable schema-aware processing from C. M. Sperberg-McQueen on 2015-10-15 (public-xsl-wg@w3.org from October 2015)

From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
Date: Thu, 15 Oct 2015 11:50:40 -0600
To: cmsmcq@blackmesatech.com
Cc: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>, "Michael Kay" <mike@saxonica.com>, "Public XSLWG" <public-xsl-wg@w3.org>
Message-Id: <3E110B79-FA4C-4F71-9408-33A55DC5497A@acm.org>

On Oct 8, 2015, at 3:10 PM, cmsmcq@blackmesatech.com wrote:

>> In response to the discussion today, I propose to add the following
>> paragraphs at the end of section 2.10.
>> ...
>> 
>> XSD is designed so that the type annotation of an element can be decided
>> as soon as the start tag of the element is encountered. At this point it
>> is known that the element will either be of a certain type, or it will be
>> invalid. If it turns out to be invalid, then this can always be
>> established by the time the element’s end tag is encountered. To ensure
>> that the XSLT processor never sees invalid data, it is necessary that the
>> schema processor should detect validity errors as early as possible.
> 
> This formulation worries me a little, solely because both the 1.0
> specification of the PSVI and the rules for making an XDM instance from
> a PSVI specify that if an element is invalid against its assigned type,
> it has no type annotation in the PSVI / XDM instance.  When being
> pedantic, therefore, one cannot say that the type annotation is
> determined at the start-tag, only that at the start-tag we can say which
> type annotation the element will have, if it has any type annotation at
> all.

I gather that the WG found my comment unclear.  Let me try again.

The first sentence of the quoted paragraph looks false to me.  It says:

>> XSD is designed so that the type annotation of an element can be decided
>> as soon as the start tag of the element is encountered.

But consider the schema denoted by the following schema document, and the
two following sample elements:

  <schema xmlns="http://www.w3.org/2001/XMLSchema"
    xmlns:tns="http://example.com/ns1"
    targetNamespace="http://example.com/ns1">
    <element name="e" type="tns:p"/>
    <complexType name="p" mixed="true">
      <sequence/>
    </complexType>
  </schema>

  <e>This is example 1.</e>
  <e>This is <e>example 2</e>.</e>

In each example, we know the same thing at the end of the start tag.

But what we know is not what the type annotation of the element will be:
in the PSVI, the first example has a type annotation of {http://example.com/ns1}p
and the second example has no type annotation at all.  At the end of the two
start-tags, we did not know that they would have different type annotations.
Contrary to what the sentence says, then, it is not possible to decide the
type annotation of an element as soon as its start tag is encountered.
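
To make the timing concrete, here is a rough Python sketch of what a
streaming consumer could record for the two sample elements.  It is not a
schema processor: the class AnnotationTracker and the function trace are
hypothetical names of my own, namespaces are ignored, and the only rule it
checks by hand is that type p allows text but no child elements.  What it
reports at the start tag is the type the element is assessed against; what
it reports at the end tag is the type annotation, or None when the element
turned out to be invalid.

```python
import xml.sax

# Assumption: every <e> is assessed against this one type, as in the sample schema.
GOVERNING_TYPE = "{http://example.com/ns1}p"


class AnnotationTracker(xml.sax.ContentHandler):
    """Hand-rolled stand-in for validation against type p.

    Type p is mixed with an empty content model, so character data is
    allowed but any child element makes the parent invalid.
    """

    def __init__(self):
        super().__init__()
        self.stack = []   # one state dict per currently open element
        self.events = []  # ("start", type known so far) / ("end", final annotation)

    def startElement(self, name, attrs):
        if self.stack:
            # A child element violates the empty content model of its parent.
            self.stack[-1]["valid"] = False
        self.stack.append({"valid": True})
        # At the start tag we know only which type the element is checked against.
        self.events.append(("start", GOVERNING_TYPE))

    def endElement(self, name):
        state = self.stack.pop()
        # Only at the end tag is the annotation settled: the type, or nothing.
        annotation = GOVERNING_TYPE if state["valid"] else None
        self.events.append(("end", annotation))


def trace(document):
    """Return the list of (event, type) pairs seen while streaming the document."""
    handler = AnnotationTracker()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return handler.events
```

Running trace on each sample gives the same first "start" event for both,
while the final "end" event carries the type name for example 1 and None
for the outer element of example 2.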

The point I am making is in some ways a pedantic editorial one:  the words
are almost right, but not quite right, and we should fix them so they are right.  
XSD is in fact designed so that SOMETHING important is known as soon as 
the start tag of the element is encountered -- but XSD 1.0 gives it no name 
and uses the obvious name to mean something else.  

Once again, I am embarrassed and apologize for the gratuitous difficulties XSD
places in the way of those who wish to use it or talk about it.  Those of us 
responsible for XSD 1.0 botched many things, and this is one of many ways
in which our failures make life difficult for others.

The term 'type annotation' refers, I think, either to the PSVI's [type definition]
property or to the XDM 'type name' property, and neither of these can be
determined at the time we see the start tag.

If we are willing to use an XSD 1.1 term without explanation, we could 
say that XSD is designed so that the GOVERNING TYPE DEFINITION
of an element can be decided as soon as the start tag of the element is 
encountered.

Michael

Received on Thursday, 15 October 2015 17:51:09 UTC