
Re: XSD validation, ambiguous root XML instance element

From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
Date: Sat, 31 Oct 2020 11:49:19 -0600
Cc: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>, "xmlschema-dev@w3.org" <xmlschema-dev@w3.org>
Message-Id: <FB015971-BC4E-473A-9799-971A3D6E1035@blackmesatech.com>
To: Mukul Gandhi <gandhi.mukul@gmail.com>

> On 31 Oct 2020, at 5:48 AM, Mukul Gandhi <gandhi.mukul@gmail.com> wrote:
> ...
> Due to an xs:include that I'm using, within the XSD document x.xsd, the valid XML instance root elements can be X, p or q. But, I wish that, the XML schema for my example above, should prohibit the XML instances root elements p or q. I wish that, only an XML instance element X should be a valid XML instance root. Is this achievable, with the current XSD 1.0 or 1.1 languages?

It is not achievable from inside the schema document, except by making every other element declaration local, which makes re-use impossible.
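To make the local-declaration workaround concrete, here is a minimal sketch. The schema is hypothetical: only the element names X, p and q come from your question, and the content model and types are invented for illustration. The Python fragment uses only the standard library and lists the global element declarations, which are the ones a validator can normally match against the root element of a document.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: X is global; p and q are declared locally,
# inside the content model of X, so they are not top-level declarations.
LOCAL_ONLY_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="X">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="p" type="xs:string"/>
        <xs:element name="q" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

def global_element_names(xsd_text):
    """Return the names of element declarations that are direct children of xs:schema."""
    schema = ET.fromstring(xsd_text)
    # A path without '//' matches direct children only, i.e. global declarations.
    return [e.get("name") for e in schema.findall(f"{{{XS}}}element")]

print(global_element_names(LOCAL_ONLY_XSD))
```

Because p and q are no longer global here, no other schema document can refer to them with ref=, and that is the loss of re-use I mean.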

As I understand the prose in the description of validation in section 5.2, the spec expects that specifying the name of the root element should be a run-time option in validators that support it, like the choice of what to do if no element declaration is found.  But I don’t have the impression that most implementors got the message.
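Where a validator does not offer such an option, the caller can approximate it with a check made outside the validator. The sketch below is only an assumption about how one might do that in Python with the standard library; `check_root` and its parameters are hypothetical names. It compares the name of the document element and nothing more, so it is not a substitute for schema validation.

```python
import xml.etree.ElementTree as ET

def check_root(xml_text, expected_local_name, expected_namespace=None):
    """Return True if the document element has the expected expanded name."""
    root = ET.fromstring(xml_text)
    # ElementTree writes namespaced names as '{namespace-uri}local-name'.
    if expected_namespace:
        expected_tag = f"{{{expected_namespace}}}{expected_local_name}"
    else:
        expected_tag = expected_local_name
    return root.tag == expected_tag

print(check_root("<X><p>a</p><q>b</q></X>", "X"))  # document element is X
print(check_root("<p>a</p>", "X"))                  # document element is p, not X
```

A caller would run a check like this before handing the document to whatever validator is in use, so the restriction stays a choice made by the schema user rather than a property of the schema.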

> If not, I'd like to suggest following changes to my XSD document x.xsd,
> ...
> i.e, I wish the following change to XSD language,
> We should be able to specify an optional xs:root element as child of xs:schema element, that can mention XML schema validation constraints for the XML instance root element. With the above XSD document example, xs:root element specifies that, XML elements p and q cannot be the root XML elements of the XML document that's validated.
> Any thoughts please?   

The assertions contained in your xs:root element — they are to be applied on the validation root of any schema validity assessment?  Or on the root node of the document?  

Judging by your example, not the root node of the document: if it were the root node (in the XDM sense) or the document information item, your test not(self::p or self::q) would always succeed, since a document node is never an element named p or q.

But if it’s the root node of the validation episode, you seem to be making a very strong assumption about my environment and my goals in using your schema to validate my data.  Do you really want to say, as part of the schema, that no user is ever allowed to want to validate a single ‘q’ element in a document?  That seems a very odd division of labor between schema creator and schema user.  In other respects, it is the user who controls validation:  choice of validator, choice of schema, choice of the time at which to invoke the validator, choice of run-time options, including (as 5.2 describes validation) the choice of what subtree of the input should be validated and the choice of what element declaration, attribute declaration, or type definition should be used to validate that node.

Since implementors seem to have mostly ignored the implications of 5.2 for their interfaces, I think a case can be made that it would be more convenient for the common case if the spec allowed a schema to say something about the documents it is intended for, when it is intended to apply to whole documents.  But unless things are very carefully defined, any such provisions are likely to get in the way of users using a schema to validate portions of XML documents, and they will certainly complicate the rules for things like include, import, redefine, and override.

A different solution to the way implementors have ignored section 5.2 would be to make information about how validation can in principle be invoked easier to find (not buried in a section so inconspicuous that almost all readers overlook it) and to make the spec a little more explicit about the implications of that information for implementations.  But that would require (a) a rather different editorial approach from the one in the current version of the XSD spec, and (b) a greater tolerance in the WG for actually saying things in the spec which attempt to set expectations about software behavior.

I hope this helps.


C. M. Sperberg-McQueen
Black Mesa Technologies LLC
Received on Saturday, 31 October 2020 17:49:39 UTC
