[Bug 3589] Definitions of "schema document" draft proposal for bugs 2822 and 2846 PSVI and processor profiles

http://www.w3.org/Bugs/Public/show_bug.cgi?id=3589

           Summary: Definitions of "schema document" draft proposal for bugs
                    2822 and 2846 PSVI and processor profiles
           Product: XML Schema
           Version: 1.1 only
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Structures: XSD Part 1
        AssignedTo: cmsmcq@w3.org
        ReportedBy: noah_mendelsohn@us.ibm.com
         QAContact: www-xml-schema-comments@w3.org
                CC: noah_mendelsohn@us.ibm.com


This is to record in Bugzilla a comment I made on the telcon of 4 Aug 2006.
The draft at [1] says:

"It is implementation-defined whether a schema processor can read schema
documents in the XML transfer syntax defined here, or in the form of
information sets which correspond to the XML syntax. (See Conformance (§2.4),
which defines "·minimally conforming·" processors as those which cannot read
schema documents in XML form, and "·schema-document aware·" processors as those
which can.)"

My main concern is specifically with the text "schema documents in the XML
transfer syntax defined here", which raises the question of what it means for
something to be "defined" in our recommendation.  My strong preference is that
we reserve the term "defined" for things which are marked up as "[Definition:]
XXXX".  In the particular case of the term -schema document- we have [2]:

"To provide for this in an appropriate and interoperable way, this
specification provides a normative XML representation for schemas which makes
provision for every kind of schema component. [Definition:]  A document in this
form (i.e. a <schema> element information item) is a schema document."

I think it's pretty clear that what we define as a schema document is an
"element information item", and hence an abstract Infoset.  Taking that narrow
view of what it means for something to be defined in our recommendation, I
don't think we define an XML Transfer Syntax for XML Schema Documents.

I think the closest we come is in [3], where we say:

"For interoperability, serialized ·schema documents·, like all other Web
resources, should be identified by URI and retrieved using the standard
mechanisms of the Web (e.g. http, https, etc.) Such documents on the Web must
be part of XML documents (see clause 1.1), and are represented in the standard
XML schema definition form described by layer 2 (that is as <schema> element
information items).

Note:  there will often be times when a schema document will be a complete XML
document whose document element is <schema>. There will be other occasions in
which <schema> items will be contained in other documents, perhaps referenced
using fragment and/or XPointer notation. "

Here I think we're referring to serializations, but not >defining< anything.
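
As a rough illustration of the note quoted above, here is a minimal Python
sketch that finds <schema> element information items whether they are the
document element or are embedded in some other document.  It uses the standard
library's xml.etree.ElementTree; the sample documents and the helper name
find_schema_items are made up for illustration, and the sketch does not
attempt fragment or XPointer resolution.

```python
import xml.etree.ElementTree as ET

SCHEMA_TAG = "{http://www.w3.org/2001/XMLSchema}schema"

# Hypothetical wrapper document carrying one embedded <schema> item.
WRAPPER = """\
<doc xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <title>Example</title>
  <xs:schema id="s1" targetNamespace="http://example.org/ns"/>
</doc>
"""

# A complete document whose document element is <schema>.
STANDALONE = '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"/>'


def find_schema_items(xml_text):
    """Return every <schema> element in the document, root included."""
    root = ET.fromstring(xml_text)
    # Element.iter() walks the root and all descendants in document order.
    return list(root.iter(SCHEMA_TAG))
```

Both inputs yield a <schema> element, which is the thing the definition
actually names; the surrounding document is only the carrier.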

For the same reason, I'm concerned about the phrase that says: "or in the form
of information sets which correspond to the XML syntax".  That comes close to
implying that we define only the serialization, and that there merely happens
to be a corresponding Infoset.  For the reasons given above, I think the
reverse is true: the formal definition of schema document is as an Infoset,
and for each such Infoset there is a class of serialized documents that
correspond to it (differing, e.g., in whether their attributes use single or
double quotes, in the order of attribute serialization, etc.).
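
To make the "class of serializations" point concrete, here is a minimal Python
sketch, assuming the standard library's xml.etree.ElementTree as a rough
stand-in for an Infoset (it does not expose the full XML Information Set).
The two sample strings and the helper name same_element_info are hypothetical.

```python
import xml.etree.ElementTree as ET

# Two hypothetical serializations: different quote characters and
# different attribute order, but the same element and attributes.
SERIALIZATION_A = (
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'targetNamespace="http://example.org/ns" elementFormDefault="qualified"/>'
)
SERIALIZATION_B = (
    "<xs:schema elementFormDefault='qualified' "
    "targetNamespace='http://example.org/ns' "
    "xmlns:xs='http://www.w3.org/2001/XMLSchema'/>"
)


def same_element_info(a, b):
    """Compare two parsed elements by name, attributes, text and children."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or "") != (b.text or ""):
        return False
    if len(a) != len(b):
        return False
    return all(same_element_info(x, y) for x, y in zip(a, b))


root_a = ET.fromstring(SERIALIZATION_A)
root_b = ET.fromstring(SERIALIZATION_B)
```

The strings differ character for character, yet same_element_info(root_a,
root_b) holds, because quote style and attribute order are not part of the
parsed element information.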

To be clear, I don't object to the spirit of what I think [1] is trying to say,
just to the exact way it's stated.  In the spirit of offering concrete
alternatives when there's a concern, I can think of at least two that would be
fine with me, and I'm sure there are many other simple fixes that would be fine
too:

Alternative 1:

"The exact form in which XML Schema documents are conveyed to a schema
processor is implementation dependent.  In particular, they MAY be read, either
from the Web or from other sources, in the form of XML 1.x serializations,
and/or they MAY be conveyed through other means.   (See Conformance (§2.4),
which defines "·minimally conforming·" processors as those which cannot read
schema documents in XML form, and "·schema-document aware·" processors as those
which can.)"

Alternative 2:

(add a definition and use it)

[DEFINITION:] A -serialized XML schema document- is an XML 1.x document
corresponding to an XML -schema document- infoset.

Then we can use something closer to the original text:

"It is implementation-defined whether a schema processor accepts schema
information in the form of -serialized XML schema documents- and/or in some
other form that conveys the -schema document- Infoset. (See Conformance (§2.4),
which defines "·minimally conforming·" processors as those which cannot read
schema documents in XML form, and "·schema-document aware·" processors as those
which can.)"
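
One way to picture the distinction in Alternative 2 is a processor entry point
that accepts either form.  The following is only a sketch under stated
assumptions: load_schema_infoset is a hypothetical helper, str/bytes input
stands for a serialized XML schema document, and an ElementTree Element stands
for "some other form that conveys the schema document Infoset".

```python
import xml.etree.ElementTree as ET

SCHEMA_TAG = "{http://www.w3.org/2001/XMLSchema}schema"


def load_schema_infoset(source):
    """Return a <schema> element from serialized text or an existing element.

    Hypothetical helper: str/bytes input is treated as a serialized XML
    schema document; an Element is treated as an already-built infoset.
    """
    if isinstance(source, (str, bytes)):
        # Serialized form: parse the XML 1.x document.
        element = ET.fromstring(source)
    elif isinstance(source, ET.Element):
        # Infoset conveyed by other means: use it directly.
        element = source
    else:
        raise TypeError("expected serialized XML or an Element")
    if element.tag != SCHEMA_TAG:
        raise ValueError("root element is not <schema>")
    return element
```

In these terms, a processor that supports only the second branch would never
parse XML text itself, while one that supports the first branch also reads
serialized documents.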

A couple of other nits: I think references to schema document should hyperlink
to the definition.  A reference to 4.3.1 might also be helpful, though I'm
less sure about that.  Thanks!

Noah

[1]
http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.rq144.200607.html#infoset
[2]
http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.rq144.200607.html#key-schemaDoc
[3]
http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.rq144.200607.html#schema-repr

Received on Tuesday, 8 August 2006 22:21:18 UTC