- From: David Beech <dbeech@us.oracle.com>
- Date: Thu, 25 May 2000 19:08:09 -0700
- To: www-xml-schema-comments@w3.org
Noah_Mendelsohn@lotus.com wrote:
>
> >> [Lauren Wood wrote:] there is no infoset defined for the schema

Fortunately, there is. One of the reasons, perhaps the main reason, that we elected to use XML syntax for XML Schema was surely so that generic XML technology and tools are applicable to schema documents. Hence a well-formed schema document will have an infoset, and if validated against the Schema for Schemas it will have the additions of a PSV-infoset.

I have been thinking about this for a while now, because of its relevance to XML Query or to anyone who needs to extract information from a schema document. Humans at least are likely to have a mental picture of the XML representation of a schema document when they think about querying it. And it is very helpful to query "data" documents and "schema" or "metadata" documents in the same way, using the common abstract model of the infoset for all of them. This is exactly how database schemas are queried via tables and views, just like data tables and views, and sharing the same abstract relational model.

The suspense may have been building up to this point, since I appear to be heading for a fall, having overlooked the fact that the question was about the infoset for a "schema" and not for a "schema document". However, I just wanted to be clear about schema documents first, and then say that if multiple schema documents are used to form a "schema", then the set of their infosets provides the infoset information for the whole "schema".

[I fear that the fact that a "schema" is not in general to be identified with a <schema> is already confusing enough to our readers. I would be really opposed to that confusion spreading to having two different kinds of infoset for them, two different kinds of DOM, two different kinds of Query, etc. - if it is at all possible to avoid this. I believe it is.]
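As a rough illustration of the point about generic XML tools, the sketch below reads two schema documents with an ordinary XML parser and queries them together, treating the set of their parsed trees as the information for the whole "schema". It is only a sketch under stated assumptions: the two documents and the helper name `element_declarations` are hypothetical, the namespace URI is that of the later XML Schema Recommendation rather than the draft current at the time of this message, and a parsed element tree is only an approximation of an infoset.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI of the final XML Schema Recommendation; the
# drafts current when this message was written used a different URI.
XSD = "http://www.w3.org/2001/XMLSchema"

# Two hypothetical schema documents that together form one "schema".
DOC_A = f"""<xs:schema xmlns:xs="{XSD}">
  <xs:element name="order" type="OrderType"/>
</xs:schema>"""

DOC_B = f"""<xs:schema xmlns:xs="{XSD}">
  <xs:element name="item" type="xs:string"/>
  <xs:complexType name="OrderType"/>
</xs:schema>"""


def element_declarations(documents):
    """Return (name, type) for each top-level <element> across all documents."""
    found = []
    for text in documents:
        # Parse the schema document exactly as any other XML document.
        root = ET.fromstring(text)
        # Select the <element> children of <schema> by qualified name.
        for decl in root.findall(f"{{{XSD}}}element"):
            found.append((decl.get("name"), decl.get("type")))
    return found


declarations = element_declarations([DOC_A, DOC_B])
print(declarations)
```

Nothing here is specific to schemas: the same parse-and-select calls would work on any "data" document, which is the uniformity argued for above.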
My hypothesis is that it should be possible to express the additional information that is discovered about a schema during validation, and that shows up in what we currently describe as a different kind of "component and properties" model, as further additions to the PSV-infoset for a schema document. We would then have a PSDV-infoset (i.e. a Post-SchemaDocument-Validation-infoset), which is an expanded PSV-infoset when the instance being validated happens to be a schema document. In that way, DOM and Query and others don't have to deal with two different Infoset models, but can just extend gracefully (I just had to correct a Freudian slip - "gratefully").

This is only a hypothesis because I haven't had time to check details, and to see what of the component information is local to a schema document and what requires assembly of a "schema".

This last point is also interesting from another angle - I recall seeing a comment recently from someone who would like to have schema documents validated per se before being pressed into service. As I understand it, our Structures spec only validates schema documents (Constraints on Schemas, etc.) when using them to validate some instance document. The difference may be only slight, but we might find one or two small things if we tried to separate out "schema document validation" like that.

>
> I am increasingly intrigued by the notion (which I have mentioned
> privately to one or two members of the workgroup) that we should rename our
> schema components "element declaration information item", "complex type
> definition item", etc. We have gone to great lengths to define the analog
> of infoset for schemas, and it is obvious that there is confusion about
> what we have done.

Maybe I'm too optimistic, but in the light of the above, might it be possible to avoid having both an infoset and an analog of it? e.g. the EII in the Infoset for an <element> declaration in a schema document already contains much of the information, so could we just add to it rather than create this other overlapping EDII? I don't see it as being much different to work with, and the simplification of staying within one infoset model would be rewarding.

Regards,
David
Received on Thursday, 25 May 2000 22:09:34 UTC