- From: Henry S. Thompson <ht@cogsci.ed.ac.uk>
- Date: 31 Dec 1999 15:32:15 +0000
- To: Roger Costello <costello@mitre.org>
- Cc: xml-dev@ic.ac.uk, www-xml-schema-comments@w3c.org, "Schneider,John C." <jcs@mitre.org>, "Cokus,Michael S." <msc@mitre.org>
Roger Costello <costello@mitre.org> writes:

> Hi Folks,
>
> I have a couple of questions with regards to the use of namespaces in
> XML Schemas.
>
> 1. As has been recently discussed, the method for an XML instance
> document to indicate the XML Schema that it conforms to is with the
> schemaLocation attribute.  For example:
>
> <?xml version="1.0"?>
> <BookCatalogue xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
                 xmlns="http://www.somewhere.org/BookCatalogue"
>                xsi:schemaLocation=
>                  "http://www.somewhere.org/BookCatalogue
>                   http://www.somewhere.org/BookCatalogue/BookCatalogue.xsd">
>     ...
> </BookCatalogue>
>
> At the root element (BookCatalogue) of this XML instance document I am
> using schemaLocation to indicate the XML Schema that it conforms to.
>
> The problem is this: when I defined BookCatalogue (in BookCatalogue.xsd)
> I didn't define any attributes for it.  I certainly didn't define
> xmlns:xsi nor xsi:schemaLocation as attributes.  Thus, this XML
> instance document is invalid, right?

No, it's fine.  Note it has no DTD, so validity (an XML 1.0 concept) is
not relevant.  Defining a DTD for it, which appropriately allowed for
namespace prefixes, would be possible but tedious.

It's SCHEMA-valid (or at least it's not obviously NOT schema-valid,
given the addition of a default namespace declaration as above) because

 a) xmlns:xsi and xmlns are not attributes, they are namespace
    declarations, and they're just fine as such: no declarations for
    them are required in BookCatalogue.xsd;

 b) xsi:schemaLocation is an attribute, but by definition such an
    attribute is always schema-valid provided its contents are
    coherent, which they are in this case.

> The nice thing about DOCTYPE was that it separated the mechanism for
> declaring the associated schema (i.e., the DTD) from the information
> items (i.e., the elements).  With schemaLocation the mechanism for
> declaring the associated schema is intertwined with the information
> items.
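
Points (a) and (b) in the reply above can be checked with any namespace-aware parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree (chosen here only for illustration; it is not mentioned in this thread), with the instance document's elided content left empty. The names INSTANCE, XSI, attribute_names and locations are hypothetical.

```python
import xml.etree.ElementTree as ET

# The instance document from the question, with the default namespace
# declaration added and the elided content ("...") left empty.
INSTANCE = """<?xml version="1.0"?>
<BookCatalogue xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
               xmlns="http://www.somewhere.org/BookCatalogue"
               xsi:schemaLocation="http://www.somewhere.org/BookCatalogue http://www.somewhere.org/BookCatalogue/BookCatalogue.xsd">
</BookCatalogue>"""

XSI = "{http://www.w3.org/1999/XMLSchema/instance}"

root = ET.fromstring(INSTANCE)

# (a) The namespace declarations are consumed by the parser; they do not
#     show up among the element's attributes.
attribute_names = sorted(root.attrib)

# (b) xsi:schemaLocation is a real attribute. Its value is a whitespace
#     separated list of (namespace URI, schema location) pairs.
tokens = root.get(XSI + "schemaLocation").split()
locations = dict(zip(tokens[0::2], tokens[1::2]))

print(root.tag)
print(attribute_names)
print(locations)
```

Only xsi:schemaLocation survives as an attribute, under its expanded name, which is consistent with the point that xmlns:xsi and xmlns need no declarations in BookCatalogue.xsd.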
>
> Thus, it seems that when an XML Schema is written the author must try to
> anticipate how instance documents will use it and add in xmlns:xsi and
> xsi:schemaLocation attributes to the elements being defined in the
> schema.  For my example, I would need to define BookCatalogue as:
>
>     <element name="BookCatalogue">
>         <type>
>             <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
>             <attribute name="xmlns:xsi" type="URI"/>
>             <attribute name="xsi:schemaLocation" type="string"/>
>         </type>
>     </element>
>
> I must be misunderstanding something fundamental.  This is obviously
> ridiculous.

I hope the comments above clarify that you don't need either of those
attribute declarations.

> 2. My second question has to do with referencing elements within an XML
> Schema.  Consider this schema:
>
> <?xml version="1.0"?>
> <!DOCTYPE schema SYSTEM "xml-schema.dtd"[
> <!ATTLIST schema xmlns:cat CDATA #IMPLIED>
> ]>
> <schema xmlns="http://www.w3.org/1999/XMLSchema"
>         targetNamespace="http://www.somewhere.org/BookCatalogue"
>         xmlns:cat="http://www.somewhere.org/BookCatalogue">
>     <element name="BookCatalogue">
>         <type>
>             <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
>         </type>
>     </element>
>     <element name="Book">
>         <type>
>             <element ref="cat:Title"/>
>             <element ref="cat:Author"/>
>             <element ref="cat:Date"/>
>             <element ref="cat:ISBN"/>
>             <element ref="cat:Publisher"/>
>         </type>
>     </element>
>     <element name="Title" type="string"/>
>     <element name="Author" type="string"/>
>     <element name="Date" type="date"/>
>     <element name="ISBN" type="string"/>
>     <element name="Publisher" type="string"/>
> </schema>
>
> Note that we define the Book element and in the BookCatalogue element it
> is referenced using cat:Book
>
>     <element name="BookCatalogue">
>         <type>
>             <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
>         </type>
>     </element>
>
> My understanding is that the reason for prefixing Book with cat: is to
> indicate "the Book element that we are referencing comes from the cat:
> namespace".  The cat: namespace is defined at the top of the schema to
> be the same as the targetNamespace.  Thus, the cat: namespace refers to
> this schema document.

I'd say, more carefully: "The prefix cat denotes a namespace URI which
is the same as the namespace URI identifying the target namespace of
this schema.  Thus references to schema components in that namespace
refer to components defined in this schema."

> Here's my question: it appears to me that namespaces are being used
> here to "point" to things.  In this case, cat: is "pointing" to the
> current document (the XML Schema).  Isn't this a violation of the
> namespace spec, which says that there is no guarantee that there is
> anything at the URI referenced by a namespace?

The fact that you can't depend on dereferencing a namespace URI is
fundamental to our design.  I hope the above gloss helps clarify that
we're not cheating here.

It may be helpful to consider the intermediate case of the <import>
concept.  Here are some excerpts from the schema for schemas, but they
could be from BookCatalogue.xsd:

<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.w3.org/1999/XMLSchema"
        xmlns:x="http://www.w3.org/XML/1998/namespace">
 <import namespace="http://www.w3.org/XML/1998/namespace"
         schemaLocation="http://www.w3.org/XML/1998/xml.xsd"/>
 <element name="info">
  <type content="mixed">
   <any minOccurs="0" maxOccurs="*"/>
   <attribute name="source" type="uri"/>
   <attributeGroup ref="x:lang"/>
  </type>
 </element>
</schema>

The <attributeGroup> element references a group named 'lang' in a
namespace with the namespace URI "http://www.w3.org/XML/1998/namespace",
which we recognise as the namespace for XML itself.  The import
statement tells us we can find a schema for the namespace with that
namespace URI at http://www.w3.org/XML/1998/xml.xsd, and indeed if you
look there you will find a schema with a declaration of an
attributeGroup named 'lang'.

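The two-step lookup just described (prefix to namespace URI, then namespace URI to a schema location supplied by <import>) can be sketched roughly as follows. This is only an illustration in Python under stated assumptions, not how any schema processor is implemented: the tables are copied by hand from the excerpt above, and the names PREFIX_BINDINGS, IMPORTS and resolve_ref are hypothetical.

```python
# Prefix bindings in scope on the <schema> element of the excerpt above.
# The empty string stands for the default (unprefixed) namespace.
PREFIX_BINDINGS = {
    "": "http://www.w3.org/1999/XMLSchema",
    "x": "http://www.w3.org/XML/1998/namespace",
}

# What the <import> element states: namespace URI -> schema location.
IMPORTS = {
    "http://www.w3.org/XML/1998/namespace": "http://www.w3.org/XML/1998/xml.xsd",
}


def resolve_ref(qname, bindings, imports):
    """Hypothetical helper: map a reference such as 'x:lang' to
    (namespace URI, local name, schema location or None)."""
    prefix, _, local_name = qname.rpartition(":")
    namespace_uri = bindings[prefix]  # the prefix only abbreviates the URI
    # The namespace URI is used purely as a lookup key; it is never fetched.
    return namespace_uri, local_name, imports.get(namespace_uri)


resolved = resolve_ref("x:lang", PREFIX_BINDINGS, IMPORTS)
print(resolved)
```

The namespace URI itself is never dereferenced in this sketch. The only address that would actually be retrieved is the schemaLocation value, which is what keeps the design independent of whether anything exists at the namespace URI.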
In other words, <import> establishes the connection between a namespace
URI used in explicit schema references and a schema which discharges
those references, in much the same way that 'xsi:schemaLocation'
establishes the connection between the namespace URI used in IMPLICIT
schema references in an instance and a schema which discharges them.

To close the conceptual loop, you can think of the 'targetNamespace'
attribute on a schema as providing the wherewithal for an implied
<import> statement, e.g.

 <import namespace="http://www.somewhere.org/BookCatalogue"
         schemaLocation=""/>

This is just what is meant by saying that every schema is taken to be
defining components in its target namespace.

Hope this helps,

ht

Note I've tried to be careful to distinguish four things in my answers
above:

  namespaces;
  schemas;
  namespace URIs;
  prefixes.

Although doing this makes things more prolix, it avoids
misunderstandings, and I commend it to you in messages on this topic.
-- 
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
     Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
                     URL: http://www.ltg.ed.ac.uk/~ht/
Received on Friday, 31 December 1999 10:32:19 UTC