locating schemas via public ids

During this week's telcon of the OASIS Entity Resolution Technical 
Committee (OERTC) [1], we discussed what is recorded in our issues
document [2] as "Issue 13 - Public identifiers for schema locations",
and I was tasked [3] with sending our comment to the XML Schema group
and the XML CG.  (Since OASIS is both a liaison group to the W3C and
a W3C member, we figured it would be safest to include the 
XML CG in this loop.)  Our comment follows.

----------------------------------------------------------------
The OASIS Entity Resolution Technical Committee (OERTC) [1] is
chartered to develop an entity resolution catalog format in XML
(XML Catalog or xmlcat).  This catalog format is intended to cover
the functionality of the SGML Open/OASIS TR9401 [4] Entity
Management Catalog, but using XML instance syntax and tailored
for use with XML.

Many implementors and users have found public ids and entity
management catalogs to be very useful in practical situations
ranging from individual use to major production environments, and
there is a desire to be able to use such techniques for accessing 
XML resources, especially "public" resources such as published
DTDs and Schemas.  The OERTC has the support of several implementors,
and its work has received interest from the xml-dev community.

During our work, we realized that the current Schema Structures
draft appears to make it impossible to provide schema-locating
hints using anything other than URIs.  Specifically, public
identifiers [5] could not be used as the spec is currently written.

Section 6.3.2 of Structures, "How schema definitions are located on
the Web" [6], says:
  [xsi:schemaLocation] records the author's warrant with pairs of
  URI references (one for the namespace URI, and one for a hint as
  to the location of a schema document defining names for that
  namespace URI). [xsi:noNamespaceSchemaLocation] similarly provides
  a URI reference as a hint as to the location of a schema document
  with no targetNamespace.

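The problem is that each member of schemaLocation, and the value of
noNamespaceSchemaLocation, is required to be a URI reference.
Furthermore, the members are undelimited and separated only by
whitespace, and public identifiers can themselves contain spaces,
so a public identifier placed in the list could not be told apart
from its neighbors.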
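
To illustrate the delimiter problem, here is a minimal Python sketch.
It assumes a processor that splits the schemaLocation value on
whitespace and pairs up the resulting tokens; the function name, the
example.org URIs, and the sample public identifier are hypothetical
and chosen only for illustration.

```python
def parse_schema_location(value):
    """Split an xsi:schemaLocation value into (namespace, hint) pairs.

    Assumption: tokens are separated by whitespace and taken two at
    a time, as the quoted text from Structures suggests.
    """
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("schemaLocation needs an even number of tokens")
    return list(zip(tokens[0::2], tokens[1::2]))


# With a URI as the hint, the single pair is recovered intact.
uri_value = "http://example.org/ns http://example.org/ns.xsd"
uri_pairs = parse_schema_location(uri_value)

# With a (hypothetical) public identifier as the hint, the spaces
# inside the identifier are indistinguishable from the separators,
# so the value is silently read as two unrelated pairs.
pubid_value = "http://example.org/ns -//Example//DTD Sample Schema//EN"
pubid_pairs = parse_schema_location(pubid_value)
```

In this sketch the second call does not even raise an error: the
public identifier happens to break into three tokens, so the total
is even and the mis-pairing goes unnoticed.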

The XML Catalog (as did TR9401 before it) would allow a user
to locate a resource using all the information that might be 
known about it (name, system id, public id), and certainly schema 
resources will be given public ids (several have already [7]).  But 
this only works if there is some way to include that information in 
schemaLocation.  Specifically, the second "member" of each "pair"
in schemaLocation--and the value of noNamespaceSchemaLocation--should 
be able to contain all the external identifier information representable 
in XML 1.0 production 75 [8], not just the information in the first half 
of that production as is currently the case.
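
As a rough sketch of the kind of lookup this would enable, the
following Python fragment resolves a schema hint that carries both
a public and a system identifier.  It is a simplification under
stated assumptions, not the TR9401 or XML Catalog resolution
algorithm: the catalog contents, the resolve function, and the
order in which the identifiers are tried are all hypothetical.

```python
from typing import Optional

# Hypothetical catalog mapping identifiers to local copies.
CATALOG = {
    "public": {
        "-//Example//DTD Sample Schema//EN": "file:///schemas/sample.xsd",
    },
    "system": {
        "http://example.org/ns.xsd": "file:///schemas/ns.xsd",
    },
}


def resolve(public_id: Optional[str],
            system_id: Optional[str],
            catalog=CATALOG) -> Optional[str]:
    """Return a location for the resource, or None if nothing is known.

    Assumption: try the system id first, then the public id, and
    fall back to the system id unchanged when no entry matches.
    """
    if system_id is not None and system_id in catalog["system"]:
        return catalog["system"][system_id]
    if public_id is not None and public_id in catalog["public"]:
        return catalog["public"][public_id]
    return system_id


# A hint with only a public id can still be resolved via the catalog.
local_copy = resolve("-//Example//DTD Sample Schema//EN", None)
```

With a URI-only schemaLocation, the public_id argument would always
be empty, and the second lookup could never apply.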

Therefore, the OERTC asks that the XML Schema WG make allowances
in schemaLocation for specifying both public and system ids.  
Though we realize that XML Schemas is nearly ready to go to PR, 
we feel this is a crucial need that should be addressed in 
the 1.0 version of XML Schemas, and we hope that the WG can find 
some way to add this capability.

Paul Grosso for 
 Lauren Wood, chair of the OASIS Entity Resolution Technical Committee,
 Laura Walker, OASIS AC representative, and
 the OASIS Entity Resolution Technical Committee

[1] http://www.oasis-open.org/committees/entity/
[2] http://www.oasis-open.org/committees/entity/issues.html
[3] http://lists.oasis-open.org/archives/entity-resolution/200101/msg00024.html
    (Issue 8 in the minutes)
[4] http://www.oasis-open.org/committees/entity/9401.html
[5] http://www.w3.org/TR/REC-xml#NT-PubidLiteral
[6] http://www.w3.org/TR/2000/CR-xmlschema-1-20001024/#schema-loc
[7] http://www.oasis-open.org/committees/entity/ident.html#schema
[8] http://www.w3.org/TR/REC-xml#NT-ExternalID

Received on Thursday, 25 January 2001 12:45:51 UTC