Addressing the QName to URI Mapping Problem

This posting distills discussions of an issue that has been debated
for quite some time, primarily on the www-rdf-interest list but also
a bit on the www-rdf-logic list.

Since it was pointed out that discussions on those lists are not
necessarily officially addressed by the RDF core working group, I am
taking this opportunity to (re)address this issue to the core working
group specifically.

Please see the following postings and their associated threads for the
complete and original content of the discussions:

[1] "Summary of the QName to URI Mapping Problem"
    http://lists.w3.org/Archives/Public/www-rdf-interest/2001Aug/0061.html

[2] "A Wishlist of QName::URI Mapping Examples"
    http://lists.w3.org/Archives/Public/www-rdf-interest/2001Aug/0117.html

[3] "Dedicated, Standardized URI Scheme for QNames?"
    http://lists.w3.org/Archives/Public/www-rdf-interest/2001Aug/0134.html

[4] "A proposed solution to the RDF syntactic/semantic mapping problem (long)"
    http://lists.w3.org/Archives/Public/www-rdf-interest/2001Jun/0151.html

The following references are also relevant to the discussion below:

[5] "XML Namespaces", James Clark
    http://jclark.com/xml/xmlns.htm

[6] "QNames As Anonymous First Class Objects", Sean B. Palmer 
    http://lists.w3.org/Archives/Public/www-rdf-interest/2001Jun/0216.html

[7] "Issue rdfms-qname-uri-mapping: The mapping of QNames to URI's generates incorrect URI's."
    http://www.w3.org/2000/03/rdf-tracking/#rdfms-qname-uri-mapping


=== Claims ===

The following claims have, IMO, been fully discussed and demonstrated as
valid in the materials referenced above. I will not duplicate that
discussion here.

Claim 1: An XML QName functions as a universal identifier.

Claim 2: An XML QName is not a URI.

Claim 3: Direct suffixation of name to namespace URI can result
         in ambiguity and therefore the integrity of knowledge
         is not preserved.

Claim 4: Direct suffixation of name to namespace does not preserve
         the uniqueness of QNames and therefore is in violation of 
         required behavior defined by the XML Namespaces specification.

Claim 5: Direct suffixation is not compatible with all possible URI 
         schemes and therefore introduces an unreasonable discrimination
         against some forms of namespace URIs.
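
Claims 3 and 4 can be illustrated with a minimal Python sketch. The
function name suffix_map and the http://example.org/ namespaces are
assumptions made purely for illustration; they do not come from the
referenced threads.

```python
def suffix_map(namespace_uri: str, local_name: str) -> str:
    """Direct suffixation: append the local name to the namespace URI."""
    return namespace_uri + local_name


# Two distinct QNames, written as (namespace URI, local name) pairs.
# Both namespaces are hypothetical.
qname_a = ("http://example.org/ab", "c")
qname_b = ("http://example.org/a", "bc")

# After suffixation the two QNames are indistinguishable, so the
# mapping cannot be inverted to recover the original QName.
uri_a = suffix_map(*qname_a)
uri_b = suffix_map(*qname_b)
print(qname_a != qname_b, uri_a == uri_b)
```

The same function applied to the namespace urn:partax:(foo) and the name
bar yields urn:partax:(foo)bar rather than urn:partax:(foo(bar)), the
form used for that resource in the examples later in this posting.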

=== Conclusion ===

RDF must adopt a method of deriving a resource URI from an XML QName
other than the direct suffixation of name to namespace URI that is now
employed.

=== Proposal 1: Official URI Scheme for QNames ===

Possible syntax for a QName URI scheme (adopted from James Clark):

   'qn' ':' '{' <namespace URI> '}' <name>

e.g. 

   qn:{http://www.purl.org/dc/elements/1.1/}date
   qn:{urn:partax:(foo)}bar
   qn:{http://www.w3.org/2000/01/rdf-schema#}subPropertyOf

Thus, a QName would officially have two fully equivalent representations:

1. As an element or attribute name within a serialized instance:

   xmlns:xx="namespace"
   xx:name

2. As a URI:

   qn:{namespace}name

Presuming RDF adopts the practice of XML Schema to support QNames as
resource attribute values, one could use QNames as a universal naming
scheme with a consistent shorthand notation in all serialized RDF
contexts where URIs can occur, and achieve consistency of representation
(using QNames as universal identifiers) both in the knowledge base and
in all serializations. Furthermore, there would be a well-defined,
reliable, bidirectional mapping between QNames in XML serializations and
QName URIs in RDF graphs.

A QName then becomes a truly first-class object on the SW, used as the
primary mechanism for universal naming.

Every RDF parser would map an XML QName to a qn: URI, and therefore
collisions would never occur: every RDF application would get the
same triples, every resource would be identified by a fully valid URI,
and mapping back to QNames for re-serialization would be trivial and
consistent.
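
The two directions of the qn: mapping could be sketched as follows. This
is only an illustration under stated assumptions: the function names are
mine, and the parser assumes that the local name is an XML NCName and so
cannot contain '}', which lets the last '}' mark the end of the
namespace part.

```python
from typing import Tuple

QN_PREFIX = "qn:{"


def qname_to_qn_uri(namespace_uri: str, local_name: str) -> str:
    """Build a URI of the form qn:{namespace}name."""
    return QN_PREFIX + namespace_uri + "}" + local_name


def qn_uri_to_qname(qn_uri: str) -> Tuple[str, str]:
    """Recover (namespace URI, local name) from a qn: URI."""
    if not qn_uri.startswith(QN_PREFIX):
        raise ValueError("not a qn: URI: " + qn_uri)
    # Assumption: the local name never contains '}', so split at the last one.
    namespace_uri, sep, local_name = qn_uri[len(QN_PREFIX):].rpartition("}")
    if not sep or not local_name:
        raise ValueError("malformed qn: URI: " + qn_uri)
    return namespace_uri, local_name


uri = qname_to_qn_uri("urn:partax:(foo)", "bar")
print(uri)
print(qn_uri_to_qname(uri))
```

Because the namespace and the name stay delimited, two QNames that
differ only in where the namespace ends produce different qn: URIs.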

Any equivalences of QNames to non-qn-URIs could be defined with
mechanisms such as daml:equivalentTo or daml:subPropertyOf
(or rdfs:subPropertyOf).

e.g. 

  <qn:{urn:partax:(foo)}bar>  daml:equivalentTo  <urn:partax:(foo(bar))> .

Also, serializations allowing QNames in attribute values of type URI
would greatly simplify human creation (or inspection) of XML instances.

For example, the following minimally verbose RDF XML fragment

  xmlns:mars="http://metia.nokia.com/MARS/2.1"
  xmlns:lang="http://www.iso.ch/3166-1"
  ...
      <mars:language rdf:resource="lang:en" />
  ...

gives us the triple

  [X, qn:{http://metia.nokia.com/MARS/2.1}language,
         qn:{http://www.iso.ch/3166-1}en]

Thus, RDF instances and schemas would be far easier to write by
hand: many abstract resources would be expected to be defined using
QNames, so prefixes could be declared and used anywhere a resource is
referenced or defined in an RDF or RDF Schema XML instance.
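
A parser supporting QNames in attribute values would need a step along
the following lines. This is a hedged sketch, not a specification: the
function name expand_prefixed_value and the rule that a declared prefix
takes precedence over reading the value as an ordinary URI are my own
assumptions.

```python
def expand_prefixed_value(value: str, prefixes: dict) -> str:
    """Expand 'prefix:name' to a qn: URI if the prefix is declared."""
    prefix, sep, local_name = value.partition(":")
    if sep and prefix in prefixes:
        return "qn:{" + prefixes[prefix] + "}" + local_name
    # No declared prefix: leave the value alone as an ordinary URI.
    return value


# Prefix declarations taken from the fragment above.
prefixes = {
    "mars": "http://metia.nokia.com/MARS/2.1",
    "lang": "http://www.iso.ch/3166-1",
}

predicate = expand_prefixed_value("mars:language", prefixes)
obj = expand_prefixed_value("lang:en", prefixes)
print(predicate)
print(obj)
```

The sketch glosses over how to tell a declared prefix from a URI scheme
of the same name; that would have to be settled by whatever rule RDF
adopted.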

=== Proposal 2: Explicit Mapping ===

RDF could have an explicit mapping defined between a QName and a URI.

I use the namespace prefix 'rdfm:' to correspond to a namespace defining
an ontology for RDF serialization mapping constructs, but don't yet bother
to specify any actual namespace URI (since it is fictional). I also don't
vouch for the correctness or existence of any of the URIs in any of the 
examples (they're just examples)...

Each example provides one or more mapping declarations, a sample XML RDF 
fragment, and the equivalent triple(s).

Note that all mappings are fully bi-directional, and thus can be used
for parsing serialized XML data into triples and serializing triples back
to XML.

The following is a brief summary of a possible ontology:

    rdfm:Map              a mapping declaration
    rdfm:resource         the URI of an RDF resource
    rdfm:namespace        the URI of a namespace
    rdfm:name             a namespace qualified name
    rdfm:value            a literal CDATA value in the XML instance
    rdfm:property         the URI of an RDF property serving as context for
                             literal CDATA values in the XML instance
    rdfm:pattern          a regular expression pattern matching CDATA
                             in the XML instance

1. QName <-> URI

   <rdfm:Map
    rdfm:resource  ="urn:partax:(foo(bar))"
    rdfm:namespace ="urn:partax:(foo)"
    rdfm:name      ="bar"
   />

   <rdf:RDF ... xmlns:foo="urn:partax:(foo)">
      <rdf:Description ID="X">
         <foo:bar>bas</foo:bar>
      </rdf:Description>
   </rdf:RDF>

   [X, urn:partax:(foo(bar)), "bas"]

This basic example addresses the most essential need: a mapping
solution that (a) does not create unintentional collisions, and (b)
works with any arbitrary URI scheme. The remaining examples are not
mandatory but could be considered highly desirable.
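
An application might hold the declarations of example 1 in a two-way
table such as the one sketched below. The class QNameMap and its method
names are hypothetical, and the sketch covers only the one-to-one case
of example 1, rejecting any declaration that would make either
direction ambiguous.

```python
class QNameMap:
    """Hypothetical two-way table built from rdfm:Map declarations."""

    def __init__(self):
        self._to_uri = {}    # (namespace, name) -> resource URI
        self._to_qname = {}  # resource URI -> (namespace, name)

    def declare(self, resource, namespace, name):
        qname = (namespace, name)
        # Refuse declarations that conflict with an earlier one.
        if self._to_uri.get(qname, resource) != resource:
            raise ValueError("QName already mapped: %r" % (qname,))
        if self._to_qname.get(resource, qname) != qname:
            raise ValueError("URI already mapped: %r" % (resource,))
        self._to_uri[qname] = resource
        self._to_qname[resource] = qname

    def uri_for(self, namespace, name):
        """Parsing direction: QName to resource URI."""
        return self._to_uri[(namespace, name)]

    def qname_for(self, resource):
        """Serializing direction: resource URI back to QName."""
        return self._to_qname[resource]


mapping = QNameMap()
mapping.declare("urn:partax:(foo(bar))", "urn:partax:(foo)", "bar")
print(mapping.uri_for("urn:partax:(foo)", "bar"))
print(mapping.qname_for("urn:partax:(foo(bar))"))
```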

2. CDATA Literal <-> URI

   <rdfm:Map
    rdfm:resource  ="name:metia.nokia.com/MARS/2.1/language"
    rdfm:namespace ="name:metia.nokia.com/MARS/2.1"
    rdfm:name      ="language"
   />
   <rdfm:Map
    rdfm:resource ="http://www.iso.ch/3166-1/en"
    rdfm:property ="http://metia.nokia.com/MARS/2.1/language"
    rdfm:value    ="en"
   />

   <rdf:RDF ... xmlns:mars="http://metia.nokia.com/MARS/2.1">
      <rdf:Description ID="X">
         <mars:language>en</mars:language>
      </rdf:Description>
   </rdf:RDF>

   [X, http://metia.nokia.com/MARS/2.1/language,
       http://www.iso.ch/3166-1/en]

IMO, it's better to have an actual resource http://www.iso.ch/3166-1/en 
identified in our knowledge base than a literal such as 'en'. 

3. CDATA Literal <-> RDF Literal with Validation

   <rdfm:Map
    rdfm:resource  ="name:metia.nokia.com/MARS/2.1/date"
    rdfm:namespace ="name:metia.nokia.com/MARS/2.1"
    rdfm:name      ="date"
   />   
   <rdfm:Map
    rdfm:property ="http://metia.nokia.com/MARS/2.1/date"
    rdfm:pattern  ="[0-9]{4}-[0-9]{2}-[0-9]{2}"
   />

   <rdf:RDF ... xmlns:mars="http://metia.nokia.com/MARS/2.1">
      <rdf:Description ID="X">
         <mars:date>2001-08-15</mars:date>
      </rdf:Description>
      <rdf:Description ID="Y">
         <mars:date>01-8-15</mars:date>
      </rdf:Description>
   </rdf:RDF>

   [X, http://metia.nokia.com/MARS/2.1/date, '2001-08-15']
   [Y, http://metia.nokia.com/MARS/2.1/date, 
              ??? ERROR! Not a valid date!
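
The validation step in example 3 could be sketched as below. The
function name validate_literal is hypothetical, and I am assuming that
the pattern must match the entire literal value; the mapping ontology
above does not yet say so.

```python
import re

# Pattern taken from the rdfm:Map declaration in example 3.
DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def validate_literal(value: str, pattern=DATE_PATTERN) -> str:
    """Return the literal unchanged if the whole value matches, else raise."""
    if pattern.fullmatch(value) is None:
        raise ValueError(
            "literal %r does not match %s" % (value, pattern.pattern)
        )
    return value


print(validate_literal("2001-08-15"))
try:
    validate_literal("01-8-15")
except ValueError as err:
    print("ERROR:", err)
```

Note that the pattern checks only the shape of the value; a string such
as 2001-99-99 would also pass.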


4. CDATA Literal <-> URI by Pattern

   <rdfm:Map
    rdfm:resource  ="name:metia.nokia.com/MARS/2.1/language"
    rdfm:namespace ="name:metia.nokia.com/MARS/2.1"
    rdfm:name      ="language"
   />
   <rdfm:Map
    rdfm:resource ="http://www.iso.ch/3166-1/en"
    rdfm:property ="http://metia.nokia.com/MARS/2.1/language"
    rdfm:pattern  ="en(-[a-z][a-z])?"
   />

   <rdf:RDF ... xmlns:mars="http://metia.nokia.com/MARS/2.1">
      <rdf:Description ID="X">
         <mars:language>en-us</mars:language>
      </rdf:Description>
      <rdf:Description ID="Y">
         <mars:language>en</mars:language>
      </rdf:Description>
   </rdf:RDF>

   [X, http://metia.nokia.com/MARS/2.1/language,
       http://www.iso.ch/3166-1/en]
   [Y, http://metia.nokia.com/MARS/2.1/language,
       http://www.iso.ch/3166-1/en] 

Note that both values are mapped to the same language resource,
regardless of the optional dialect code.
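
The parsing direction of example 4 could look roughly like this. The
table layout and the function name map_literal are assumptions of mine,
as is the choice to leave unmatched values as plain literals.

```python
import re

# One row per rdfm:Map declaration from example 4:
# (property URI, compiled pattern, resource URI).
PATTERN_MAPS = [
    (
        "http://metia.nokia.com/MARS/2.1/language",
        re.compile(r"en(-[a-z][a-z])?"),
        "http://www.iso.ch/3166-1/en",
    ),
]


def map_literal(property_uri: str, value: str) -> str:
    """Return the mapped resource URI, or the literal itself if none applies."""
    for prop, pattern, resource in PATTERN_MAPS:
        if prop == property_uri and pattern.fullmatch(value):
            return resource
    return value


LANGUAGE = "http://metia.nokia.com/MARS/2.1/language"
print(map_literal(LANGUAGE, "en-us"))
print(map_literal(LANGUAGE, "en"))
```

Since several literals map to one URI here, this table alone does not
determine which literal to write when serializing back to XML.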

5. QName synonyms -> common URI

   <rdfm:Map
    rdfm:resource  ="name:metia.nokia.com/MARS/2.1/language"
    rdfm:namespace ="name:metia.nokia.com/MARS/2.1"
    rdfm:name      ="language"
   />
   <rdfm:Map
    rdfm:resource  ="name:metia.nokia.com/MARS/2.1/language"
    rdfm:namespace ="name:metia.nokia.com/MARS/2.1/sf"
    rdfm:name      ="kieli"
   />
   <rdfm:Map
    rdfm:resource ="http://www.iso.ch/3166-1/en"
    rdfm:property ="http://metia.nokia.com/MARS/2.1/language"
    rdfm:value    ="en"
   />

   <rdf:RDF ... xmlns:mars="http://metia.nokia.com/MARS/2.1"
                xmlns:mars-sf="http://metia.nokia.com/MARS/2.1/sf">
      <rdf:Description ID="X">
         <mars:language>en</mars:language>
      </rdf:Description>
      <rdf:Description ID="Y">
         <mars-sf:kieli>en</mars-sf:kieli>
      </rdf:Description>
   </rdf:RDF>

   [X, http://metia.nokia.com/MARS/2.1/language,
       http://www.iso.ch/3166-1/en]
   [Y, http://metia.nokia.com/MARS/2.1/language,
       http://www.iso.ch/3166-1/en]

And why not just deal with cases like 4 and 5 using e.g. daml:equivalentTo?
Because mapping at this level reduces what is already a highly complex
inference space by eliminating what are (from the perspective of a given
application using such mappings) irrelevant syntactic variants or
localized vocabularies of equivalent semantic identities (resources).

--

The first proposal above is not backwards compatible with RDF 1.0, whereas
the second proposal is fully backwards compatible; however, IMO if
combined with support for QNames as resource attribute values, the first
proposal constitutes a better solution overall than the second proposal,
and moves RDF into closer synchronization with the XML Core standards.

I hope that the above discussion has been clear and to the point, and that
it can serve as a basis for productive discussion which will result in a
satisfactory resolution to this issue.

Regards,

Patrick

--
Patrick Stickler                      Phone:  +358 3 356 0209
Senior Research Scientist             Mobile: +358 50 483 9453
Software Technology Laboratory        Fax:    +358 7180 35409
Nokia Research Center                 Video:  +358 3 356 0209 / 4227
Visiokatu 1, 33720 Tampere, Finland   Email:  patrick.stickler@nokia.com
 

Received on Tuesday, 21 August 2001 05:04:07 UTC