
What to do about namespace derived URI refs... (long)

From: <Patrick.Stickler@nokia.com>
Date: Wed, 6 Jun 2001 12:13:41 +0300
Message-ID: <6D1A8E7871B9D211B3B00008C7490AA50795870E@treis03nok>
To: www-rdf-interest@w3.org
Cc: Ora.Lassila@nokia.com

Hey folks,

I've been thinking a lot about this namespace URI reference issue, which
reflects an inherent incompatibility between XML Schema and RDF, and have
been following various discussions here and there. I would like to share
some thoughts on the matter for discussion and offer some proposals towards
a solution.

The discussions about whether to concatenate fragment refs with namespace
URIs with or without intermediate punctuation, such as '#', seem to me to
miss the whole point of the problem. Modifying the RDF spec to have an
algorithm for concatenation that syncs with that of XML Schema is simply
treating the symptom, not curing the disease.
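
To make the two concatenation conventions concrete, here is a minimal
Python sketch. The namespace URI and local name are hypothetical, and
neither function is taken from the RDF or XML Schema specifications; they
only illustrate that the same (namespace, name) pair yields two different
URI references depending on the convention chosen.

```python
def concat_plain(namespace_uri: str, local_name: str) -> str:
    """Append the local name directly to the namespace URI."""
    return namespace_uri + local_name


def concat_fragment(namespace_uri: str, local_name: str) -> str:
    """Join namespace URI and local name with '#', unless one is already there."""
    if namespace_uri.endswith("#"):
        return namespace_uri + local_name
    return namespace_uri + "#" + local_name


# Hypothetical namespace and name, for illustration only.
NS = "http://example.org/vocab"
plain = concat_plain(NS, "status")
fragment = concat_fragment(NS, "status")

print(plain)
print(fragment)
print(plain == fragment)
```

Whichever convention is picked, the result is still just one URI reference
among several possible ones, which is the point argued in this message.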

The root of the problem is that, even though namespaces use URIs to achieve
a set of unique identifiers, which then serve as prefixes for names to in
turn achieve a set of unique names for a global scope, namespace URIs are
neither expected nor required to resolve to any actual data stream. Nor, if
they do correspond to a data stream, are they required to resolve to the
same MIME content type for all URIs.

Since URI reference fragment identifiers are tied to a specific MIME
content type, and namespaces are not, *and* because a given namespace might
have definitions in a number of different MIME content types (DTD, XML
Schema, RDF Schema, or any other arbitrary schema encoding), there cannot
be any single, consistent, reliable algorithm for deriving the correct URI
reference of a name defined within some namespace. Any such URI reference
will be tied to one of possibly many definitions based on that namespace
and thus will not be representative of the abstract namespace itself.

Furthermore, because the fragment reference syntax for different MIME
content types varies (e.g. the latest XML Schema spec vs. XML/RDF, etc.),
it is to be expected that URI references to the definitions of named
resources within schemas will vary from schema encoding to schema encoding
-- and thus be unable to address the fact that, despite the different
schema content types, we are talking about the same thing.

This confusion has apparently arisen from the (unfortunate) use of HTTP URIs
as namespace URIs. Although namespace URIs are themselves not expected to
resolve to a content stream, URLs *are* (that's what makes them URLs!) and
an HTTP URI is a URL and therefore IMO it is an error if it does *not*
resolve to a content stream. Note that the error is not that the namespace
does not resolve to a content stream, but that the HTTP URL used to define
the namespace does not. However, since the vocabulary/ontology corresponding
to a given namespace can be defined by numerous schema encodings (and might
have several in use), it cannot share a common HTTP URI namespace prefix
with all schema encodings, as they may have incompatible URI fragment syntax
due to being different MIME content types!

IMO, what is needed to solve this mess is an explicit and standardized
notation for global universal identifiers based on a mechanism such as a
URN scheme which allows for the global specification of
vocabularies/taxonomies which can be used as the basis of common reference
in various schemas and applications based on those vocabularies. The root
or partial prefix of instances of such a URN scheme would serve as the
namespace prefix and below that would define the vocabulary terms,
hierarchically arranged. There would then simply need to be a mapping from
this single notation to/from the various MIME content types such as XML,
XML DTD, XML Schema, RDF, etc., but this would be explicit and regular.

A proposal for discussion: Hierarchical Resource Names URN scheme

(the following is provided as a rough example for discussion only, please no
 nits about minor flaws, etc. there are surely errors and shortcomings, as
 will always be the case in contexts of high caffeine and sleep deprivation)

HRN = urn:hrn:<authority>/<path>
authority = (<rfc2732 host> | <user>)
user = <rfc2396 userinfo>@<rfc2732 host>
path = (<name> (/<name>)*)
name = /[a-zA-Z0-9]([-_.]?[a-zA-Z0-9])*/

E.g. (examples based on MARS metadata ontology)

urn:hrn:metia.nokia.com/MARS/2.1                 ;MARS 2.1 Vocabulary
urn:hrn:metia.nokia.com/MARS/2.1/coverage        ;MARS 'coverage' property
urn:hrn:metia.nokia.com/MARS/2.1/coverage/fi     ;MARS 'coverage' property value 'fi' (Finland)
urn:hrn:metia.nokia.com/MARS/2.1/language        ;MARS 'language' property
urn:hrn:metia.nokia.com/MARS/2.1/language/fi     ;MARS 'language' property value 'fi' (Finnish)
urn:hrn:metia.nokia.com/MARS/2.1/status          ;MARS 'status' property
urn:hrn:metia.nokia.com/MARS/2.1/status/draft    ;MARS 'status' property value 'draft'
urn:hrn:metia.nokia.com/MARS/2.1/status/approved ;MARS 'status' property value 'approved'
urn:hrn:metia.nokia.com/MARS/2.1/status/retired  ;MARS 'status' property value 'retired'



* The property values 'coverage/fi' and 'language/fi' are not the same
concept/resource, even though they have the same ISO-defined name. One is a
country, the other a language. Thus, if we are to assign e.g. labels or
other properties and relations for these resources for various
languages/regions, we must be able to differentiate between them.

* By requiring that the authority be a valid host or email address according
to RFC 2396 and 2732, the issue of registering authority identifiers is
avoided, as the registries for internet domain names and address spaces as
well as per-domain, per-server user management can be utilized. It further
serves to ground the resource identities in known web resources.

* By allowing the authority to be not only a host but a user, an individual
is able to define and publish personal ontologies without having to first
secure a domain name, etc.
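
As a rough illustration of the HRN grammar and the notes above, the
following Python sketch parses an HRN into its authority and path names.
The `name` pattern is copied from the grammar; the `HOST` and `USERINFO`
patterns are simplified stand-ins for the RFC productions (an assumption,
not a faithful implementation), and `parse_hrn` is a hypothetical helper,
not part of any existing library.

```python
import re
from typing import NamedTuple, Optional

# name = [a-zA-Z0-9]([-_.]?[a-zA-Z0-9])*  (from the grammar above)
NAME = r"[a-zA-Z0-9](?:[-_.]?[a-zA-Z0-9])*"
# Simplified approximations of the RFC host and userinfo rules (assumption).
HOST = r"[a-zA-Z0-9](?:[-.]?[a-zA-Z0-9])*"
USERINFO = r"[a-zA-Z0-9._%+-]+"

HRN_RE = re.compile(
    rf"urn:hrn:(?P<authority>(?:{USERINFO}@)?{HOST})/(?P<path>{NAME}(?:/{NAME})*)"
)


class HRN(NamedTuple):
    authority: str
    names: tuple


def parse_hrn(text: str) -> Optional[HRN]:
    """Return the parsed HRN, or None if the text does not fit the grammar."""
    match = HRN_RE.fullmatch(text)
    if match is None:
        return None
    return HRN(match.group("authority"), tuple(match.group("path").split("/")))


coverage_fi = parse_hrn("urn:hrn:metia.nokia.com/MARS/2.1/coverage/fi")
language_fi = parse_hrn("urn:hrn:metia.nokia.com/MARS/2.1/language/fi")

print(coverage_fi)
print(language_fi)
# Same final name 'fi', but different resources because the paths differ.
print(coverage_fi.names[-1] == language_fi.names[-1], coverage_fi == language_fi)
```

Keeping the full path as the identity is what lets 'coverage/fi' and
'language/fi' remain distinct even though both end in the same name.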

For RDF/RDF Schema/DAML/etc., one would simply use the HRN URNs in all 
statements. E.g.:


<Property      rdf:ID       ="urn:hrn:metia.nokia.com/MARS/2.1/status">
   <rdf:label  rdf:value    ="Status" xml:lang="en"/>
   <rdfs:range rdf:resource ="#Status"/>
   <count      rdf:resource ="#Single"/>
   <range      rdf:resource ="#Bounded"/>
   <ranking    rdf:resource ="#Strict"/>
   <default    rdf:resource

<rdf:Class rdf:ID="Status" .../>

<Status rdf:ID="urn:hrn:metia.nokia.com/MARS/2.1/status/draft">
   <rdf:label rdf:value="Draft" xml:lang="en"/>
   <rank      rdf:value="1"/>
</Status>

<Status rdf:ID="urn:hrn:metia.nokia.com/MARS/2.1/status/approved">
   <rdf:label rdf:value="Approved" xml:lang="en"/>
   <rank      rdf:value="2"/>
</Status>


In an XML Schema, one would use part of the HRN URN path as a namespace URI,
and define the mapping of element/attribute names from the XML Schema
to the HRN URN representation. E.g.

<schema ...


<!-- urn:hrn:metia.nokia.com/MARS/2.1/status -->
<element name="status" substitutionGroup="mars:property">
   <complexType base="mars:Property" derivedBy="restriction">
      <simpleType base="mars:TokenString">
            <!-- urn:hrn:metia.nokia.com/MARS/2.1/status/draft -->
            <enumeration value="draft"/>
            <!-- urn:hrn:metia.nokia.com/MARS/2.1/status/draft_approved -->
            <enumeration value="draft_approved"/>
            <!-- urn:hrn:metia.nokia.com/MARS/2.1/status/approved -->
            <enumeration value="approved"/>
            <!-- urn:hrn:metia.nokia.com/MARS/2.1/status/retired -->
            <enumeration value="retired"/>

Presuming that the above XML Schema is being used to parse/validate
the following content: 

then what remains to be resolved is how the literal 'approved' according
to the serialization schema is associated with the HRN URN
"urn:hrn:metia.nokia.com/MARS/2.1/status/approved", etc. so that
the RDF statements above regarding label, rank, etc. apply.

I.e., without such a mapping, we get the triple:

   ("...", "urn:hrn:metia.nokia.com/MARS/2.1/status", "approved")

but what we need/want is:

   ("...", "urn:hrn:metia.nokia.com/MARS/2.1/status", 

It would be *really* icky (for lack of a more technical term ;-) to 
have to define the XML Schema as follows, simply to achieve a reliable
and explicit intersection between the XML Schema, XML serialized instance,
and RDF Schema... 

<!-- urn:hrn:metia.nokia.com/MARS/2.1/status -->
<element name="status" substitutionGroup="mars:property">
   <complexType base="mars:Property" derivedBy="restriction">
      <simpleType base="mars:HRN">

and have to encode the serialization as:




An alternate approach would be to use empty elements to represent
members of controlled value sets, e.g.


but as the value name set of each property having a controlled value set
and the property name set itself should correspond to different namespaces,
one must resort to separate XML Schema definitions for each property value
set, which is cumbersome, both for specification and for markup.

As it is common to use simple enumerations of controlled value sets (e.g.
xml:lang taking an ISO-639 value, etc.) there needs to be, in addition to
the schema-encoding-neutral identity of such values, a consistent way to
map to that identity from their literal representations, based on the schema
defining the serialization.

One possible solution would be to permit a targetNamespace attribute to
be specified for enumeration declarations which would define the namespace 
to which the literal name value belongs. E.g.

<!-- urn:hrn:metia.nokia.com/MARS/2.1/status -->
<element name="status" substitutionGroup="mars:property">
   <complexType base="mars:Property" derivedBy="restriction">
      <simpleType base="mars:Token">
            <!-- urn:hrn:metia.nokia.com/MARS/2.1/status/draft -->
            <enumeration value="draft"   
            <!-- urn:hrn:metia.nokia.com/MARS/2.1/status/draft_approved -->
            <enumeration value="draft_approved" 
            <!-- urn:hrn:metia.nokia.com/MARS/2.1/status/approved -->
            <enumeration value="approved" 
            <!-- urn:hrn:metia.nokia.com/MARS/2.1/status/retired -->
            <enumeration value="retired"

Now, it is explicit for each literal enumerated value what its HRN URI
should be, namely the literal name appended to the namespace,
and the simple serialization


results in the desired triple:

   ("...", "urn:hrn:metia.nokia.com/MARS/2.1/status", 

There are likely numerous better ways to accomplish this mapping from
value to qualified name, and I've not tried to ponder at length about the
mechanism by which this ultimately would be accomplished (as it would in any
case vary from MIME content type to type) -- but have simply tried to
illustrate where the hole is and one possible path around it.

It is likely that the semantics of the targetNamespace attribute will
preclude its use as per the examples above. The precise attribute used is
irrelevant as long as it is possible to achieve the necessary namespace
declaration for the literal values.
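
The literal-to-HRN mapping discussed above can be sketched in a few lines
of Python. Everything here is illustrative: `VALUE_NAMESPACES` and
`qualify` are hypothetical names, the '/' separator is an assumption based
on the HRN path syntax, and in practice the value namespace would be
declared by the schema rather than hard-coded in a table.

```python
from typing import Dict, Tuple

Triple = Tuple[str, str, str]

# Hypothetical table: property HRN -> namespace of its enumerated values.
VALUE_NAMESPACES: Dict[str, str] = {
    "urn:hrn:metia.nokia.com/MARS/2.1/status":
        "urn:hrn:metia.nokia.com/MARS/2.1/status",
}


def qualify(subject: str, prop: str, literal: str) -> Triple:
    """Replace a literal value with its HRN when a value namespace is known."""
    namespace = VALUE_NAMESPACES.get(prop)
    if namespace is None:
        # No declared namespace: the value stays a plain literal.
        return (subject, prop, literal)
    return (subject, prop, namespace.rstrip("/") + "/" + literal)


STATUS = "urn:hrn:metia.nokia.com/MARS/2.1/status"
qualified = qualify("...", STATUS, "approved")
unqualified = qualify("...", "urn:hrn:example.org/other/title", "approved")

print(qualified)
print(unqualified)
```

With the table in place the object of the first triple becomes the HRN for
'approved', so RDF statements about that resource (label, rank, etc.) can
apply; without an entry the value remains an opaque string.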


The benefit of having a global identifier scheme such as HRN defined
above is that one need not worry about the particulars of various schema
or other encoding mechanisms when referring to an abstract concept, such
as within the context of RDF/DAML/etc. I.e. an XML Schema declaration
for an element "foo" does not define or represent the concept "foo", only
one possible serialization of the concept "foo". We should be able to talk
about "foo" regardless of how statements about it might be serialized
in one encoding or another. And the same scheme then works for concepts,
vocabularies, etc. which have no specification in any MIME content type
or which are encoded in a MIME content type for which there is no fragment
syntax (e.g. IETF RFCs encoded as text/plain ;-)

Please, let's abandon the use of HTTP URIs for namespace identity!
Vocabularies, ontologies, etc. are *abstract* resources and thus should be
defined using non-URL URIs! If one wishes to then specify one or more URLs
for schemas or other content streams which provide explicit definition of,
information about, realizations of, or constraints upon those abstract
resources, great, but let's stop using URI schemes intended for identifying
content to identify abstract resources!
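
One way to picture this separation is a small lookup from an abstract
identifier to the concrete content streams that describe it. The sketch
below is only an illustration: the `Occurrence` type, the `OCCURRENCES`
table, and the example.org URLs are all hypothetical and not part of any
proposal or specification.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Occurrence:
    """A concrete content stream describing an abstract resource."""
    media_type: str
    url: str


# Hypothetical registry: abstract HRN -> schemas that define or constrain it.
OCCURRENCES: Dict[str, List[Occurrence]] = {
    "urn:hrn:metia.nokia.com/MARS/2.1": [
        Occurrence("application/xml", "http://example.org/mars/2.1/schema.xsd"),
        Occurrence("application/rdf+xml", "http://example.org/mars/2.1/schema.rdf"),
    ],
}


def occurrences_for(hrn: str, media_type: str) -> List[str]:
    """Return the URLs of all occurrences of `hrn` with the given media type."""
    return [o.url for o in OCCURRENCES.get(hrn, []) if o.media_type == media_type]


rdf_urls = occurrences_for("urn:hrn:metia.nokia.com/MARS/2.1", "application/rdf+xml")
print(rdf_urls)
```

The identifier itself never changes when a schema is added, moved, or
re-encoded; only the list of occurrences does.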

In this regard, Topic Maps got it right, by separating the reification of
abstract (or even concrete) resources from their occurrences (realization,
expression, use, description, etc.). We can learn a lesson or two there.

I look forward to hearing the comments and discussion of the above from 
others in this forum. Sorry for the length.



Patrick Stickler                      Phone:  +358 3 356 0209
Senior Research Scientist             Mobile: +358 50 483 9453
Software Technology Laboratory        Fax:    +358 7180 35409
Nokia Research Center                 Video:  +358 3 356 0209 / 4227
Visiokatu 1, 33720 Tampere, Finland   Email:  patrick.stickler@nokia.com
Received on Wednesday, 6 June 2001 05:14:00 UTC
