Dedicated, Standardized URI Scheme for QNames? from Patrick.Stickler@nokia.com on 2001-08-20 (www-rdf-interest@w3.org from August 2001)

From: <Patrick.Stickler@nokia.com>
Date: Mon, 20 Aug 2001 10:39:16 +0300
To: www-rdf-interest@w3.org
Message-ID: <2BF0AD29BC31FE46B788773211440431245C0F@trebe003.NOE.Nokia.com>
I've had this idea in the back of my head for well over a year, and from
time to time have returned to it as a possible solution to the RDF QName
to URI mapping problem. Up to now I've always discarded it, primarily
because it requires a non-trivial and probably backwards-incompatible
change to the RDF spec, which is always a major cause for rejection.
Nevertheless, the recent (productive) discussions regarding this matter
have prompted me to think about this alternative again, and I thought it
would be good to share the idea with the interest group, if only as food
for thought: offering a different perspective on the problem and
providing a point of comparison for my other proposed solution.

I won't be surprised if this is not a new idea and has been proposed
already (possibly several times) in one form or another, though I'm not
aware that it has.

That said... here's what I've been thinking.

--

A QName is not a URI, nor is it possible to define a fully reliable, generic
function for mapping between a QName and URI, based on the URI scheme
of the QName namespace itself, which will always be unique (no collisions),
bidirectional, and correspond to a fully valid URI according to the URI
scheme and/or some MIME content type fragment syntax. I consider the above
to have been sufficiently argued and demonstrated in the recent discussions
(even if one might contend that a mapping function need not produce
collisions).

Therefore, it is necessary either to define an explicit mapping between
QNames and arbitrary URIs (not necessarily having any intersection with
the namespace URI; this is the basis for my other proposal), or to
define a specific URI scheme that explicitly represents a QName as a URI,
maintaining the distinction between name and namespace components. The
latter performs only a mapping of representation of the two distinct
components, rather than a fusion of the two.

Here's my proposed syntax for such a QName URI scheme:

   qn:{name}:{namespace}

I.e. "A QName URI consists of a name {name} within a namespace {namespace}"

e.g. 

   qn:language:http://www.purl.org/dc/elements/1.1/
   qn:bar:urn:partax:(foo)
   qn:subPropertyOf:http://www.w3.org/2000/01/rdf-schema#

The name of the QName is given first, primarily so that parsing of the
URI scheme is simplified: since the local name of a QName cannot itself
contain a colon, "everything after the second colon" corresponds to the
namespace URI.
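
As an illustration of that parsing rule, here is a minimal sketch in
Python. It assumes the proposed qn:{name}:{namespace} syntax exactly as
written above; the function name parse_qn is hypothetical and not part
of any existing library.

```python
def parse_qn(uri):
    """Split a proposed 'qn:' URI into a (name, namespace) pair."""
    # Split on at most two colons: the scheme, the name, and then
    # everything after the second colon, which is the namespace URI
    # and may itself contain further colons.
    parts = uri.split(":", 2)
    if len(parts) != 3 or parts[0] != "qn" or not parts[1]:
        raise ValueError("not a qn: URI: %r" % (uri,))
    return parts[1], parts[2]


name, namespace = parse_qn("qn:bar:urn:partax:(foo)")
print(name)       # bar
print(namespace)  # urn:partax:(foo)
```

The namespace in this example contains colons of its own, which is why
the split is limited to the first two.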

It should also make the URI easier for humans to read, both because the
most mnemonic portion of the QName (i.e. the name) is provided first and
because the name maintains the same position relative to either a
namespace prefix or the URI scheme prefix 'qn:'.

Finally, if some application wishes to dereference either the namespace URI
or some URI derived from the combination of name and namespace, then it
has the information it needs in an explicit and reliable format to do so.

Thus, a QName would officially have two equivalent representations:

1. As an element or attribute name within a serialized instance:

   xx = namespace
   xx:name

2. As a URI:

   qn:name:namespace

Thus, the consistency inherent in the variants 'xx:name' and 'qn:name:...'
should be easily recognizable (and appreciated) by humans.

And one could then interpret the serialized representation of a QName
as nothing more than a short hand representation for its URI representation,
which could be considered to be its primary representation for SW
applications (or XML applications in general).

--

If RDF were to be further extended to allow the use of QNames in
attribute values, as XML Schema does (I think this is already under
consideration?), then one could use QNames as a universal naming scheme
with consistent shorthand notation in all RDF serialized contexts where
URIs can occur, and furthermore achieve consistency of representation
(using QNames as universal identifiers) both in the knowledge base and
in all serializations. And there would be a well-defined, reliable,
bidirectional mapping between QNames in XML serializations and QName
URIs in RDF graphs.

A QName then becomes a truly first-class object on the SW, used as the
primary mechanism for naming (i.e. the 'qn' URI scheme defines a URN
scheme).

Every RDF parser would map a QName to a qn: URI, and therefore collisions
would never occur, every RDF application would get the same triples, and
every resource would be identified by a fully valid URI. Mapping back to
QNames for re-serialization would be trivial and consistent (no guessing
about where to break the URI into name and namespace components, and no
reliance on special URI syntaxes with known partition characters such
as '#' or '/').
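
To make the "no guessing" point concrete, here is a small Python sketch
of the round trip, assuming the qn:{name}:{namespace} syntax proposed
above. The helper names make_qn and split_qn are hypothetical. It also
shows a concatenated URI for comparison, where the boundary between
namespace and name is no longer marked.

```python
def make_qn(name, namespace):
    """Build a proposed 'qn:' URI from a local name and a namespace URI."""
    if not name or ":" in name:
        raise ValueError("local name must be non-empty and colon-free")
    return "qn:" + name + ":" + namespace


def split_qn(uri):
    """Recover (name, namespace) from a proposed 'qn:' URI."""
    scheme, name, namespace = uri.split(":", 2)
    if scheme != "qn":
        raise ValueError("not a qn: URI: %r" % (uri,))
    return name, namespace


pairs = [
    ("language", "http://metia.nokia.com/MARS/2.1"),
    ("bar", "urn:partax:(foo)"),
    ("subPropertyOf", "http://www.w3.org/2000/01/rdf-schema#"),
]
for name, namespace in pairs:
    # The round trip is exact, whatever characters the namespace ends with.
    assert split_qn(make_qn(name, namespace)) == (name, namespace)

# Plain concatenation loses the boundary when the namespace does not end
# in a known partition character such as '#' or '/'.
fused = "http://metia.nokia.com/MARS/2.1" + "language"
print(fused)
```

The first namespace ends in "2.1", so the fused form gives no reliable
place to split, while the qn: form keeps the two components separate.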

--

Any equivalences of QNames to non-qn-URIs could be defined with
mechanisms such as daml:equivalentTo or daml:subPropertyOf
(or rdfs:subPropertyOf).

e.g. 

  <qn:bar:urn:partax:(foo)>  daml:equivalentTo  <urn:partax:(foo(bar))> .

Also, serializations allowing QNames in attribute values of type URI
would greatly simplify human creation (or inspection) of XML instances:

I.e. the following minimally verbose RDF XML fragment

  xmlns:mars="http://metia.nokia.com/MARS/2.1"
  xmlns:lang="http://www.iso.ch/3166-1"
  ...
  <mars:language rdf:resource="lang:en" />

gives us the triple

  [X,
   qn:language:http://metia.nokia.com/MARS/2.1,
   qn:en:http://www.iso.ch/3166-1]
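
A minimal Python sketch of how a parser might arrive at that triple,
assuming the prefix declarations from the fragment above are already
available as a dictionary. The names PREFIXES and expand are
hypothetical, and the subject is left as the placeholder X, as in the
triple above.

```python
# Prefix-to-namespace bindings taken from the xmlns declarations above.
PREFIXES = {
    "mars": "http://metia.nokia.com/MARS/2.1",
    "lang": "http://www.iso.ch/3166-1",
}


def expand(qname, prefixes):
    """Map a prefixed QName such as 'lang:en' to a proposed qn: URI."""
    prefix, name = qname.split(":", 1)
    if prefix not in prefixes:
        raise KeyError("undeclared namespace prefix: %r" % (prefix,))
    return "qn:%s:%s" % (name, prefixes[prefix])


# The element name gives the predicate; the rdf:resource attribute value,
# itself read as a QName, gives the object.
triple = ("X", expand("mars:language", PREFIXES), expand("lang:en", PREFIXES))
for term in triple:
    print(term)
```

The same expansion applies to element names and to attribute values,
which is the consistency argued for here.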

Thus, RDF instances and schemas would be far easier to write by
hand, since it would be expected that most resources would be defined
using QNames, and therefore prefixes can be defined and used anywhere
any resource might be referenced or defined in an RDF or RDF Schema
XML instance.

This would also alleviate the need for any literal-to-URI mappings, as
the primary motivation for those is simply human convenience: humans who
must physically touch the serialized instances do not want to read or
write long URIs. (No debate please about proper use of user interfaces
hiding RDF serializations, etc. People will need to touch RDF XML
instances for a long time to come, possibly always, so let's not make it
too painful an experience, eh?)

Such a qn URI scheme should then also serve XML Schema, since it already
bases identity on QNames rather than URIs. With an explicit qn: URI
scheme, it could then claim, after the fact, that it uses URIs; they are
simply all implicitly qn: URIs.

Eh?

So, what do you folks think?

Cheers,

Patrick

--
Patrick Stickler                      Phone:  +358 3 356 0209
Senior Research Scientist             Mobile: +358 50 483 9453
Software Technology Laboratory        Fax:    +358 7180 35409
Nokia Research Center                 Video:  +358 3 356 0209 / 4227
Visiokatu 1, 33720 Tampere, Finland   Email:  patrick.stickler@nokia.com
Received on Monday, 20 August 2001 03:39:24 UTC