Using urn:publicid: for namespaces

I agree with the comments that have been made on this mailing list
about the inappropriateness of using the http URL scheme for names,
especially those representing RDF schemas.  While proposed URI schemes
such as tag and ark are promising, it would be nice if there were an
existing scheme that RDF schema designers could use right now.  I've
been looking around for possibilities and wondered whether it would be
appropriate to use the urn:publicid: scheme for RDF schema namespaces.
This scheme is defined by RFC 3151
(http://www.ietf.org/rfc/rfc3151.txt) and is listed as officially
registered at http://www.iana.org/assignments/urn-namespaces.

I have two proposals: the first based on SGML "formal public
identifiers"
(http://www.oasis-open.org/cover/petersonFPI-TAG7030101.html) and the
other based on an informal syntax for public identifiers.

Using formal public identifiers
-------------------------------

Suppose I am defining a schema for tourist information.  A formal
public identifier for its namespace might look like this:

  -//University of Otago//NONSGML Tourism ontology v1.0//EN

This would be encoded as a URN according to IETF RFC 3151 as follows:

  urn:publicid:-:University+of+Otago:NONSGML+Tourism+ontology+v1.0:EN
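
As a rough sketch of that encoding step, the following Python function
transcribes a public identifier into a urn:publicid: URN.  The
function name is hypothetical and the substitution table is my reading
of the transcription rules in RFC 3151, so check it against the RFC
before relying on it.

```python
# Substitutions applied left to right; the two-character sequences
# are listed first so that they win over the single characters.
_TRANSCRIPTION = [
    ("//", ":"),
    ("::", ";"),
    (" ", "+"),
    ("+", "%2B"),
    (":", "%3A"),
    ("/", "%2F"),
    (";", "%3B"),
    ("'", "%27"),
    ("?", "%3F"),
    ("#", "%23"),
    ("%", "%25"),
]


def publicid_to_urn(public_id: str) -> str:
    """Transcribe a public identifier into a urn:publicid: URN."""
    # Trim the ends and collapse each run of white space to one space.
    normalized = " ".join(public_id.split())
    out = []
    i = 0
    while i < len(normalized):
        for src, dst in _TRANSCRIPTION:
            if normalized.startswith(src, i):
                out.append(dst)
                i += len(src)
                break
        else:
            # Any other character is copied through unchanged.
            out.append(normalized[i])
            i += 1
    return "urn:publicid:" + "".join(out)


print(publicid_to_urn(
    "-//University of Otago//NONSGML Tourism ontology v1.0//EN"))
```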

This isn't pretty, but at least no one is going to expect a Web
browser to find a document using this URI.  In theory it could still
be resolved to a document via an XML Catalog
(http://www.oasis-open.org/committees/entity/spec.html), but I think
this is less of a problem than the expectation of retrievability
created by using http URLs for namespaces.

The major problem is that the existing mechanism in RDF for mapping
QNames to URIs (via concatenation) would produce a URI that doesn't
correspond to a formal public ID (which must end with a language
indicator).  For example, the RDF class BungyJump in the above
namespace would have the following odd-looking URI:

  urn:publicid:-:University+of+Otago:NONSGML+Tourism+ontology+v1.0:ENBungyJump

One answer to this would be to make the QName to URI mapping depend
on the URI scheme used for the namespace.  For urn:publicid the
algorithm might be to insert ";" (the RFC 3151 encoding of "::") and
the local name before the language specifier (":EN").  This would give:

  urn:publicid:-:University+of+Otago:NONSGML+Tourism+ontology+v1.0;BungyJump:EN

which corresponds to the public identifier:

  -//University of Otago//NONSGML Tourism ontology v1.0::BungyJump//EN
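
A minimal sketch of such a scheme-dependent mapping is given below.
The function name is hypothetical, the http namespace in the usage
lines is made up for illustration, and the code assumes that every
urn:publicid: namespace is a formal public identifier whose final
":"-separated field is the language.

```python
def expand_qname(namespace_uri: str, local_name: str) -> str:
    """Map a namespace URI and a local name to a full URI."""
    if namespace_uri.lower().startswith("urn:publicid:"):
        # Assumption: the last ":"-separated field is the language
        # indicator, e.g. "...v1.0:EN".  Insert ";" and the local
        # name in front of it.
        head, _, language = namespace_uri.rpartition(":")
        return f"{head};{local_name}:{language}"
    # Every other scheme keeps plain concatenation.
    return namespace_uri + local_name


ns = "urn:publicid:-:University+of+Otago:NONSGML+Tourism+ontology+v1.0:EN"
print(expand_qname(ns, "BungyJump"))
print(expand_qname("http://example.org/tourism#", "BungyJump"))
```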

It would be nice if the "public text class" (which can only be one of
a few specified values) could be something more useful than NONSGML,
such as NAMESPACE.  However, adding this as a possibility to the SGML
standard (ISO 8879), which defines the permitted classes, is probably
not easily achieved.  It would also complicate the QName to URI
mechanism, which would need to replace NAMESPACE with something like
NAME when forming the URI for an individual name.

Using informal public identifiers
---------------------------------

Public identifiers don't have to follow the above syntax.  They can
comprise any upper and lower case letters, digits, space characters
and line breaks (white space is normalised to a single space), and any
of the following characters: ()+,-./:=?
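
To make the rule concrete, here is a small sketch that checks a
candidate identifier against the character set listed above and
normalises its white space.  The function name is hypothetical, and
the character class simply mirrors the list in the previous paragraph
rather than quoting any specification.

```python
import re

# Letters, digits, space, line breaks, and the punctuation listed above.
_ALLOWED = re.compile(r"[A-Za-z0-9 \r\n()+,\-./:=?]*")


def normalize_public_id(text: str) -> str:
    """Validate a public identifier and normalise its white space."""
    if not _ALLOWED.fullmatch(text):
        raise ValueError("character not allowed in a public identifier")
    # Trim the ends and collapse each run of white space to one space.
    return " ".join(text.split())


print(normalize_public_id(
    "-//University of Otago//NONSGML\n  Tourism ontology v1.0::"))
```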

Therefore an unofficial syntax could be used to identify namespaces.
One possibility is to use the formal public ID syntax but with the
language part omitted and a convention to end the ID with "::", which
is encoded as ";" in the URN (so that plain concatenation will work
for QName to URI mapping):

  urn:publicid:-:University+of+Otago:NONSGML+Tourism+ontology+v1.0;

corresponding to:

  -//University of Otago//NONSGML Tourism ontology v1.0::

Alternatively some other syntax could be used, such as the one
proposed for the tag scheme (http://www.taguri.org/):

  urn:publicid:infoscience.otago.ac.nz,2001-08-10:TourismOntology;

which corresponds to the following public identifier:

  infoscience.otago.ac.nz,2001-08-10//TourismOntology::
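
The following sketch decodes a urn:publicid: URN back into a public
identifier, which gives a quick way to check that plain concatenation
of a local name onto these namespaces still yields a well-formed
identifier.  The function name is hypothetical and the decoding tables
are my reading of RFC 3151, applied in reverse.

```python
_SINGLE = {":": "//", ";": "::", "+": " "}
_ESCAPES = {
    "%2B": "+", "%3A": ":", "%2F": "/", "%3B": ";",
    "%27": "'", "%3F": "?", "%23": "#", "%25": "%",
}


def urn_to_publicid(urn: str) -> str:
    """Decode a urn:publicid: URN into the public identifier it encodes."""
    prefix = "urn:publicid:"
    if not urn.lower().startswith(prefix):
        raise ValueError("not a urn:publicid: URN")
    body = urn[len(prefix):]
    out = []
    i = 0
    while i < len(body):
        escape = body[i:i + 3].upper()
        if escape in _ESCAPES:
            # Percent-escapes stand for literal special characters.
            out.append(_ESCAPES[escape])
            i += 3
        else:
            # ":" ";" and "+" are decoded; anything else is copied.
            out.append(_SINGLE.get(body[i], body[i]))
            i += 1
    return "".join(out)


ns = "urn:publicid:-:University+of+Otago:NONSGML+Tourism+ontology+v1.0;"
print(urn_to_publicid(ns + "BungyJump"))
```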


Are there any reasons why SGML public identifiers shouldn't be used to
identify namespaces in this way?  Although they were intended
specifically to identify SGML external entities rather than conceptual
things such as namespaces, the urn:publicid scheme has the distinct
advantage of being an officially sanctioned URI scheme that I can
start using today.

- Stephen

----------------------------------------------------------------------
Stephen Cranefield
Department of Information Science
University of Otago                               Phone: 64 3 479 8083
PO Box 56, Dunedin                                Fax:   64 3 479 8311
New Zealand                E-mail: scranefield@infoscience.otago.ac.nz
----------------------------------------------------------------------

Received on Thursday, 9 August 2001 22:15:19 UTC