
RE: Bitzi File Metadata RDF Dump

From: <Patrick.Stickler@nokia.com>
Date: Wed, 26 Sep 2001 20:28:51 +0300
Message-ID: <2BF0AD29BC31FE46B78877321144043114BFE7@trebe003.NOE.Nokia.com>
To: danbri@w3.org, sean@mysterylights.com
Cc: aswartz@upclink.com, gojomo@bitzi.com, www-rdf-interest@w3.org

> > Then again, I wouldn't have any particular objection to a new URN
> > namespace, especially an informal one. I do agree with Patrick that due to
> > the scope of the bitprints being in the interest of the Internet community
> > in general, it would be improper to use HTTP space to identify them.
> Noooooooooooooooooooooooo......!
> The whole *point* of the RDF design is to decentralise the creation of
> machine-friendly descriptions on the Web. RDF is a technology designed by
> people who believed there are better things to do with one's time than sit
> on standardisation committees. A major goal was to have fewer central
> registries, committees, bureaucratic bottlenecks. Yet somehow folks on
> this list often seem drawn back towards these things we were trying to
> escape!

I'm certainly all for minimizing the overhead of deploying
sources of knowledge on the web, but using HTTP URLs as URNs
for abstract resources or location independent identities
creates far more problems than it solves IMO when things start
to scale up and when one thinks about long term validity of
defined knowledge.

> Committees, official looking URN schemes, centralised content-type
> registries, all these have a role, but can also serve to disenfranchise
> those who lack the resources to go through a registration/standardisation
> process.

I agree that there is great benefit from being able to define
identifiers and identifier schemes without the need for
registration. But that's not practical if we wish both to
have truly generic identifiers and to ensure the global
uniqueness of those identifiers.

Perhaps what is needed is the realization of the 'vnd.'
URN namespace subtree, or similar, which would allow the ad-hoc
definition of URN namespaces grounded within a particular
internet domain authority. E.g.

'urn:vnd.' {domain name} ':' {namespace} ':' {id data}

That does not, however, alleviate *any* of the issues regarding
business portability for any name which is defined within a
company's or organization's trademarked scope, such as an
internet domain name.
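
As a minimal sketch of the 'urn:vnd.' pattern above: the scheme is only a
proposal in this message, not a registered URN namespace, and the function
names, the validation regex, and the example.org / bitprint values below are
all assumptions made for illustration.

```python
import re

# Hypothetical grammar for the proposed form
#   'urn:vnd.' {domain name} ':' {namespace} ':' {id data}
# This is NOT a registered URN namespace; the character rules are assumptions.
_LABEL = r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
_VND_URN = re.compile(
    r"^urn:vnd\.(?P<domain>" + _LABEL + r"(?:\." + _LABEL + r")+)"
    r":(?P<namespace>[A-Za-z0-9][A-Za-z0-9-]*)"
    r":(?P<id_data>.+)$"
)


def make_vnd_urn(domain, namespace, id_data):
    """Build a 'urn:vnd.' identifier grounded in a domain authority."""
    urn = "urn:vnd.{}:{}:{}".format(domain.lower(), namespace, id_data)
    if _VND_URN.match(urn) is None:
        raise ValueError("not a valid vnd URN: " + urn)
    return urn


def parse_vnd_urn(urn):
    """Split a 'urn:vnd.' identifier into (domain, namespace, id_data)."""
    match = _VND_URN.match(urn)
    if match is None:
        raise ValueError("not a valid vnd URN: " + urn)
    return match.group("domain"), match.group("namespace"), match.group("id_data")


# Illustrative values only; no such namespace is known to exist.
urn = make_vnd_urn("Example.org", "bitprint", "ABC123")
print(urn)
print(parse_vnd_urn(urn))
```

Uniqueness here depends only on the domain owner keeping its own namespaces
and ids distinct, so no registration step is needed. The sketch also makes
the portability problem visible: the domain name is baked into every
identifier, so the identifiers cannot follow the resource if the name
changes hands.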

HTTP URLs are notoriously fragile as long term identifiers. Faking
URNs with PURLs or similar tricks and calling them HTTP URLs is
just avoiding the real issue here, IMO (and of course, puts you
at the mercy of a centralized organization to manage those PURLs,
no? ;-)

We certainly do need a decentralized means for defining URNs, but
any scheme that relies neither on a named authority as part
of the URN nor on some agency to generate non-linguistic
identifiers will fail to ensure global uniqueness and hence will
fail to meet the needs of such URNs in the first place.

Using HTTP URLs as URNs where they identify abstract resources is
IMO a total abuse of the HTTP URI scheme and in violation of the
explicitly defined purpose of HTTP URLs.

So, either you have to deal with location instability and/or trademark
portability issues, or you have to rely on *some* agency to manage
issuance of identifiers to ensure uniqueness (even if it's purl.org).

You can't have your cake and eat it too. You can have one of the following 
three options:

1. secure, unique, abstract, generic identifiers provided by some 
centralized agency (URNs from agencies managing registered 
URN namespaces, or "pseudo" URNs such as PURLs) 

2. secure, unique, authority specific, abstract identifiers grounded
in the authority identity (e.g. 'vnd' URNs as proposed above, managed
by the owner of the authority identity)

3. secure, authority specific, concrete locations grounded in the 
authority address scope (e.g. URLs, managed by the owner of the
authority scope). 

The first option is maximally persistent, maximally portable, but
also incurs maximal bureaucratic overhead.

The second option is maximally persistent, minimally portable, yet
incurs minimal bureaucratic overhead.

The third option is minimally persistent, minimally portable, yet
also incurs minimal bureaucratic overhead.

Those are your options. Which you choose depends on your ultimate needs.

Those of us who are concerned with very long-term persistence and
maximal portability of resource identities will opt for supporting and
helping to optimize the centralized agencies necessary for their use.

At the very least, option 2 should be preferred and employed for all
location independent identifiers.

You *can* be an HTTP URL cowboy if you like, but that won't bring
law and order to the frontier of the semantic web...

Best Regards,


Patrick Stickler                      Phone:  +358 3 356 0209
Senior Research Scientist             Mobile: +358 50 483 9453
Software Technology Laboratory        Fax:    +358 7180 35409
Nokia Research Center                 Video:  +358 3 356 0209 / 4227
Visiokatu 1, 33720 Tampere, Finland   Email:  patrick.stickler@nokia.com
Received on Wednesday, 26 September 2001 13:28:59 UTC
