
Non-canonical URIs in RDF

From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
Date: Mon, 3 Dec 2001 14:01:00 -0000
To: <www-rdf-interest@w3.org>
Message-ID: <JAEBJCLMIFLKLOJGMELDGEJKCCAA.jjc@hplb.hpl.hp.com>

I am currently working on some ARP bug fixes.

One issue I would like help with is exemplified by the following file.

The crucial issue is that the base URI is legal but not well-chosen.


<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
   <rdf:Description rdf:ID="test1">
     <rdf:value rdf:resource="test2"/>
   </rdf:Description>
   <rdf:Description rdf:about="">
     <rdf:value rdf:resource="#test1"/>
   </rdf:Description>
</rdf:RDF>


What are the equivalent N-Triples?

[Note: xml:base is currently not allowed in RDF/XML, thus this could/should
be read as reading the file from the URL 'http://example.org']

Possible answers:

[A] (consistent correction)
<http://example.org/#test1> <rdf:value> <http://example.org/test2> .
<http://example.org/> <rdf:value> <http://example.org/#test1> .


[B] (minimalist correction)
<http://example.org/#test1> <rdf:value> <http://example.orgtest2> .
<http://example.org> <rdf:value> <http://example.org/#test1> .


[C] (not my job)

Note that [A]'s treatment of rdf:about="" inserts the missing /; in fact [A]
systematically inserts the missing /. [B] inserts the missing slash only
where the generated URI would otherwise not be a URI. I think [B] follows the
algorithm in RFC 2396 more accurately, although that algorithm basically
doesn't cover this issue. [C] is fairly attractive, but what is the nature of
the error? In [C] we may choose to reject only the use of a fragment-relative
URI with this base, and treat the other one as simple concatenation.
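To make the difference between [A] and [B] concrete, here is a minimal Python sketch of the two policies. The names resolve_a and resolve_b are hypothetical and are not taken from ARP or from RFC 2396. The sketch only handles the case discussed here (a base with an authority and an empty path), and it treats rdf:ID="test1" as the reference "#test1".

```python
# Hypothetical helpers illustrating answers [A] and [B] from this message.
# Assumption: the base has an authority but an empty path, as in the example.
BASE = "http://example.org"


def resolve_a(base: str, ref: str) -> str:
    """Policy [A]: always repair the base by appending '/', then concatenate."""
    return base + "/" + ref


def resolve_b(base: str, ref: str) -> str:
    """Policy [B]: insert '/' only before a fragment-only reference."""
    if ref.startswith("#"):
        return base + "/" + ref
    # Everything else, including the empty reference, is plain concatenation.
    return base + ref


# References used in the example file; rdf:ID="test1" is treated as "#test1".
for ref in ("#test1", "test2", ""):
    print(f"{ref!r:10} [A] {resolve_a(BASE, ref):28} [B] {resolve_b(BASE, ref)}")
```

Running this should reproduce the URIs listed under [A] and [B] above, including the odd-looking http://example.orgtest2 that [B] produces.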

Try doing the same when the base URI is  (still no
trailing slash).
Approach [B] would treat rdf:about="8" and rdf:about="10" differently!

If we go for [A], do we 'correct' the URI in the following?


<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
   <rdf:Description rdf:about="http://example.org">
   </rdf:Description>
</rdf:RDF>


If we correct that URI, where do we stop? We could 'correct' quite a few
dubious URIs, e.g. http://www.HP.com/, ftp://www.hp.com:21/ and
HTTP://www.hp.com/. I assume we stop short of DNS lookups, actually fetching
URLs to see what we get, and cracking opaque URI forms like mailto: URIs.
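As a rough illustration of where such 'corrections' might stop, the following Python sketch applies only purely syntactic fixes: lower-casing the scheme and host, and dropping a port that equals the scheme's default. The function name normalise and the DEFAULT_PORTS table are assumptions made for this example, not part of ARP or of any RDF specification. Anything without an authority (such as mailto:) is passed through untouched, and nothing is looked up or fetched.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed table of default ports; only the schemes mentioned here plus https.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def normalise(uri: str) -> str:
    """Hypothetical syntactic 'correction': case and default port only."""
    parts = urlsplit(uri)
    # Leave opaque forms (no authority), userinfo and IPv6 literals alone.
    if not parts.netloc or "@" in parts.netloc or "[" in parts.netloc:
        return uri
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(scheme):
        netloc = host
    else:
        netloc = f"{host}:{port}"
    # The path is deliberately left as-is, so a missing '/' is not inserted.
    return urlunsplit((scheme, netloc, parts.path, parts.query, parts.fragment))


for uri in ("http://www.HP.com/", "ftp://www.hp.com:21/", "HTTP://www.hp.com/"):
    print(uri, "->", normalise(uri))
```

Note that this sketch leaves http://example.org unchanged; whether to add the trailing slash there is exactly the [A] versus [B] question.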

Received on Monday, 3 December 2001 09:01:24 UTC
