Re: Resources and URIs from Dan Brickley on 1999-12-07 (www-rdf-interest@w3.org from December 1999)

From: Dan Brickley <danbri@w3.org>
Date: Tue, 7 Dec 1999 17:34:40 -0500 (EST)
To: Sergey Melnik <melnik@DB.Stanford.EDU>
cc: R.van.Dort@Everest.nl, Gabe Beged-Dov <begeddov@jfinity.com>, RDF Interest Group <www-rdf-interest@w3.org>
Message-ID: <Pine.LNX.4.20.9912071642500.28786-100000@tux.w3.org>

On Tue, 7 Dec 1999, Sergey Melnik wrote:
[...]
> the current serialization syntax can produce models containing resources
> that are not explicitly named in the serialization. If a third person
> wants to refer to such resource, it is essential that every parser is
> guaranteed to produce the same URI for the resource. Period.

For reifications of triples, I'm inclined to agree, since the three
defining information items are present (subject, predicate,
object) whenever a URI is generated for some RDF statement.
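
To make the reification case concrete, here is a minimal sketch of a
content-derived identifier built from nothing but the three items. The
"urn:x-stmt:" prefix, the choice of SHA-1 and the function name are my own
assumptions for illustration; nothing in the RDF spec defines them, and the
sketch ignores the distinction between literal and resource objects.

```python
import hashlib


def statement_uri(subject: str, predicate: str, obj: str) -> str:
    """Derive a repeatable identifier from the three parts of a statement."""
    # Length-prefix each part so that ("ab", "c", ...) and ("a", "bc", ...)
    # cannot produce the same payload.
    parts = (subject, predicate, obj)
    payload = "".join(f"{len(p)}:{p}" for p in parts).encode("utf-8")
    # "urn:x-stmt:" is a hypothetical prefix, used here only for illustration.
    return "urn:x-stmt:" + hashlib.sha1(payload).hexdigest()
```

Any two parsers that agree on this recipe would arrive at the same
identifier for the same statement, because all of the input is in the
triple itself.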

For anonymous resources, I don't think your proposal can work in the way
you hope.  URIs are just more information about a resource, as are
temporary placeholder URIs. 

	 "it is essential that every parser is guaranteed to produce the same URI
	for the resource. Period."

This is a very strong statement, and one that could be qualified in a
number of ways.

Do you mean: produce the same generated URI for the same (anonymous) resource

	- given byte-for-byte identical XML input (from same source)
	- given byte-for-byte identical XML input (from different sources)
	- given statement-for-statement identical graphs from varying sources

Here's a problematic example. Many different copies of this file could be
scattered around the web. While they're byte-for-byte identical, they
describe different WebMasters. I don't believe the fact that the XML is
byte-for-byte identical gives us reason to assign the same
content-derived identifier to the anonymous node...

Example 1: same XML bytes, different servers

	<rdf:Description about="/robots.txt">
	<creator>
		<WebMaster> <!-- an anon node of type WebMaster -->
		<emailAddress>webmaster</emailAddress>
		<notes> Contact the webmaster regarding the robots.txt file</notes>
		<homePage rdf:resource="/~webmaster/"/>
		</WebMaster>
	</creator>
	</rdf:Description>
	<!-- this RDF fragment will make sense on many servers and might
	be copied verbatim, with the relative URIs resolved at parse-time -->
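
A small sketch of why the copies diverge: the relative URIs in Example 1
resolve against whatever base the file was fetched from. The two base URIs
below are made up for illustration.

```python
from urllib.parse import urljoin

# Two hypothetical locations serving byte-for-byte identical RDF/XML.
BASE_A = "http://example.org/meta.rdf"
BASE_B = "http://example.com/meta.rdf"


def resolve(base: str, relative: str) -> str:
    """Resolve a relative URI reference against the document's base URI."""
    return urljoin(base, relative)


home_a = resolve(BASE_A, "/~webmaster/")
home_b = resolve(BASE_B, "/~webmaster/")
subject_a = resolve(BASE_A, "/robots.txt")
subject_b = resolve(BASE_B, "/robots.txt")
```

The same bytes yield different subjects and different home pages, so the
anonymous WebMaster nodes describe different people, and an identifier
computed from the bytes alone would wrongly merge them.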

Example 2: same XML bytes, same server, different resource.

	<rdf:RDF>
	<WeatherReport>
	<todayCloudCoverPhoto rdf:resource="/weathertoday.png"/>
	<description>this is a daily weather report; the
	todayCloudCoverPhoto property is a reference to a dynamically changing
	resource, created by a different person each day</description>
	</WeatherReport>
	</rdf:RDF>

Feeding this to an XML/RDF parser we'd get an anonymous resource of type
"WeatherReport". On different days we might get descriptions of different
(anonymous) WeatherReports.

If we're not told the URI for a resource, we're often pretty much
stuck. Writing our parsers so they try to always generate the same
identifier given sufficient context (server, date, content-negotiation,
language-negotiation, bytes-to-parse etc) seems to me fraught with
difficulty.

My inclination is to run the other way and make sure that these URIs are
evident as generated IDs, so that processors can always tell when the URI
was cooked up by some parser instead of being widely agreed.
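
A minimal sketch of that approach, assuming a reserved prefix that marks
parser-generated identifiers. The "urn:x-genid:" prefix and both function
names are hypothetical, not part of any spec.

```python
import uuid

# Hypothetical marker for parser-generated identifiers.
GENERATED_PREFIX = "urn:x-genid:"


def new_generated_uri() -> str:
    """Mint a fresh placeholder URI for an anonymous resource."""
    # A random UUID makes no claim that two parses name the same resource.
    return GENERATED_PREFIX + uuid.uuid4().hex


def is_generated(uri: str) -> bool:
    """Tell whether a URI was cooked up by a parser rather than published."""
    return uri.startswith(GENERATED_PREFIX)
```

A processor can then treat anything for which is_generated() returns True
as a local placeholder and avoid relying on it across documents.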

Dan

Received on Tuesday, 7 December 1999 17:34:57 UTC