RE: Using URIs to identify non-information resources

Lars,

Thanks for your comments.  (I wish more people would comment!)  More
followup inline below . . . .

> > * Lars Marius Garshol
> > |
> > | . . . If you resolve a URI and it returns 303 you know that the
> > | URI might identify something (what you got back, or what it
> > | described, but you can't tell which),
> 
> * David Booth
> | 
> | That seems overly pessimistic.  If the URI owner wants you to know 
> | what the URI identifies then you certainly *can* tell which it 
> | identifies, because you will be forwarded to a document 
> | that will tell you explicitly.
> 
> Provided the "you" is a human being, yes, but surely the 
> point of this must be that a machine can determine whether 
> the URI "identifies" a resource or not.

If the returned document is written in a machine-processable language,
such as RDF, then the machine can indeed determine what the URI
identifies.  In fact, HTTP content negotiation or other mechanisms can
be used to provide both human and machine-processable documents as
appropriate.
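
As a rough illustration of the content-negotiation point, here is a
minimal Python sketch of a client asking for RDF first and HTML second.
The function name rdf_request and the particular Accept value are my own
assumptions for illustration; whether a given server honors the header is
entirely up to that server.

```python
from urllib.request import Request

def rdf_request(uri):
    """Build (but do not send) an HTTP request that prefers RDF/XML."""
    # Assumption: the server offers RDF/XML; HTML is listed as a
    # lower-priority fallback for human readers.
    accept = "application/rdf+xml, text/html;q=0.5"
    return Request(uri, headers={"Accept": accept})

req = rdf_request("http://dbooth.org/2005/dbooth/")
print(req.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would perform the actual
retrieval; urllib follows a 303 redirect by default, so the client ends
up with whatever document the URI owner forwards it to.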

> | Furthermore, if it was a thing-described-by.org URI like 
> | http://thing-described-by.org?http://dbooth.org/2005/dbooth/ 
> | then you can tell by inspection (without performing an HTTP 
> | retrieval) that the 
> | URI does not directly identify an information resource [...]
> 
> That is true, *provided* that this way of doing things 
> becomes enshrined in web standards. So far this hasn't happened.

No, it does not even need to be enshrined in any standards.  The
semantics of thing-described-by.org URIs are explicitly stated at
http://thing-described-by.org/#Delegation_of_Authority
We don't need to create any new standards for you to know what a
thing-described-by.org URI means.

The only things needed to make thing-described-by.org work are: (a) that
software knows about it (or can find out about it) and knows that it is a
303-URI-forwarding service, so that the software can optimize away the
HTTP access to that site; and (b) that the software sufficiently trusts
the site to actually perform the service that it claims to perform.
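
To make (a) and (b) concrete, here is a minimal Python sketch of the
"tell by inspection" check. It assumes the software keeps its own set of
trusted forwarding hosts and that the describing document's URI is
whatever follows the "?" in the forwarding URI. The names
TRUSTED_FORWARDERS and described_by are hypothetical, and the sketch
ignores corner cases such as fragments in the embedded URI.

```python
from urllib.parse import urlsplit

# Assumption: a locally configured set of 303-forwarding hosts that
# this software has chosen to trust (point (b) above).
TRUSTED_FORWARDERS = {"thing-described-by.org"}

def described_by(uri):
    """Return the describing document's URI, or None.

    No HTTP access is performed: the answer comes purely from
    inspecting the URI (point (a) above).
    """
    parts = urlsplit(uri)
    if parts.scheme != "http" or parts.hostname not in TRUSTED_FORWARDERS:
        return None
    # Everything after "?" is taken to be the describing document's URI.
    return parts.query or None

print(described_by(
    "http://thing-described-by.org?http://dbooth.org/2005/dbooth/"))
```

A None result only means the URI is not a recognized forwarding URI; it
says nothing either way about what that URI identifies.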

Furthermore, there could be many 303-URI-forwarding sites just like
thing-described-by.org, and sites could provide machine-processable
lists of such sites.  It's analogous to the use of search engines on the
Web.  The Web does not need to "endorse" certain search engines for them
to be usable as search engines.  All that matters is that people know
about them (or can find out about them) and sufficiently trust the data
they get from the ones they choose to use.

. . .
> | In summary, if URI owners want you to know what their URIs 
> | identify, and they use thing-described-by.org URIs, then it 
> | seems to me that we have a scalable and deterministic solution.
> 
> Well, seen from the point of view of someone who's spent 
> considerable amounts of time trying to convert losslessly 
> between RDF and Topic Maps (where this distinction is 
> built-in), everything is still chaos in RDF-land on this 
> particular point.

I do agree that Topic Maps are much clearer on this than RDF.  But I
think the resolution that the TAG has provided *does* represent a way
out of the chaos, even if it isn't nearly as clean a solution as in
Topic Maps.

I used to be of the opinion that a cleaner fix would be to change RDF to
add some other syntax, such as a "*" operation that could be prefixed to
a URI for this purpose, which would provide the clear distinction that
you have in Topic Maps.  

But the more I've thought about it, the more I think it may actually be
*better* to keep the solution in the http URI space, which is what
thing-described-by.org does.  The reason is that if it is fixed by
adding a "*" operator to RDF, then every other language that wishes to
refer to that same subject must also learn the "*" operation.  Whereas
if the distinction is made within the http URI itself, then these URIs
can still be used in any other language and mean the same thing, without
changing anything.  Thus it enables consistent semantics across
languages without requiring other languages to change.

> If people had consistently used thing-described-by.org (or 
> the tdb: URI scheme) that would indeed have solved it. 

They still can, and it can still help!

> However, this would have required all URIs currently defined 
> as part of RDF/RDFS/OWL/SKOS/... to change, since not a 
> single one of them actually "identify" what they resolve to.  
> (I assume it's uncontroversial that owl:Class "identifies" 
> something other than what it resolves to.)

If DanBri's proposal to endorse 302-redirection is accepted, then
purl.org URIs could be used in a manner similar to
thing-described-by.org URIs, albeit without the same level of
optimization that thing-described-by.org URIs provide, because
302-redirection is explicitly temporary.

Perhaps the following would be a reasonable solution for existing
RDF/RDFS/OWL/SKOS/... URIs:  For each one, define an equivalent
thing-described-by.org URI that *does* resolve to a document that
explicitly says what that URI identifies, and give an RDF statement that
indicates the relationship between the new thing-described-by.org URI
and the old URI.  Problem solved?
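
A rough Python sketch of what generating such a statement might look
like, emitted as one N-Triples line. I have not settled on which property
should express the relationship; owl:sameAs is used here purely as an
assumption, and the describing-document URI under example.org is a
made-up placeholder.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

def equivalence_triple(old_uri, description_doc):
    """Return an N-Triples line relating a new forwarding URI to old_uri."""
    # The new URI is formed by prefixing the describing document's URI.
    new_uri = "http://thing-described-by.org?" + description_doc
    # Assumption: owl:sameAs is the chosen relationship.
    return "<%s> <%s> <%s> ." % (new_uri, OWL_SAME_AS, old_uri)

print(equivalence_triple(
    "http://www.w3.org/2002/07/owl#Class",
    "http://example.org/docs/owl-Class",  # hypothetical document URI
))
```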

> The TAG ruling has the problem (from my point of view) that
> 
>  (1) it's too expensive to actually perform this check, and

. . . unless people adopt the thing-described-by.org approach!   :)

>  (2) even if you get a 303 back you don't know that that's because of
>      something that was set up in say 2000 for some other reason. 

But again, if it's a thing-described-by.org URI then you know very
clearly what that 303 means.

>      If the TAG had defined a new error code that would have avoided
>      this particular problem.

True, though revving the HTTP spec would have other negatives, of
course.

David Booth

Received on Friday, 19 August 2005 03:38:39 UTC