RE: SemWeb Non-Starter -- Distributed URI Discovery

"The other approach that springs to mind, not as far as I'm aware
standardized or particularly deployed (but I'm sure some folks will be
using it) is to use a special query, maybe:

http://example.org/food/blah?about#

to get the lowdown on http://example.org/food/blah

It would be relatively straightforward to implement, e.g. get Apache
to redirect to a query on a triplestore."

This is an interesting solution, though I definitely agree that it would
restrict the URI creator/originator's freedom.  What if we used an existing
feature of HTTP to handle this instead?  I'm thinking of the Accept request
header.  Here's a snippet from the RFC
(http://www.w3.org/Protocols/rfc2068/rfc2068):

"
The Accept request-header field can be used to specify certain media
   types which are acceptable for the response. Accept headers can be
   used to indicate that the request is specifically limited to a small
   set of desired types, as in the case of a request for an in-line
   image.
"

I think it should be feasible to issue this sort of request:

GET /food/blah HTTP/1.1
Host: example.org
Accept: application/rdf+xml

In theory, this should return a representation of /food/blah *only* as
application/rdf+xml.  It should return a different result from a "regular"
HTTP request for HTML:

GET /food/blah HTTP/1.1
Host: example.org
Accept: text/html, */*


I know that many servers don't respect the Accept: header, but it sure seems
like it is designed to supply different types of media for identical URLs.
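
A minimal sketch of the idea in Python, using the URI from the quoted message.
The helper names build_request and choose_type are made up for illustration,
and the server-side matching is deliberately naive (it ignores q-values), so
treat it as a rough picture of content negotiation rather than a conforming
implementation:

```python
from urllib.request import Request


def build_request(uri, media_type):
    """Build (but do not send) a GET request that asks for one media type."""
    return Request(uri, headers={"Accept": media_type}, method="GET")


def choose_type(accept_header, available):
    """Naive server-side negotiation: return the first listed type we have.

    Assumption: q-values are ignored, and '*/*' falls back to the server's
    first (default) type. Returns None if nothing matches.
    """
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type == "*/*":
            return available[0]
        if media_type in available:
            return media_type
    return None  # a real server would probably answer 406 Not Acceptable


# The two requests from the message: same URI, different Accept headers.
rdf_request = build_request("http://example.org/food/blah", "application/rdf+xml")
html_request = build_request("http://example.org/food/blah", "text/html, */*")
```

With both representations available on the server, the first request would
select application/rdf+xml and the second text/html, which is the behaviour
described above.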

Thoughts?

dave

-----Original Message-----
From: Danny Ayers [mailto:danny.ayers@gmail.com] 
Sent: Monday, March 28, 2005 2:04 PM
To: Stephen Rhoads
Cc: semantic-web@w3.org
Subject: Re: SemWeb Non-Starter -- Distributed URI Discovery



On Mon, 28 Mar 2005 10:49:42 -0500, Stephen Rhoads <rhoadsnyc@mac.com>
wrote:
> 
> >>Benjamin Nowack wrote:
> >see my reply to max, stephen's semweb agent unfortunately doesn't
> >have an rdfs:seeAlso yet. All it starts with is a single URI.
> >
> >Imagine a bot with a T-Shirt:
> >[[
> >   I was lost on the Semantic Web,
> >   and all they gave me was this stupid URI.
> >]]
> >Now, what should the bot do next?
> 
> Exactly.
> 
> A SemWeb Agent discovers some URI which, given the context in which it was
> found, looks interesting or otherwise important.  The Agent wants to obtain
> more information about the URI (an rdf:type would be nice, for starters) but
> doesn't know where to go to obtain the information.
> 
> There needs to be a simple, straightforward solution to this problem.
> Imagine a SemWeb-enabled phone or PDA; should that device be required to
> follow rdfs:seeAlso links or spider a website in search of information about
> a URI?

Ok, just for the record there are at least four potential solutions
available, some even deployed (to a small extent), which don't really
need any new machinery -

There's "autodiscovery" :

Given a URI, you do an HTTP GET on it, asking for anything vaguely
HTML-like. In the <head> block you look for something like:

<link rel="meta" type="application/rdf+xml" title="FOAF" href="foaf.rdf" /> 

There are a load of notes on the FOAF Wiki [1]. Several of the
blogging tools incorporate an RSS version of this - the feed may or
may not be RDF/XML, there's a quasi-spec for RSS at [2], a more
thorough spec for Atom at [3].
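
As a rough illustration of the autodiscovery step, here is a sketch using only
the Python standard library. The names MetaLinkFinder and discover_rdf are
hypothetical, the page is passed in as a string rather than fetched, and the
check only covers the rel="meta" / application/rdf+xml pattern shown above:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaLinkFinder(HTMLParser):
    """Collect href values of <link rel="meta" type="application/rdf+xml">."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Note: this does not check that the <link> sits inside <head>.
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        is_rdf = attr.get("type") == "application/rdf+xml"
        if "meta" in rels and is_rdf and attr.get("href"):
            self.hrefs.append(attr["href"])


def discover_rdf(page_uri, html_text):
    """Return absolute URIs of the RDF documents advertised by the page."""
    finder = MetaLinkFinder()
    finder.feed(html_text)
    # Relative hrefs such as "foaf.rdf" are resolved against the page URI.
    return [urljoin(page_uri, href) for href in finder.hrefs]
```

An agent would then fetch each returned URI and look in the RDF for statements
about the URI it started from.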

Another solution, slightly deployed, is RDDL [4]:
[[
This document describes the Resource Directory Description Language
(RDDL). A RDDL document, called a Resource Directory, provides a
package of information about some target, including:

    * Human-readable descriptive material about the target.
    * A directory of individual resources related to the target, each
directory entry containing descriptive material and linked to the
resource in question.
]]
There is an RDF Schema, though the primary format is an XHTML module -
lots of XLink involved. RDDL is a nice idea, but does look rather
over-engineered compared to, errm, say autodiscovery.

Somewhere around this space there's also GRDDL [5], where you could
supply an XHTML doc at the given URI (or whatever XML format you
liked), and a <link rel="..." ...> tells you how to turn it into
RDF/XML.

The assumption in all these cases is that the RDF obtained will
contain some information about the given URI. There's no guarantee
this will be the case, but in most practical cases it would make sense
to include some association (RSS has channel & link, which may point
to the RDF/XML and URI of interest i.e. 'home page').

The other approach that springs to mind, not as far as I'm aware
standardized or particularly deployed (but I'm sure some folks will be
using it) is to use a special query, maybe:

http://example.org/food/blah?about#

to get the lowdown on http://example.org/food/blah

It would be relatively straightforward to implement, e.g. get Apache
to redirect to a query on a triplestore.
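
A small sketch of how a client might build such a query URI. The function
about_uri and the "about" marker are illustrative only, since nothing here is
standardized. The trailing "#" from the example is left out on the assumption
that a client would not send the fragment to the server anyway:

```python
from urllib.parse import urlsplit, urlunsplit


def about_uri(uri, marker="about"):
    """Return the hypothetical 'tell me about this URI' query form of uri.

    Any existing query string is kept and the marker is appended to it.
    The fragment is dropped, which is a limitation for hash URIs.
    """
    parts = urlsplit(uri)
    query = f"{parts.query}&{marker}" if parts.query else marker
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


lookup = about_uri("http://example.org/food/blah")
```

On the server side, a request whose query string carries the marker would be
the one handed off to the triplestore.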

The argument against this general approach is that it messes with the
URI creator's freedom to do what they like with their space. But maybe
putting something in the query would be less of a Web-breaker than e.g.
robots.txt. Anyhow, check the URIQA doc [6] for why nothing will work
(except URIQA).

Whatever, any solution would have to be consistent with the URI spec
[7] and be tolerated by webarch [8], (and allow slashes and hashes for
RDF properties).

At a slight tangent, what would be nice would be an RDF protocol that
could support the very minimum of distributable queries: retrieve
descriptions of any resource given the URI, any URI. The distributable
part could be provided by a response along the lines of "I know little
about that resource, but the agent behind this other URI might be able
to help".
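
To make the referral idea concrete, here is a toy in-memory sketch. Everything
in it is an assumption for illustration: agents are plain Python callables
keyed by a made-up agent URI, and each answer is a dict with "triples" and an
optional "referral". No real protocol or library is implied:

```python
def describe(uri, start_agent, agents, max_hops=5):
    """Follow 'ask this other agent' referrals until a description turns up.

    Returns a list of triples, or None if no agent could help.
    """
    agent, seen = start_agent, set()
    for _ in range(max_hops):
        # Stop on unknown agents, missing referrals, and referral loops.
        if agent in seen or agent not in agents:
            return None
        seen.add(agent)
        answer = agents[agent](uri)
        if answer.get("triples"):
            return answer["triples"]
        agent = answer.get("referral")
    return None


def agent_a(uri):
    # Knows little about the resource, but points at another agent.
    return {"triples": [], "referral": "http://example.org/agent-b"}


def agent_b(uri):
    # Hypothetical description; the class URI is invented for the example.
    return {"triples": [(uri, "rdf:type", "http://example.org/food/Dish")]}


agents = {
    "http://example.org/agent-a": agent_a,
    "http://example.org/agent-b": agent_b,
}
result = describe("http://example.org/food/blah", "http://example.org/agent-a", agents)
```

The hop limit and the set of already-visited agents are there so that a chain
of unhelpful referrals cannot keep the client busy forever.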

Cheers,
Danny.

[1] http://rdfweb.org/topic/Autodiscovery
[2] http://diveintomark.org/archives/2002/06/02/important_change_to_the_link_tag
[3] http://diveintomark.org/rfc/draft-pilgrim-atom-autodiscovery-02.html
[4] http://www.rddl.org/
[5] http://www.w3.org/2004/01/rdxh/spec
[6] http://sw.nokia.com/uriqa/URIQA.html
[7] http://www.ietf.org/rfc/rfc3986.txt
[8] http://www.w3.org/TR/webarch/

-- 

http://dannyayers.com

Received on Tuesday, 29 March 2005 20:51:47 UTC