Re: URI: Name or Network Location? from Patrick Stickler on 2004-02-20 (www-rdf-interest@w3.org from February 2004)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Fri, 20 Feb 2004 12:29:30 +0200
To: "ext Benjamin Nowack" <bnowack@appmosphere.com>
Cc: www-rdf-interest@w3.org, Hamish Harvey <david.harvey@bristol.ac.uk>
Message-Id: <AAB97142-638F-11D8-AE1B-000A95EAFCEA@nokia.com>
On Feb 19, 2004, at 18:30, ext Benjamin Nowack wrote:
> here we go:

Good questions. Here's how I'd answer them.

> Q1:   given a URIref of a resource, how can we get a representation of
>       that resource?

GET {URI} HTTP/1.1

> Q2:   given a URIref of a resource, how can we get a description of
>       that resource?

MGET {URI} HTTP/1.1

> Q3:   given a URIref of a resource, when we dereference that URI, how
>       do we know if we get a description or a (non-formal)
>       representation?

If you use GET, it's a representation.
If you use MGET, it's a description.

This is architecturally defined.

(whether those representations/descriptions are useful, or accurate,
  etc. are quality issues, not architectural issues)
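
A minimal sketch of the two requests, assuming plain HTTP and Python's
standard http.client. The helper names (request_resource,
get_representation, get_description) and the connection_cls parameter
are illustrative only and not part of URIQA; the only point is that the
method token in the request decides what comes back.

```python
import http.client
from urllib.parse import urlsplit


def request_resource(uri, method="GET", connection_cls=http.client.HTTPConnection):
    """Send `method` (e.g. "GET" or "MGET") for `uri`; return (status, body)."""
    parts = urlsplit(uri)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    conn = connection_cls(parts.netloc)
    try:
        # http.client passes the method token through unchanged, so an
        # extension method such as MGET needs no special client support.
        conn.request(method, target)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


def get_representation(uri, **kwargs):
    """GET: ask for a representation of the resource."""
    return request_resource(uri, "GET", **kwargs)


def get_description(uri, **kwargs):
    """MGET: ask for a description of the resource."""
    return request_resource(uri, "MGET", **kwargs)
```

A server that does not implement MGET would be expected to answer with
an error status rather than a body, so the client should check the
status before using the result.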

> Q4:   after receiving a description, how do we know if it describes the
>       dereferenced URI?

You can attempt to extract a concise bounded description of the resource
from the returned graph. If the graph is empty, the response didn't
contain a description. If the extracted description graph is not equal
to the returned graph, the response wasn't a concise bounded description
(it contained extraneous statements).
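
The check can be sketched roughly as follows. This is a simplification
under stated assumptions: a graph is a set of (subject, predicate,
object) tuples, blank nodes are strings starting with "_:", and
reification statements (which a full concise bounded description would
also follow) are ignored. The function names are made up for this
example.

```python
def is_blank(node):
    """Blank nodes are modelled here as strings starting with '_:'."""
    return isinstance(node, str) and node.startswith("_:")


def concise_bounded_description(graph, resource):
    """Collect statements about `resource`, following blank-node objects."""
    description = set()
    pending = [resource]
    visited = set()
    while pending:
        subject = pending.pop()
        if subject in visited:
            continue
        visited.add(subject)
        for triple in graph:
            s, _p, o = triple
            if s == subject:
                description.add(triple)
                if is_blank(o):
                    # Blank nodes have no URI of their own, so their
                    # statements belong to the enclosing description.
                    pending.append(o)
    return description


def classify_response(graph, resource):
    """Apply the test described above to a returned graph."""
    graph = set(graph)
    description = concise_bounded_description(graph, resource)
    if not description:
        return "no description"
    if description != graph:
        return "extraneous statements"
    return "concise bounded description"
```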

> Q5:   given a URIref of a resource, when we dereference that URI, how
>       can we be sure that we get a description only?

Presuming you use MGET, same answer as the answer to Q4.

> Q6:   given that representations have a dereferenceable URIref on the
>       web, how do we make a description of that resource available?

By introducing explicit machinery into the architecture which makes
the distinction between representation and description unambiguous
in the request.

The URIQA methods MGET, MPUT, and MDELETE accomplish this.

> Q7:   the description of a resource is a representation on its own, how
>       can we talk about descriptions (= descriptions of descriptions)?

Each server should include in the response header a URI which denotes
the entity returned. Thus, each concise bounded description of a
resource returned by an MGET request should be denoted by a URI distinct
from the URI of the described resource, and that distinct URI should be
specified in the response header.

Agents can then use that distinct URI to make statements about, or
request descriptions of, the description itself.
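
As a rough illustration only: the header name below
(X-Description-URI) is hypothetical, chosen for this sketch and not
taken from any specification, and description_uri is a made-up helper.
It picks the URI of the description out of the response headers and
rejects a value that merely repeats the URI of the described resource.

```python
DESCRIPTION_URI_HEADER = "X-Description-URI"  # hypothetical header name


def description_uri(headers, described_uri):
    """Return the URI denoting the returned description, or None."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == DESCRIPTION_URI_HEADER.lower():
            value = value.strip()
            # The description must not share the URI of what it describes.
            if value and value != described_uri:
                return value
    return None
```

An agent could then issue a further MGET on the returned URI to obtain
a description of the description.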

(btw, I wrote the above before reading your answers... tabula rasa, etc.)

>
>
> re Q1 (getting some representation):
> (at this stage we are not interested in what we actually get back.)
> 1) do a simple HTTP GET on the URI. follow http redirects etc.
> 2) utilize a cache, caching service, or whatever, if you like
> no conflicting opinions on that I guess.
> classical Web approach.
>
> ---------------
>
> re Q2 (trying to get a description):
> (at this stage we still don't really look at the resulting data.)
> 1) URIQA approach:
>    do an MGET on the URI

It's so simple.   ;-)

>
> 2) client header approach
>    do a GET and send an "Accept: application/rdf+xml" header or
>    a special type of User-Agent header that the server can detect.

Problems:

a) How do you know that the server has understood the header?

    solution: server reiterates special header in response to indicate
              it was understood in request

b) How do you prevent the server from returning a huge representation
    when it doesn't understand the special header?

    solution: make two requests, first a HEAD and then a GET
              [IMO, unacceptable]


>
> 3) rdf autodiscovery approach
>    do a GET, extract a <link rel="meta" type="application/rdf+xml"
>    href="" /> (or rel="alternate") tag in the resulting data and
>    dereference this/these URI/URIs.

Fragile, encoding-specific, and, worst of all, requires downloading the
entire representation to get the link to the metadata description.
[unacceptable]

An alternative approach is to have that link in the HTTP response header,
but again, that requires two requests, one to get the link, and another
to get the description. [unacceptable]
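
To make the cost of autodiscovery concrete, here is a small sketch
using Python's standard html.parser. MetaLinkFinder and
find_description_links are illustrative names, and only rel="meta" is
handled (not rel="alternate"). Note that the client must already have
fetched the HTML representation before this can run, and that it only
works when the representation is HTML at all.

```python
from html.parser import HTMLParser


class MetaLinkFinder(HTMLParser):
    """Collect href values of <link rel="meta" type="application/rdf+xml">."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if "meta" in rels and attributes.get("type") == "application/rdf+xml":
            href = attributes.get("href")
            if href:
                self.links.append(href)


def find_description_links(html_text):
    """Return the description links found in an HTML document."""
    finder = MetaLinkFinder()
    finder.feed(html_text)
    return finder.links
```

Each link found still has to be dereferenced with a second request.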

>
> 4) remote registry approach
>    query some (central) service that stores descriptions (SOAP/GET/...)

How do you know what service(s) to query for an *authoritative*
description when all you have is a URI? [unacceptable]

>
> 5) local service approach
>    query a service hosted at the server of the URIref.
>

Requires two requests. First an OPTIONS to get the URI of the service,
and then a request for the description. [unacceptable]

> 6) local "metadata file" approach
>    GET an agreed-on file that carries a description (something similar
>    to robots.txt)
>

Violates the sanctity of namespace ownership as well as URI opacity.
[unacceptable]

> 7) URI extension approach
>    Add an agreed-on parameter (e.g. ?format=rdfxml) to the URI in
>    question and do a GET on the resulting URI.

Violates the sanctity of namespace ownership as well as URI opacity.
[unacceptable]

>
> 8) Embedded RDF approach(es)
>    do a GET and extract embedded RDF from the resulting data
>

Fragile, encoding-specific, and, worst of all, requires downloading the
entire representation to get at the metadata description. [unacceptable]

> 9) server header approach
>    do a GET/HEAD on the URI and look for an agreed-on HTTP header in
>    the resulting data, which points to the description(s). do a GET on
>    this/these URI/URIs then.

Requires two requests. [unacceptable]

>
> 10) offline approach
>     buy the book "1000 essential SemWeb addresses" and look up the
>     description via the alphabetically sorted URI index..

For SW agents, requires one to scan in the content and use OCR to
make it machine readable -- but there is no guarantee it will be
machine understandable. [unacceptable]  ;-)

>
> ---------
>
> re Q3 (description or representation)
>
> for any of the approaches above, we can (have to?) check the returned
> data for (valid) RDF.
>
> ---------
>
> re Q4 (wanted description)
>
> there is some rdf:about= or rdf:ID="[dereferenced URI]" in the
> description.
> for approach 1, 2, 3, 4, 5, 7, 9 a solution, which makes sure that only
> related rdf is served, _can_ be implemented (or is central part of the
> approach, e.g. 1, 4, 5, maybe 9 as well). for approach 1, 2, 4, 5, 7,
> one _can_ implement a solution that doesn't need multiple requests,
> approach 1, 4, 5 could generally offer/standardize such a feature.
>
> ---------
>
> re Q5 (request description only)
>
> this _can_ be implemented with approaches 1, 2, 4, 5, 6, 7, 9.
> for 3 and 8, we have to read at least a part of the representation
> (e.g. the html head tag). not knowing which approach (if it does at
> all) a semantic site follows, we can never be sure that we get back
> rdf. http headers can help saving bandwidth (not found, not
> implemented, etc.) approaches 1, 4, 5 (can) have an integrated
> save-bandwidth feature.
>
> some of the approaches (2, 3, 7, 8) above get complicated when a URI
> identifies binary resources (imgs, etc.). one solution to this could
> be url rewriting.
>
> ---------
>
> re Q6 (server perspective, descriptions of deref'able resources)
>
> 1, 4, 5, 6, (and 10!) work for any resource.
> 2, 7, 9 work for dynamically generated/rewritten representations
> 3 works for text documents
> 8 works for certain resources (xmp, exif etc.)
>
> ---------
>
> re Q7 (descriptions of descriptions)
> (assuming that we don't combine different approaches which
> puts the problem just on another level)
>
> I'm not sure, but I think 1 returns some sort of URI for
> the MGET which can be used for a separate MGET.

Correct.

> a similar method
> could be used for 4 and 5.
> 7 is a little bit more complicated as we can't use
> uri?format=rdfxml&format=rdfxml. an alternative is to use a
> changing argument, e.g.
> uri?format=rdfxml describes uri
> uri?format=rdfxml2 describes uri?format=rdfxml
> uri?format=rdfxml3 describes uri?format=rdfxml2
> ...
>
> hm, 9 could work with dynamically generated headers, too.
>
> 10: "reviews of '1000 essential SemWeb addresses'". perfect.
>
> ---------
>
> there are lots of additional questions, limitations, or requirements
> one might have, e.g.
> - not wanting to use content-negotiation

Right, because concise bounded resource descriptions can be expressed
in a variety of forms, not just RDF/XML. If conneg is used to
differentiate between representation and description, it cannot be used
to interact with different encodings of a description.
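
A sketch of how the two concerns stay separate on the wire, assuming
the method selects representation versus description and the Accept
header remains free to select an encoding. build_request is a made-up
helper, and the media types shown are only example values.

```python
def build_request(method, host, path, accept):
    """Return the raw text of a minimal HTTP/1.1 request."""
    lines = [
        f"{method} {path} HTTP/1.1",
        f"Host: {host}",
        f"Accept: {accept}",
        "",  # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines)


# Same resource, same method, two encodings of the description.
rdfxml_request = build_request("MGET", "example.org", "/thing", "application/rdf+xml")
n3_request = build_request("MGET", "example.org", "/thing", "text/n3")
```

Had Accept been spent on signalling "description, please", there would
be nothing left in the request to choose between the two encodings.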

> - not wanting to replace apache (hm, maybe there will be a mod_uriqa?)

It's on my TODO list... (though anyone is welcome to beat me to it ;-)

> - having a hosted server with no root access
> - not being able to use url rewriting
> - wanting an authoritative description
> - static pages only
> - (x)html should validate
> - ...
>
> but I think I already wrote too much.

Not at all. This was IMO a very productive exercise.

> hope this helps someone (or me
> a few months later) a little bit and does not lead to another endless
> thread ;-)
>
> /me needs a break now.

;-)

Patrick



>
> benjamin
>
> --
> Benjamin Nowack
>
> Kruppstr. 82-100
> 45145 Essen, Germany
>
>
> [1] http://esw.w3.org/topic/SlashRedirection
>
>

--

Patrick Stickler
Nokia, Finland
patrick.stickler@nokia.com
Received on Friday, 20 February 2004 05:58:17 UTC