About DESCRIBE from Seaborne, Andy on 2004-07-20 (public-rdf-dawg@w3.org from July to September 2004)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Tue, 20 Jul 2004 18:06:17 +0100
To: public-rdf-dawg@w3.org
Message-ID: <E864E95CB35C1C46B72FEA0626A2E80803984A5F@0-mail-br1.hpl.hp.com>
The DESCRIBE action is asking a "tell me about" question where the client is
not specifying the exact shape of the information to be extracted but is
asking an open question.  The results depend on which RDF repository is
asked.

For example, asking about <http://www.w3.org> when asking the CVS server
might return information about the version history.  Asking the web server
might return traffic statistics.

Sometimes, the client does not know exactly what information to expect of a
source.  When browsing the web, or browsing an RDF space, the user is
involved and controls the navigation.  One use of DESCRIBE is to grab
"meaningful" chunks of the remote graph so that it can be navigated.  

In the UC&R document, the UC "2.2 Finding Information about Motorcycle
Parts" illustrates open query.  Not only is information about the defective
part returned, but also related parts, here the mounting brackets and the
necessary screws.  The client software did not need to anticipate the
detailed structure of the graph to learn about the additional parts.

The screws also illustrate another issue: bNodes.  When information is
extracted from a graph and serialised into RDF, bNodes do not retain enough
information to be identified again.  The client would need the
"triumph:part-number" property and value to find the node again, or the
whole query would need to be reissued.

One approach to describing resources is bNode closure - include all the
properties of a node and, recursively, of any bNode objects.  This defines
subgraphs whose boundaries are literals, or URI-identified nodes, or bNodes
which are not subjects of any statement.  A general browser could navigate
more of the graph from this.

But it does not work as a general mechanism.  Consider a file of FOAF [foaf]
information - it's all bNodes.  The lookup is by well known identifying
properties, such as foaf:mbox or foaf:mbox_sha1sum, and a reasonable subgraph
to return for looking up a person would be a "person record": all the
properties of the node typed as foaf:Person, recursing over the graph
structure until another node typed as foaf:Person is found, then returning
any of its properties/values considered to be identifying.
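
The "person record" idea can be sketched in the same style.  This is only
one possible reading of the rule above: the set of identifying properties,
the tuple representation of triples and the sample data are all assumptions
made for the illustration.

```python
# Assumption: these two properties are the ones treated as identifying.
IDENTIFYING = {"foaf:mbox", "foaf:mbox_sha1sum"}

def person_record(triples, prop, value):
    """Describe the person found via an identifying property/value pair."""
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((s, p, o))
    persons = {s for s, p, o in triples
               if p == "rdf:type" and o == "foaf:Person"}
    starts = [s for s, p, o in triples
              if p == prop and o == value and s in persons]

    record, seen, stack = [], set(), list(starts)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in persons and node not in starts:
            # Another person: keep only identifying properties and stop here.
            record.extend(t for t in by_subject[node] if t[1] in IDENTIFYING)
            continue
        for s, p, o in by_subject.get(node, []):
            record.append((s, p, o))
            if o in by_subject:
                stack.append(o)  # follow objects that have properties
    return record

# Made-up FOAF-like data in which every person is a bNode.
triples = [
    ("_:a", "rdf:type", "foaf:Person"),
    ("_:a", "foaf:name", '"Alice"'),
    ("_:a", "foaf:mbox", "mailto:alice@example.org"),
    ("_:a", "foaf:knows", "_:b"),
    ("_:b", "rdf:type", "foaf:Person"),
    ("_:b", "foaf:name", '"Bob"'),
    ("_:b", "foaf:mbox", "mailto:bob@example.org"),
    ("_:b", "foaf:knows", "_:c"),
    ("_:c", "rdf:type", "foaf:Person"),
    ("_:c", "foaf:mbox", "mailto:carol@example.org"),
]
record = person_record(triples, "foaf:mbox", "mailto:alice@example.org")
```

Here the record contains all of Alice's properties plus Bob's foaf:mbox,
which is enough for a client to look Bob up in a later request.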

Examples of systems with similar (but not exactly the same) features:

Concise Bounded Descriptions [CBD].  In URIQA, there is a fixed definition
of what constitutes a "description" but it is not directly part of the
query.  A client asks for a description and gets the CBD.

[TAP]

In TAP, a query is reference-by-description which is a request to find
information items identified by some properties (possibly a more complex
graph match as well) that locate the information of interest.  The result
is one or more items of information as determined by what the KB contains.
The TAP result format is RDF/XML-like but not RDF - so the client software
can find the individual items in the result.

FetchQL (Joseki):

Joseki has a "query language" fetch that takes two forms to identify the
resource of interest: either the URI of the resource or an identifying
property/value pair.  What is returned is server configuration dependent.
Many people use bNode closure.

Try:
http://jena.hpl.hp.com:2020/books (GET the whole data)
and look at one item:
http://jena.hpl.hp.com:2020/books?lang=fetch&r=http://example.org/book/book2
which has bNodes and mixed vocabulary information.

[Aside: the "lang" value should be the URI for fetch but it gets a bit long
for an example.  The server accepts the short form - the link works.
Fetch URI: http://jena.hpl.hp.com/2003/07/query/fetch]
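
For client code, the two request forms can be built as below.  This is a
sketch using only the Python standard library; the helper name is invented,
and the long form assumes that the fetch URI is passed as the "lang" value,
which is how the aside above reads.

```python
from urllib.parse import urlencode

SERVER = "http://jena.hpl.hp.com:2020/books"
FETCH_URI = "http://jena.hpl.hp.com/2003/07/query/fetch"

def fetch_url(resource, lang="fetch"):
    # safe=":/" leaves the URIs readable, matching the form in the message;
    # fully percent-encoding them would be the more conservative choice.
    return SERVER + "?" + urlencode({"lang": lang, "r": resource}, safe=":/")

short_form = fetch_url("http://example.org/book/book2")
long_form = fetch_url("http://example.org/book/book2", lang=FETCH_URI)
```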

It's like GET for a part of the graph.  The DESCRIBE action can have
variables and a general pattern to find the item(s) of interest.


Collections:

Related to DESCRIBE might be how to handle structures like RDF collections.
It is hard to grab a whole RDF collection (a cons cell list) because of the
indeterminate length.  People writing ontology editors seem to use the
part-GET-like mechanism to retrieve class descriptions.  I don't know how it
works out but we will need to think about retrieving RDF collections.
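
To make the difficulty concrete: a collection is a chain of cons cells
linked by rdf:first and rdf:rest, so its length is only known after walking
it to rdf:nil.  A minimal sketch, assuming the same tuple representation of
triples as a stand-in for a real graph API (the function name and data are
invented):

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"

def collection_items(triples, head):
    """Walk the cons cells from `head` and return the members in order."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}

    items, seen, node = [], set(), head
    # Stop at rdf:nil, at a malformed cell, or if the list loops back.
    while node != RDF_NIL and node in first and node not in seen:
        seen.add(node)
        items.append(first[node])
        node = rest.get(node, RDF_NIL)
    return items

# Made-up three-element list built from bNode cons cells.
triples = [
    ("_:l1", RDF_FIRST, "ex:a"), ("_:l1", RDF_REST, "_:l2"),
    ("_:l2", RDF_FIRST, "ex:b"), ("_:l2", RDF_REST, "_:l3"),
    ("_:l3", RDF_FIRST, "ex:c"), ("_:l3", RDF_REST, RDF_NIL),
]
items = collection_items(triples, "_:l1")
```

A fixed graph pattern can match only a fixed number of these cells, which is
why a closure-style mechanism is the natural fit for retrieving the whole
list.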

	Andy


[foaf]
http://www.foaf-project.org/

[CBD]
http://sw.nokia.com/uriqa/URIQA.html

[TAP]
Home: http://tap.stanford.edu/
Reference-by-description: http://tap.stanford.edu/tap/rbd.html
Protocol examples: http://tap.stanford.edu/tap/protocol/examples.html
Received on Tuesday, 20 July 2004 13:06:58 UTC