proposal to drop DESCRIBE from SPARQL

(this is a personal review comment, like my others; it shouldn't
be mistaken for a SWBPD or SWIG request for changes to SPARQL).

I hereby propose you drop the DESCRIBE construct from SPARQL, 
and rework that part of the spec to show how queries can be 
written which ask for RDF documents in terms of their 
topics and other properties.

Similar (perhaps in some ways better) functionality can be achieved by 
simply asking SPARQL questions which give as their answers 
references to RDF/XML documents.

Refs: 
published WD of 2004-10-12,
http://www.w3.org/TR/2004/WD-rdf-sparql-query-20041012/#describe

live editor's copy at time of writing,
http://www.w3.org/2001/sw/DataAccess/rq23/#describe

The spec has been re-organized a bit, but the basic idea seems 
the same as published in the October WD, which is to provide a 
loose and flexible mechanism by which a server can return a 
useful bundle of information about some entities identified 
via query expressions. I should be clear here that I think this 
is a valuable facility to have available in SPARQL; my only 
objection is that it doesn't need a keyword in the language.

Instead, I propose that queries which use the rdfs:seeAlso
property could serve this purpose. If that property 
doesn't quite meet your needs, please look into designs 
which use similar properties.

From http://www.w3.org/TR/rdf-schema/#ch_seealso

	5.4.1 rdfs:seeAlso

	rdfs:seeAlso is an instance of rdf:Property that is used to indicate a
	resource that might provide additional information about the subject
	resource.

	A triple of the form:

	    S rdfs:seeAlso O

	states that the resource O may provide additional information about S.
	It may be possible to retrieve representations of O from the Web, but
	this is not required. When such representations may be retrieved, no
	constraints are placed on the format of those representations.

	The rdfs:domain of rdfs:seeAlso is rdfs:Resource. The rdfs:range of
	rdfs:seeAlso is rdfs:Resource.

There are also some notes in the ESW wiki, 
http://esw.w3.org/topic/UsingSeeAlso which might be useful.

Here's the first example from section 10.3.2 of the editor's draft:

	PREFIX foaf:   <http://xmlns.com/foaf/0.1/>
	DESCRIBE ?x
	WHERE    (?x foaf:mbox <mailto:alice@org> )


Here's the seeAlso'd form I propose:

	PREFIX foaf:   <http://xmlns.com/foaf/0.1/>
	PREFIX rdfs:   <http://www.w3.org/2000/01/rdf-schema#>
	SELECT ?doc
	WHERE    (?x foaf:mbox <mailto:alice@org> )
	         (?x rdfs:seeAlso ?doc)
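
To make the intended reading of that query concrete, here is a minimal
sketch in Python, assuming a toy in-memory graph held as a set of
(subject, predicate, object) tuples. The people and document URIs are
invented for illustration; a real SPARQL service would do this matching
itself, and nothing here is taken from the spec.

```python
# Toy illustration only: a graph is a set of (subject, predicate, object) tuples.
FOAF = "http://xmlns.com/foaf/0.1/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Hypothetical data; the person and document URIs are invented for this sketch.
GRAPH = {
    ("http://example.org/people#alice", FOAF + "mbox", "mailto:alice@org"),
    ("http://example.org/people#alice", RDFS + "seeAlso", "http://example.org/alice.rdf"),
    ("http://example.org/people#bob", FOAF + "mbox", "mailto:bob@org"),
    ("http://example.org/people#bob", RDFS + "seeAlso", "http://example.org/bob.rdf"),
}


def see_also_docs(graph, mbox):
    """Return the ?doc bindings for the two patterns
    (?x foaf:mbox <mbox>) (?x rdfs:seeAlso ?doc)."""
    # First pattern: bind ?x to every subject with the given mailbox.
    people = {s for (s, p, o) in graph if p == FOAF + "mbox" and o == mbox}
    # Second pattern: join on ?x and project ?doc.
    return sorted(o for (s, p, o) in graph if p == RDFS + "seeAlso" and s in people)


print(see_also_docs(GRAPH, "mailto:alice@org"))
```

The answer is a list of document references rather than triples; the
client then decides whether to fetch any of them.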


Some notes on pros and cons:

1. the query is longer, but the query language is simpler

2. implementations barely change; as with DESCRIBE, particular datasets 
 and services may or may not have anything to offer in response
 to the query.

3. the query-based design is incrementally extensible - different 
types of description could be requested. We could constrain the 
query by (?doc dc:format "text/rdf+n3") or by reference to the 
rdf:type of the ?doc, eg. FOAF's "PersonalProfileDocument" or RSS1's 
notion of a channel, or by characteristics of the source of the 
describing statements (eg. we could ask for ?docs which have been 
digitally signed by people who match some SPARQL expression).

4. the main difference between the current keyword-based design and 
the property-centric alternative I'm proposing seems to be that in 
the former, we get the actual RDF; in the latter, we get a (potentially
dangling, 404 etc) document reference. I'd expect a common setup to be 
that these are simply references back into the same SPARQL service, 
although similar queries could (depending on nature and whims of the
dataset/service being asked) return rdfs:seeAlsos that point elsewhere 
in the Web. The property-centric approach could then mean more
round-trips to the server --- is this a problem? But it also allows 
RDF to be used to describe the documents being referenced (eg. size in
bytes, number of triples, etc) which could help clients be more 
efficient and selective. 

5. Can we SELECT from the results of a DESCRIBE in a single SPARQL 
expression? I can see something like that making sense in terms of 
partitioning work between a client and server, eg. a dumb server returns 
generic book reviews; local client selects out prices and ratings. If this 
sort of thing is important, perhaps it is evidence for the existing 
keyword-based approach. But I'm not sure DESCRIBE works like that
currently. There's also some relationship to N3's 'log:semantics'
design, perhaps.  In N3, I believe I could query using rdfs:seeAlso
expressions, and then dereference the RDF to populate a queriable 
context. I expect that in SPARQL, such things will get done in 
application code rather than in the QL.

6. How to deal with multiple topics?

Another example (from Editor's copy)

[[	More than one URI or can be given:

	PREFIX foaf:   <http://xmlns.com/foaf/0.1/>
	DESCRIBE ?x ?y <http://example.org/>
	WHERE    (?x foaf:knows ?y)
]]

This time we're asking for a chunk of RDF that describes 
things that are in a foaf:knows relationship and a 
document, <http://example.org/>. Triples that
were "about" (in a loose sense, not just the rdf:subject sense)
either ?x, or ?y, or the doc... would be relevant. Triples which
were "about" the foaf:knows relationship connecting the two people
might be even more relevant, etc etc. The query is pretty vague.

Can this be reformulated using properties? I tried, with a 
mess of optionals. I'm sure something could be cooked up 
with seeAlso or other properties, or rdf:List or rdf:Alt. The 
sense of the query is loose enough that I doubt it's worth 
trying to capture it more formally as a complicated bunch of 
OPTIONALs. Doing so is somewhat at odds with the whole point of 
DESCRIBE anyway.


7. seeAlso is good for Semantic Web deployment

Promoting seeAlso and RDF description of RDF *documents* is
good for the Semantic Web, since it helps people find and 
cross-reference pieces of RDF data. SPARQL's "DESCRIBE" 
mechanism is purely internal to the language, currently. It 
allows me to ask a service for RDF that describes some 
particular thing, but all I get is the actual RDF. Encouraging 
that RDF to be made available at GET-able URIs, and 
described with rdfs:seeAlso and other RDF properties, is imho
an important part of getting the Semantic Web deployed, 
crawled, and indexed. It also directs some attention towards 
the important problem of characterising broad classes of RDF 
document, and the constraints (loose/prose or machine-readable, 
RDF-level or XML-level, etc) associated with them.
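
Notes 3 and 4 (constraining by dc:format, and using descriptions of the
documents to be selective before fetching) can be sketched in a few lines
of Python, assuming a toy in-memory graph of tuples. The size property is
hypothetical and its name is invented for this sketch, as are the document
URIs; dc:format is the Dublin Core element mentioned in note 3.

```python
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
DC_FORMAT = "http://purl.org/dc/elements/1.1/format"
# Hypothetical property, invented for this sketch; no real vocabulary is implied.
EX_SIZE = "http://example.org/terms#sizeInBytes"

ALICE = "http://example.org/people#alice"

# Hypothetical data: two documents describing the same person.
GRAPH = {
    (ALICE, RDFS_SEE_ALSO, "http://example.org/alice.n3"),
    (ALICE, RDFS_SEE_ALSO, "http://example.org/alice-full.rdf"),
    ("http://example.org/alice.n3", DC_FORMAT, "text/rdf+n3"),
    ("http://example.org/alice.n3", EX_SIZE, 2048),
    ("http://example.org/alice-full.rdf", DC_FORMAT, "application/rdf+xml"),
    ("http://example.org/alice-full.rdf", EX_SIZE, 900000),
}


def objects(graph, subject, predicate):
    """All objects of triples matching (subject predicate ?o)."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


def describing_docs(graph, topic, fmt=None, max_bytes=None):
    """Documents linked from topic by rdfs:seeAlso, optionally constrained
    by dc:format and by the (hypothetical) size property."""
    selected = []
    for doc in sorted(objects(graph, topic, RDFS_SEE_ALSO)):
        if fmt is not None and fmt not in objects(graph, doc, DC_FORMAT):
            continue
        sizes = objects(graph, doc, EX_SIZE)
        # A document with no stated size is kept; the client cannot rule it out.
        if max_bytes is not None and sizes and min(sizes) > max_bytes:
            continue
        selected.append(doc)
    return selected


print(describing_docs(GRAPH, ALICE, fmt="text/rdf+n3"))
print(describing_docs(GRAPH, ALICE, max_bytes=10000))
```

In both calls the client learns which document to fetch without having
retrieved either one, which is the trade against the extra round-trip
raised in note 4.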



That's about it. To recap: please drop the DESCRIBE keyword and 
replace the examples with ones that use rdfs:seeAlso. 

cheers,

Dan

ps. I numbered my paragraphs to make my thoughts look ordered; don't  
suppose I fooled anyone.

pps. http://www.w3.org/1999/11/02-RDFServices/ was an attempt in 
similar vein; now obsoleted by Annotea and SPARQL. I think there 
are almost as many use cases for 'describe' functionality as there 
are for Web pages...




 

Received on Monday, 17 January 2005 13:30:11 UTC