- From: Giovanni Tummarello <giovanni@wup.it>
- Date: Sat, 19 Feb 2005 17:38:29 +0100
- To: public-rdf-dawg@w3.org
Hello all,

I have just seen the announcement of the SPARQL working draft and decided to take a look at what is likely to become one of the most used pieces of the SW. I have a few observations, in the hope that they are useful.

a) Supporting something much better than named graphs.

No matter how starkly and arbitrarily HP likes to state in its papers that "the semantic web is a collection of namable RDF graphs", the truth is different: there is no consensus on this. I therefore can't understand such wide support for this construct. Another way of seeing it might just as well be that the semantic web is made of the web of URIs and of the statements that anyone makes about them. In fact, RDF is defined to be monotonic, so in theory everything could be merged. Should everything be merged? Yes, as long as the "context" of the statements is somehow preserved. Then, thanks to an application-defined concept of context that answers application-specific requirements, one could decide later what to consider.

Named graphs are *one* way to provide context, and clearly a non-standard one. Others have proposed quadruples, others quintuples; why not support them as well? Context could be a fuzzy "certainty" value attached to a statement (select just the statements that are more than 90% certain), or the date the statement was first issued, or the name of the person who first made the statement (e.g. "giovanni is an alien"), independently of which graph the statement now belongs to. It is obviously useless to list them all: they are contexts, they are application specific, they all might be useful and should be supported.

Now, should the SPARQL specification be flooded with specific syntactic constructs to support them all? Obviously not. But it might be a start to support the only official construct for talking about triples, namely reification, which is not even mentioned in the current draft.
Wouldn't it be fairly simple to add an automatic binding to the "statement" as a 4th node for each triple, and to bind this to the reification node(s)? Example:

  SELECT ?name ?mbox ?date
  WHERE (?g dc:publisher ?name ?triplecontext)
        (?g dc:date ?date)
        (?triplecontext fuzzyont:certainty ?fuzzyval)
  and ?fuzzyval < 0.8

with ?triplecontext binding to the reification node of the said triple (if any).

If you really like named graphs, then the GRAPH construct simply becomes:

  SELECT ?name ?mbox ?date
  WHERE (?g dc:publisher ?name ?triplecontext)
        (?g dc:date ?date)
        (?triplecontext namedGraphs:belongsto "http://example.com/mynamed.rdf")

Another way to express this syntactically could be with a reificationnode(statement) function or a binding, say:

  SELECT ?name ?mbox ?date
  WHERE ?A(?g dc:publisher ?name)
        (?g dc:date ?date)
        (?A namedGraphs:belongsto "http://example.com/mynamed.rdf")

Nice? :-) Or with a function reificationnode(s o p).

My impression is that this would cut a large number of pages from the specification (all the constructs specifically devoted to named graphs) AND allow the context models mentioned above. (Side note: yes, this means many more triples, but it is just a factor of, say, K, and only when serializing (assuming a more efficient serialization can't be devised, which is false); inside a DB the context would obviously be coded in an efficient way.)

While probably a good start, supporting a useful context construct probably requires more, which leads to the second point.

b) I see there is some support for something similar to the CBD. This seems a very good idea; CBDs are bound to become very useful. But please, I suggest supporting a very useful subset of the CBD that we call MSG in [1], a Minimum Self-contained Graph. Basically it is a blank node closure on a given starting statement (not a node). A CBD is simply the union of all the MSGs involving a starting URI (see also [3] for a complete discussion).
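A minimal Python sketch of the MSG idea as described above may help. It is only a reading of the one-sentence definition in this message, not the algorithm from [1]. Triples are plain string tuples, blank nodes are assumed to be written "_:name", and the names `msg`, `union_of_msgs` and `EXAMPLE` are made up for illustration.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings


def is_blank(term: str) -> bool:
    # Assumption: blank nodes are written N-Triples style, e.g. "_:b1".
    return term.startswith("_:")


def msg(start: Triple, graph: Iterable[Triple]) -> Set[Triple]:
    """Blank node closure of the statement `start` within `graph`."""
    triples = set(graph)
    result = {start}
    pending = [start]
    visited = set()  # blank nodes that have already been expanded
    while pending:
        s, _, o = pending.pop()
        for node in (s, o):
            if not is_blank(node) or node in visited:
                continue
            visited.add(node)
            # Pull in every statement that mentions this blank node.
            for t in triples:
                if node in (t[0], t[2]) and t not in result:
                    result.add(t)
                    pending.append(t)
    return result


def union_of_msgs(uri: str, graph: Iterable[Triple]) -> Set[Triple]:
    """Union of the MSGs of all statements that mention `uri`.

    This follows the wording of the message ("all the MSGs involving a
    starting URI"), not the text of the CBD definition itself.
    """
    triples = set(graph)
    out: Set[Triple] = set()
    for t in triples:
        if uri in (t[0], t[2]):
            out |= msg(t, triples)
    return out


# Hypothetical example data.
EXAMPLE = {
    ("ex:giovanni", "foaf:knows", "_:b1"),
    ("_:b1", "foaf:name", "Alice"),
    ("_:b1", "foaf:mbox", "mailto:alice@example.com"),
    ("ex:giovanni", "foaf:name", "Giovanni"),
    ("ex:doc", "dc:creator", "ex:giovanni"),
}
```

A statement with no blank nodes is its own MSG, while a statement touching `_:b1` drags in every other statement about `_:b1`; this is what makes an MSG the smallest unit that can be passed around without cutting a blank node in half.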
MSGs are important because of the decomposition properties they have (see the paper for some theory) and because they represent the minimum information contribution that can be passed from one peer to another in a distributed system. In our case we use the MSG theory to support context information without the need to reify each statement, and in turn we use this context node (a reification of any arbitrary triple of the MSG) to provide a digital signature INSIDE the RDF model, so that the provenance of each statement can be tracked without the need for named graphs. Note that this has been said to be impossible in [2]: "As discussed in [X], it is necessary to keep the graph that has been signed distinct from the signature, and other metadata concerning the signing, make about which information to trust.", where X is the Carroll serialization paper, which doesn't make that claim (as far as I know).

Anyway, MSG support could come as a way of selecting statements, which would then require some operators to work with sets. Checking a context in the model, as highlighted by the paper, would be simple given the ability to deal with sets of statements (an IN operator?):

  SELECT ?name ?mbox ?date
  WHERE ?A(?g dc:publisher ?name)
        (?g dc:date ?date)
  where (?x namedGraphs:belongsto "http://example.com/mynamed.rdf") IN msg(?A)

To conclude, I get the impression it would be beneficial to clearly define the support for named graphs in a SPARQL extension. RDF has been given resource-centric APIs, statement-centric ones, ontology-centric ones. That is all OK; it is all in line with the consensus and the recommendations. But making what is basically a "file-centric" approach such a fundamental part of the QL seems primitive, to say the least. Please, someone convince me of the contrary :-)

Thanks for your attention. Please note that I am posting after reading a few threads on the ML, but certainly not all of them; my apologies if I am disregarding some major post, which I would be happy to know about.
Giovanni

[1] http://giovanni.ea.unian.it/temp/WWW2005_signignRDF.pdf

[2] Jeremy Carroll, Christian Bizer, Patrick Hayes, Patrick Stickler: Named Graphs, Provenance and Trust <../../bizer/pub/Carroll_etall-WWW2005.pdf>, at The Fourteenth International World Wide Web Conference (WWW2005), Chiba, Japan, May 2005.

[3] Giovanni Tummarello, Christian Morbidoni, Joakim Petersson, Francesco Piazza, Mauro Mazzieri, Paolo Puliti: Toward widely deployable Semantic Web P2P: tools, definitions and the RDFGrowth algorithm. http://giovanni.ea.unian.it/temp/RDFGROWth_workshopISWC2004.pdf
Received on Saturday, 19 February 2005 16:39:13 UTC