Re: Comments on first working draft of SPARQL

[Continuing a discussion on the place of b-nodes in queries.]

From: "Seaborne, Andy" <andy.seaborne@hp.com>
Subject: Re: Comments on first working draft of SPARQL
Date: Sun, 07 Nov 2004 17:16:51 +0000

[...]

> >>>How are blank nodes handled in triple patterns?  For example, does the
> >>>triple pattern
> >>>	( ?x ex:r _:v )
> >>>match the RDF graph
> >>>	ex:a ex:r _:a .
> >>>	ex:a ex:r _:b .
> >>
> >>Your comments suggest that a section devoted to the details around
> >>bNodes would be helpful.  This has been started in the editors' working
> >>draft.
> >>
> >>The query syntax does not allow bNodes in queries. bNodes can not be put
> >>in query requests and that needs to be explained somewhere.
> > 
> > 
> > The working draft has explicit wording to the contrary.
> 
> This definition is not the syntax for the language. The definitions at 
> this point of the document set up terminology that works on patterns in 
> queries.  These graph patterns can be combined to produce other patterns 
> so allowing bnodes helps this if this is thought of as subqueries.  More 
> below.

HUH???????????????????????

Then just what is this definition supposed to be about?
To pick just a few bits of the section (in context, I hope):

	2.2 Triple Patterns

	The building blocks of queries are triple patterns. 
	...
	A triple pattern applied to a graph ...
	...
	Definition: Triple Pattern 
	The set of triple patterns is [something that allows blank nodes in
	both the subject and object positions].

This sure sounds like it is describing the language, particularly in the
absence of any other prose on the subject.

> If you have suggestions for improving the approach taken in the document, 
> please let me know.

Yes, please do not start out with a description of something that is not
part of the language.  If bnodes are indeed not part of the language, then
they should not be mentioned in any context related to the language.

[...]

> > The SPARQL grammar appears to agree with these definitions.  Of course, that
> > grammar is not very well written, as it makes literals include URIs.
> 
> 'Literal' in the grammar isn't an RDF literal - it's a constant term.  We 
> will change the wording.  

Good.

> Constants are URIs, RDF plain literals, typed 
> RDF literals and the convenience forms for xsd:integers and xsd:doubles.
> 
> [39]  	Literal  	 ::=  	URI | NumericLiteral | TextLiteral
> 
> (TextLiterals include typed RDF literals - that could be better named)
> 
> 
> The production for a TriplePattern is:
> 
> [16]  	TriplePattern  	 ::=  	'(' VarOrURI VarOrURI VarOrLiteral  ')'
> [17]  	VarOrURI  	 ::=  	<VAR> | URI
> 
> so RDF literals are not allowed as subjects.  In the syntax of the 
> language, bNodes can't appear.

On the contrary, the syntax of the language permits a URI to be a bNode (or
at least something of the form of a bnode) via

[43]	URI		 ::= 	QuotedURI | QName
[44]	QName		 ::=	<QName>
[48]	<QName>		 ::=	(<NCNAME>)? ":" <NCNAME>
[62]	<NCNAME>	 ::=	<NCCHAR1> (<NCCHAR1> | "." | "-" | ["0"-"9"] | "\u00B7" )*
[61]	<NCCHAR1>	 ::=	["A"-"Z"] | "_" | ...

So a URI can be, for example, _:A
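
The derivation above can be checked mechanically.  What follows is only a
rough sketch, not the SPARQL grammar itself: it turns productions [48], [62]
and [61] into a regular expression, and it models only the two alternatives
of <NCCHAR1> that are quoted above (["A"-"Z"] and "_"); the elided
alternatives are left out.  The names NCCHAR1, NCNAME, QNAME and is_qname
are mine, chosen for illustration.

```python
import re

# Only the alternatives of <NCCHAR1> quoted in the message are modelled:
# ["A"-"Z"] and "_".  The alternatives elided as "..." are deliberately omitted.
NCCHAR1 = r"[A-Z_]"

# [62] <NCNAME> ::= <NCCHAR1> (<NCCHAR1> | "." | "-" | ["0"-"9"] | "\u00B7")*
NCNAME = rf"{NCCHAR1}(?:{NCCHAR1}|[.\-0-9\u00B7])*"

# [48] <QName> ::= (<NCNAME>)? ":" <NCNAME>
QNAME = re.compile(rf"(?:{NCNAME})?:{NCNAME}")


def is_qname(token: str) -> bool:
    """Return True if the whole token matches the partial QName production."""
    return QNAME.fullmatch(token) is not None


# "_" is a legal <NCNAME>, so the bnode-shaped token "_:A" is a legal QName.
print(is_qname("_:A"))
```

Under these assumptions is_qname("_:A") holds, because "_" on its own
satisfies <NCNAME> and can therefore serve as the prefix part of a QName.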

[...]

> > Huh?  How does this work?  Is this really going to be part of the SPARQL spec?
> > If so, it exposes a part of RDF that I had safely thought was hidden.  
> 
> Which part of RDF did you think was hidden?  

My reading of the RDF specification indicates that there is no difference
between

	_:a ex:b ex:c .

and

	_:z ex:b ex:c .

These are the ``same'' RDF graph, at least so far as RDF is concerned.
This means that the identity of bnodes is hidden in RDF.
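
To make the point concrete, here is a small sketch of "sameness up to bnode
renaming".  It is a brute-force check for illustration only, under the
assumption that a graph is a set of (subject, predicate, object) string
triples and that any term starting with "_:" is a bnode; the function names
are hypothetical and do not come from any RDF toolkit.

```python
from itertools import permutations


def is_bnode(term: str) -> bool:
    """Assumption of this sketch: bnode labels are the terms starting with '_:'."""
    return term.startswith("_:")


def bnodes(graph):
    """Collect the distinct bnode labels used anywhere in the graph."""
    return sorted({t for triple in graph for t in triple if is_bnode(t)})


def equivalent(g1, g2) -> bool:
    """True if some one-to-one renaming of bnode labels turns g1 into g2."""
    g1, g2 = set(g1), set(g2)
    b1, b2 = bnodes(g1), bnodes(g2)
    if len(b1) != len(b2) or len(g1) != len(g2):
        return False
    # Brute force: try every bijection between the two sets of labels.
    for perm in permutations(b2):
        mapping = dict(zip(b1, perm))
        renamed = {tuple(mapping.get(t, t) for t in triple) for triple in g1}
        if renamed == g2:
            return True
    return False


g_a = {("_:a", "ex:b", "ex:c")}
g_z = {("_:z", "ex:b", "ex:c")}
print(equivalent(g_a, g_z))
```

The two one-triple graphs above compare as equivalent, since renaming _:a to
_:z maps one onto the other; the label itself carries no information.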

> Many RDF toolkits do allow 
> access to bNodes - for example, the ability to add properties when 
> creating an RDF graph.

Well then they are going beyond the RDF specification.  That is their
prerogative, of course.

> It's not a matter for DAWG to define how RDF APIs work.  

Agreed.  

> When used 
> remotely, SPARQL queries are serialized and results come back in encoded 
> form and there is no mechanism for maintaining bNodes across the network - 
> just a way to give a document scoped id so that within the document, 
> bNodes can be differentiated.

Agreed, except that according to my reading of the working draft bnodes are
allowed in the syntax of the language, which means that they can somehow be
transmitted across the network.

> If used locally, how the implementation returns results from a query is an 
> implementation decision and is not going to be defined by DAWG.  Some 
> systems will return whatever graph object the query happens to find - then 
> this object can be used for further (non-query) API operations such as 
> adding properties.

Sure, but this would be a non-sanctioned extension.

> Example of local use might be:
> 
> results = queryExecute(
>          "SELECT ?person WHERE ( ?person foaf:mbox <mailto:joe> )") ;
> for ( solution in results)
> {
>     x = solution.get("person") ;
>     x.addProperty(foaf.name,"Joe") ;
> }
> 
> the return type of solution.get will be whatever the RDF toolkit chooses 
> to do about implementing the graph.
> 
> It also means that query structures could be created that do involve 
> bNodes - this can't be done in the syntax of the language 

See above.

> but if the 
> abstract syntax tree is constructed programmatically, then local objects 
> might be included - toolkit implementation decision and not to do with 
> DAWG.  

Sure, but, again, this would be a non-sanctioned extension.

> Making TriplePatterns more general (including bNodes) than the 
> syntax allows, is just a way of recognizing this direct use of query on a 
> local RDF graph.

I do not think that this is a legitimate approach to take.  In my view you
are sanctioning something that should be a non-sanctioned extension.

> I suspect we have different underlying views on how RDF applications are 
> going to be constructed.  My hope is that SPARQL is neutral to that - if 
> you see some approaches being made impossible, or difficult, then please 
> let us know.

Hmm.  Well if SPARQL does indeed not allow bnodes, then it may indeed be
neutral.  However, the working draft is definitely not neutral - for
repeatable results to queries using bnodes it requires RDF stores to
maintain the identity of these bnodes, which I do not believe that an RDF
store need do.
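
A toy illustration of the repeatability concern follows.  It rests on an
assumption the working draft does not clearly state, namely that a bnode in
a triple pattern is matched against the stored label like a constant.  The
matcher and the two "loads" of the example graph are hypothetical; the
second load stands for a store that is free to relabel bnodes.

```python
def match(pattern, graph):
    """Return one binding per matching triple.

    Terms starting with '?' are variables.  Every other term, including a
    bnode label, must equal the stored term exactly (assumption of this sketch).
    """
    results = []
    for triple in graph:
        binding = {}
        for p, t in zip(pattern, triple):
            if p.startswith("?"):
                # A repeated variable must bind to the same term each time.
                if binding.setdefault(p, t) != t:
                    break
            elif p != t:
                break
        else:
            results.append(binding)
    return results


# The same RDF graph, loaded twice by a store that does not keep bnode labels.
first_load = [("ex:a", "ex:r", "_:a"), ("ex:a", "ex:r", "_:b")]
second_load = [("ex:a", "ex:r", "_:g1"), ("ex:a", "ex:r", "_:g2")]

pattern = ("?x", "ex:r", "_:a")
print(match(pattern, first_load))
print(match(pattern, second_load))
```

Under that assumption the same query gives one answer against the first
load and none against the second, even though RDF regards the two loads as
the same graph.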

peter

Received on Sunday, 7 November 2004 23:19:31 UTC