Re: Blank node identifiers in FILTER clauses from Enrico Franconi on 2006-07-14 (public-rdf-dawg@w3.org from July to September 2006)

From: Enrico Franconi <franconi@inf.unibz.it>
Date: Fri, 14 Jul 2006 02:48:09 +0100
To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <95D32FF1-0D83-4196-BCCE-65C7E089FF74@inf.unibz.it>

On 5 Jul 2006, at 14:55, Seaborne, Andy wrote:

>> For example, consider the data set with these three triples:
>> <s1> <v> <o1> .
>> <s2> <v> <o2a> .
>> <s2> <v> <o2b> .
>> The user wants to find those subjects which are related via the
>> verb <v> to at least two objects.  The desired solution
>> sequence is { <s2> }.  The user writes his query this way:
>> SELECT ?x
>> WHERE { ?x <v> _:a . ?x <v> _:b . FILTER (_:a != _:b) }
>
> <o2a> and <o2b> may be names for the same object in the domain of
> discourse.
> In general, it isn't possible to conclude anything about numbers of
> things in RDF.  It is in OWL.

Indeed. The example as written will never work. Even if we extend the
semantics of BGPs to handle the inequality builtin predicate (solution 1),
distinct URIs do not necessarily denote different resources, so the query
will always return the empty set.
The example would work if you used literals instead of <o2a> and <o2b>,
with the builtin predicates defined over the domain of the literals.
Builtin predicates over URIs will never be meaningful, I guess. (Equality
would be meaningful, but trivial.)
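
To make the point above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the semantics from any SPARQL draft: terms are modelled as hypothetical (kind, value) pairs, and the helper provably_different stands in for "inequality is entailed". It assumes that two distinct URIs may name the same resource, and that two literals of the same datatype with different values are different.

```python
# Hypothetical model for illustration only: a term is a (kind, value) pair.
def uri(name):
    return ("uri", name)

def lit(value):
    return ("literal", value)

def provably_different(a, b):
    """Return True only when the two terms must denote different things."""
    # Assumption: a URI may denote anything, so inequality never follows
    # as soon as one side is a URI.
    if a[0] == "uri" or b[0] == "uri":
        return False
    # Assumption: both literals share a datatype, so different values
    # mean different things.
    return a[1] != b[1]

def subjects_with_two_objects(triples, verb):
    """Subjects related via `verb` to at least two provably different objects."""
    found = set()
    for s1, p1, o1 in triples:
        for s2, p2, o2 in triples:
            if s1 == s2 and p1 == p2 == verb and provably_different(o1, o2):
                found.add(s1)
    return found

v = uri("v")

# The original data set: every object is a URI.
uri_data = [
    (uri("s1"), v, uri("o1")),
    (uri("s2"), v, uri("o2a")),
    (uri("s2"), v, uri("o2b")),
]

# The same shape with integer literals in place of <o1>, <o2a>, <o2b>.
literal_data = [
    (uri("s1"), v, lit(1)),
    (uri("s2"), v, lit(2)),
    (uri("s2"), v, lit(3)),
]

uri_result = subjects_with_two_objects(uri_data, v)
literal_result = subjects_with_two_objects(literal_data, v)
```

Under these assumptions uri_result is empty, while literal_result contains only the term for <s2>, which is the behaviour described above.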

> I find the alternative of relying on the presence or absence of a
> named variable in the SELECT clause a very confusing way of going
> about it - one part of the syntax indirectly affects another part of
> the query.  It also does not extend to queries with more than one BGP
> in them.

Agree.

>> I see four possible resolutions:
>>
>> 1. (My preference) the scope of a blank node identifier is
>> an entire FilteredBasicGraphPattern, not just a basic graph
>> pattern.  To do this, we need to extend the definitions in
>> section 2.5 so that they define the solutions of a
>> FilteredBasicGraphPattern rather than just the solutions of a
>> basic graph pattern.  I can see how to do this with the
>> simple entailment mapping definition; I don't see how to do
>> this with the general E-entailment definition.
>
> My preference as well.

This amounts to extending RDF entailment to handle builtin predicates
over the domains of the literals. I guess that, while this can be
defined semantically, it will lead to many complexity issues due to
the interplay between variables, bnodes (in the binding of
variables), literals (in the binding of variables), and builtin
predicates.

> I would remove the possibility of blank nodes (and general  
> expressions) in the functions isIRI/isLiteral/isBlank, restricting  
> them to named variables only, because these really work on the  
> terms of the bindings, not the values.

This makes sense to me.

> I would like to see a proposal for (1) from one or more of the
> original contributors of the current text (Enrico, Bijan, Pat).

Mmmh. It seems to me that this is really about adding to RDF a bunch  
of (typed?) builtin predicates as properties (such as "=", "<>", ">"  
over the integers, for example), and deciding entailment of such  
enriched RDF graphs from standard RDF graphs. Mmmh...

>> 2. We prohibit blank node identifiers in FILTER clauses as
>> inherently meaningless or deceptive syntax.
>
> OK - but less of a preference.  For me, this is a fall-back from
> (1) that we can choose if we do not manage to get agreement around (1).

Actually, an easy way out for lazy people like me... :-)

cheers
--e.

Received on Friday, 14 July 2006 01:49:51 UTC