Re: [TF-ENT] Querying datasets with default plus named graphs from Birte Glimm on 2009-10-12 (public-rdf-dawg@w3.org from October to December 2009)

From: Birte Glimm <birte.glimm@comlab.ox.ac.uk>
Date: Mon, 12 Oct 2009 11:41:12 +0100
To: Ivan Herman <ivan@w3.org>
Cc: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <492f2b0b0910120341p3c483e42s2e0b619c84bd3ee1@mail.gmail.com>
[snip]
> Just for my understanding, based on your latest text (thanks for having
> added it, b.t.w.!)... if I have
>
> <A> standing for the graph
> :p rdfs:range :AA
>
> <B> standing for the graph
> :p rdfs:domain :BB
>
> <C> standing for the graph
> :x :p :y
>
> then the query:
>
> SELECT ?g
> FROM NAMED <A>
> FROM NAMED <B>
> FROM <C>
> WHERE {
>   GRAPH ?g { :y a ?type }
> }
>
> will return ?g-><A>, right?

I would not say so. In this query you have not merged the data from
the three graphs that you consider. GRAPH ?g ranges over the named
graphs only, so the query tries <A> and <B> in turn and looks for a
binding for the variables in each graph without considering the data
from the other graphs. I.e., you can first try ?g-><A>, but <A> alone
does not provide a binding for ?type, and entailment cannot do much if
you have only the triple :p rdfs:range :AA. Then you go on to ?g-><B>,
but again, the triples from <B> alone do not give a binding for ?type.
The default graph <C> is not matched by the GRAPH pattern at all, and
on its own it would not give any type information either, so there is
no answer for the query.
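
To make the per-graph evaluation concrete, here is a minimal sketch in
Python. It is a toy model and not a SPARQL engine: graphs are plain sets
of triples, and the helper names (rdfs_types, types_of) are hypothetical.
It only covers the rdfs:range and rdfs:domain rules, which is all the
example needs.

```python
# Toy model (an assumption for illustration): a graph is a set of
# (subject, predicate, object) triples written as plain strings.
RDF_TYPE, RANGE, DOMAIN = "rdf:type", "rdfs:range", "rdfs:domain"

def rdfs_types(graph):
    """Type triples entailed by rdfs:range / rdfs:domain within ONE graph."""
    inferred = set()
    for (p, rel, cls) in graph:
        for (s, q, o) in graph:
            if q != p:
                continue
            if rel == RANGE:
                inferred.add((o, RDF_TYPE, cls))
            elif rel == DOMAIN:
                inferred.add((s, RDF_TYPE, cls))
    return inferred

def types_of(node, graph):
    """All types of `node` that are stated or entailed in this graph alone."""
    stated = {o for (s, p, o) in graph if s == node and p == RDF_TYPE}
    entailed = {o for (s, p, o) in rdfs_types(graph) if s == node}
    return stated | entailed

named = {
    "<A>": {(":p", RANGE, ":AA")},    # FROM NAMED <A>
    "<B>": {(":p", DOMAIN, ":BB")},   # FROM NAMED <B>
}
default = {(":x", ":p", ":y")}        # FROM <C>

# GRAPH ?g { :y a ?type } : every named graph is matched in isolation.
answers = [(g, t) for g, triples in named.items()
           for t in types_of(":y", triples)]
print(answers)  # []
```

No graph on its own contains both the schema triple and the instance
triple, so no rule ever fires and the answer list stays empty.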

> (b.t.w., I think the example on the wiki is
> wrong, the first (negative) example should say :y a ?type, shouldn't it?)

I wrongly swapped range and domain. Thanks for pointing that out.

> A practical example may then be another type of request which is
>
> SELECT ?type
> FROM NAMED <A>
> FROM NAMED <B>
> FROM <C>
> WHERE {
>   GRAPH <A> { :y a ?type }
> }
>
> which, essentially, specifies a specific vocabulary for a portion of the
> query, right? (It may be worth adding this example to the text, too, to
> make the situation clearer.)

Again no answer, I would say, because FROM NAMED <B> and FROM <C> have
no real effect here (but they would not give an answer anyway, as
argued above). The only way you can get your entailment would be

SELECT ?type
 FROM <A>
 FROM <B>
 FROM <C>
 WHERE {
    :y a ?type .
 }
Here you merge all data from A, B, and C into a new default graph and
that gives the answer (?type->:AA).
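
The effect of merging can be sketched the same way in Python. Again
this is a toy model under stated assumptions: graphs are sets of string
triples, the helper names are hypothetical, and only the rdfs:range and
rdfs:domain rules are applied.

```python
# Toy model (an assumption for illustration): a graph is a set of
# (subject, predicate, object) triples written as plain strings.
RDF_TYPE, RANGE, DOMAIN = "rdf:type", "rdfs:range", "rdfs:domain"

def rdfs_types(graph):
    """Type triples entailed by rdfs:range / rdfs:domain within ONE graph."""
    inferred = set()
    for (p, rel, cls) in graph:
        for (s, q, o) in graph:
            if q != p:
                continue
            if rel == RANGE:
                inferred.add((o, RDF_TYPE, cls))
            elif rel == DOMAIN:
                inferred.add((s, RDF_TYPE, cls))
    return inferred

graph_a = {(":p", RANGE, ":AA")}
graph_b = {(":p", DOMAIN, ":BB")}
graph_c = {(":x", ":p", ":y")}

# FROM <A> FROM <B> FROM <C>: the default graph is the merge of all three.
merged_default = graph_a | graph_b | graph_c

# :y a ?type, evaluated against the merged default graph.
y_types = {o for (s, p, o) in rdfs_types(merged_default) if s == ":y"}
print(y_types)  # {':AA'}
```

Once schema and instance triples sit in the same graph, the range rule
fires and :y gets the type :AA (and, by the domain rule, :x gets :BB).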

> B.t.w. a practical consequence (if all this is true) is that the user
> will have to specify explicitly all the vocabularies they use in terms of
> FROM or FROM NAMED clauses to get the right entailments. Which is an
> unfortunate duplication of the PREFIX clauses. I.e., one will have to write

You might not need prefixes for everything. You only need prefixes
for the things that you mention in the query, e.g., in the above query
you would only need a prefix for :y. But for the example that you give
below, you are right. Maybe one can add something like AS pre or
prefix pre to the FROM clause?
SELECT *
  FROM <URI-FOR-DC> ** AS dc ** or ** prefix dc **
  WHERE {
    ... something that involves an RDFS entailment with dc:
  }
That would be an extension to the query language itself, so I don't
know whether the group would want to consider such extras.

> PREFIX dc: <URI-FOR-DC>
> SELECT *
> FROM <URI-FOR-DC>
> WHERE {
>   ... something that involves an RDFS entailment with dc:
> }
>
> My reference to owl:imports in my earlier mails is that this may
> become easier when using OWL, because one can prepare one RDF file that
> says
>
> <> owl:imports <URI-FOR-DC> .
> <> owl:imports <URI-FOR-SOMETHING-ELSE> .
>
> and make a single FROM in the query on that file; OWL entailment may
> process the owl:imports statements before computing the entailments. Somewhat
> simpler for the users.

It should process the imports beforehand, because you have to load all
axioms from the imports and consider them for entailments in OWL.
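
As a rough illustration of "load all axioms from the imports first",
here is a minimal Python sketch of a transitive import closure. The
document table, its "imports"/"axioms" keys, the axiom strings, and the
function name import_closure are all hypothetical; a real OWL system
would fetch and parse the imported documents instead.

```python
def import_closure(root, documents):
    """Union of the axioms of `root` and of everything it transitively imports."""
    seen, stack, axioms = set(), [root], set()
    while stack:
        uri = stack.pop()
        if uri in seen:          # guards against cyclic imports
            continue
        seen.add(uri)
        doc = documents[uri]
        axioms |= doc["axioms"]
        stack.extend(doc["imports"])
    return axioms

# Hypothetical documents: the queried file only contains import statements.
documents = {
    "<my-file>": {"imports": ["<URI-FOR-DC>", "<URI-FOR-SOMETHING-ELSE>"],
                  "axioms": set()},
    "<URI-FOR-DC>": {"imports": [], "axioms": {"axiom-1", "axiom-2"}},
    "<URI-FOR-SOMETHING-ELSE>": {"imports": ["<URI-FOR-DC>"],
                                 "axioms": {"axiom-3"}},
}

all_axioms = import_closure("<my-file>", documents)
print(sorted(all_axioms))  # ['axiom-1', 'axiom-2', 'axiom-3']
```

Entailment would then be computed over the collected axioms, so a single
FROM on the importing file is enough from the user's point of view.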

> (Note that OWL 2 RL does not define owl:imports as one of its accepted
> terms, although OWL 2 Full does. I wonder whether this is not a simple
> extension of OWL 2 RL that we should allow... Not sure...)

also not sure...

Birte

> Cheers
>
> Ivan
>
>
>
> Birte Glimm wrote:
>> [snip]
>>>> As I understand it, FROM NAMED can be used to access graphs in the
>>>> data set of the query processor. You can do merges into a fresh
>>>> default graph. Even though this might not be the nicest thing, in
>>>> particular for some entailment regimes, this is something that needs
>>>> to be addressed in the SPARQL query document. The requirement might
>>>> come from entailment regimes, but entailment regimes are based on
>>>> SPARQL and if SPARQL does not define it, then we cannot use it. I
>>>> personally do not want to raise an issue and a request for that, but
>>>> if others feel like doing it...
>>> I must say I am a little bit mixed up here, maybe you can help... We discussed the
>>> issue of restricting entailments to specific graphs when those graphs are defined through
>>> the named graph mechanism of SPARQL. But I am now confused about how the FROM NAMED and
>>> the GRAPH statements would exactly influence entailment, i.e., when is anything
>>> restricted. Could you try to summarize this for a better understanding? Maybe this is
>>> where my confusion comes from... but I am lost a bit:-(
>>
>> I added a section on this into the entailment regimes doc:
>> http://www.w3.org/2009/sparql/wiki/Design:EntailmentRegimes#Entailment_Regimes_and_Data_Sets
>> but I have the impression that it will not answer your question.
>> Basically, triples in one graph of the data set do not have any
>> influence on any other graph in the data set. For a system supporting
>> RDFS entailment, for example, you could take the triples from one RDF
>> document, load them into graph A, build a partial RDFS closure (using
>> the ter Horst algorithm) and answer queries by using simple entailment
>> on the partial RDFS closure. Now if you additionally load the triples
>> from another RDF document into graph B, then this has no influence on
>> graph A, so even if graph A contains
>> :a rdf:type :B . (inferred or stated in the originally loaded document)
>> and the document loaded into graph B contains
>> :B rdfs:subClassOf :C
>> you cannot use this to get
>> :a rdf:type :C .
>> as a query answer from graph A. The triples in one graph are not
>> visible in another graph.
>> I am not quite sure what you mean with "restricting entailments to
>> specific graphs". Do you have in mind that a query processor provides
>> a certain data set description, say with some default graph, graph A,
>> and graph B, and one of the named graphs, say A, is for queries with
>> RDFS entailment, while the other one (B) is for queries with simple
>> entailment?
>> At the moment that would not be possible in my understanding and, in
>> general, the ways of choosing which entailment regime you want seem
>> not very flexible (but I might be overlooking something). Let us assume you
>> have a query processor that can do simple, RDF, and RDFS entailment
>> (not too unreasonable I think). As I understand it, that would mean
>> that you can have three endpoints, one for each entailment regime and
>> depending on which endpoint I choose when I query, I get one of the
>> three entailment regimes and I can ask that endpoint via service
>> descriptions what data sets it has etc. What we cannot do at the
>> moment (if I understand it correctly) is to mix entailment regimes in
>> one endpoint, so you cannot say that your query should contain results
>> from graph A under RDFS entailment unioned/joined with results from
>> graph B under simple entailment. There is
>> no way to specify that in the query and there is no way for an
>> endpoint to communicate that it will use simple entailment for some
>> data set and RDF(S) for another.
>> Provided I get that right, I am not sure how much of an issue that is.
>> I can live with it, but that is my personal opinion.
>>
>> For OWL I can see just what you mention above as something that needs
>> to be addressed, i.e., how can users query for things that are not
>> entailed, but are stated in the ontology and that are important to
>> users (annotations most notably, but imports also fall into this
>> category). If we allow some way of specifying in a query that some
>> part of the query has to be evaluated under one entailment regime and
>> other parts of the query under other regimes, that is fine. Then you
>> can use simple entailment for annotations and OWL or whatever for the
>> rest. If we do not want to go that way, we could also define OWL
>> entailment in a way that does not employ OWL semantics to annotation
>> queries. That is not as nice in my opinion, but it would be a
>> workaround that does not require changes in other specs.
>>
>> Birte
>>
>>> [snip]
>>>>> And what you say is perfectly o.k. in view of the RIF specification.
>>>>> However: in SPARQL, FROM and FROM NAMED are defined  to specify RDF
>>>>> datasets. OWL and RDFS are (or can be expressed in) RDF. RIF rules cannot.
>>>>>
>>>>> That actually may create problems for OWL, too. There is no problem if the
>>>>> OWL ontology in the FROM clause is in RDF. But would the spec allow referring
>>>>> to OWL ontologies in functional and/or Manchester syntax via the FROM or
>>>>> FROM NAMED clauses?
>>>> Question to the SPARQL implementors/experts. Can I specify my RDF data
>>>> in Turtle and query that in accordance with the spec? If not in
>>>> accordance with the spec, do systems support Turtle input?
>>>> If yes, then I cannot see why not functional or Manchester syntax.
>>>> This is obviously not normative. Any system might reject non-RDF/XML
>>>> input, but many systems might happily take it.
>>>> If not even Turtle is allowed, are there any plans for doing that as
>>>> an optional syntax? If not, I guess we have to live with RDF/XML. That
>>>> would probably be the end for RIF though; for OWL, RDF/XML is normative
>>>> and any conformant system must support it anyway, so it is not as bad
>>>> for OWL.
>>>>
>>> Hm (again:-). Yes, you are actually right, I am not sure the spec says anything. My
>>> impression is that the spec is silent at that point and a URI to a graph may refer to
>>> any format that the processor understands. If that is so, we may not have a problem with
>>> OWL if the processor understands non-RDF/XML formats. Maybe it is worth adding this to a
>>> possible service description, though.
>>>
>>> But it is certainly a problem with RIF. Indeed, Turtle may not be a standard format but
>>> it is an RDF serialization syntax. In this sense, both the OWL 2 functional syntax and
>>> the Manchester syntax can be considered as RDF serialization syntaxes, because they can be
>>> converted, in a standard way, to RDF. But a RIF rule set _cannot_:-(
>>>
>>> Thanks
>>>
>>> Ivan
>>>
>>>>> I would expect we should be able to do that, but that might affect the query
>>>>> language specification.
>>>> Again, that is up to the general SPARQL/Query spec and whoever wants to
>>>> raise an issue for that can do so.
>>>>
>>>> Birte
>>>>
>>>>> I remember Axel and I had some corridor chat at some point that would allow
>>>>> adding a media type to the FROM (NAMED) clause...
>>>>>
>>>>> Ivan
>>>>>
>>>>>> Birte
>>>>>>
>>>>>>> Ivan
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Ivan Herman, W3C Semantic Web Activity Lead
>>>>>>> Home: http://www.w3.org/People/Ivan/
>>>>>>> mobile: +31-641044153
>>>>>>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>>>>>>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>>> --
>>>> Dr. Birte Glimm, Room 306
>>>> Computing Laboratory
>>>> Parks Road
>>>> Oxford
>>>> OX1 3QD
>>>> United Kingdom
>>>> +44 (0)1865 283529
>>>>
>>>
>>> --
>>> Ivan Herman, W3C Semantic Web Activity Lead
>>> URL: http://www.w3.org/People/Ivan/
>>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>>
>>>
>>
>>
>>
>



-- 
Dr. Birte Glimm, Room 306
Computing Laboratory
Parks Road
Oxford
OX1 3QD
United Kingdom
+44 (0)1865 283529
Received on Monday, 12 October 2009 10:41:48 UTC