Re: [TF-ENT] Querying datasets with default plus named graphs from Ivan Herman on 2009-10-12 (public-rdf-dawg@w3.org from October to December 2009)

From: Ivan Herman <ivan@w3.org>
Date: Mon, 12 Oct 2009 13:05:51 +0200
To: Birte Glimm <birte.glimm@comlab.ox.ac.uk>
CC: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <4AD30D8F.6070007@w3.org>

Birte Glimm wrote:
> [snip]
>> Just for my understanding, based on your latest text (thanks for having
>> added it, b.t.w.!)... if I have
>>
>> <A> standing for the graph
>> :p rdfs:range :AA
>>
>> <B> standing for the graph
>> :p rdfs:domain :BB
>>
>> <C> standing for the graph
>> :x :p :y
>>
>> then the query:
>>
>> SELECT ?g
>> FROM NAMED <A>
>> FROM NAMED <B>
>> FROM <C>
>> WHERE {
>>   GRAPH ?g { :y a ?type }
>> }
>>
>> will return ?g-><A>, right?
> 
> I would not say so. In this query you have not merged the data from
> the three graphs that you consider. Your query will go through all
> three graphs (the 2 named and the default graph) and try to find a
> binding for the variables in each graph without considering the data
> from the other graphs. I.e., you can first try ?g-><A>, but <A> alone
> does not provide a binding for ?type and entailment cannot do much if
> you have only the triple :p rdfs:range :AA. Then you go on to ?g-><B>,
> but again, the triples from <B> alone do not give a binding for ?type.
> Now you try the default graph, but that alone does not give any type
> information either, so there is no answer for the query.
>

Ah. Well, my mental model is clearly wrong... I will have to go back to
the SPARQL spec to understand how the interaction of named and unnamed
graphs works in this sense.
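
To make the per-graph evaluation above concrete, here is a minimal Python
sketch. It is a toy model and not a SPARQL engine: a graph is a set of
triples, terms are plain strings, and the only entailment rules applied are
rdfs:domain and rdfs:range. The names rdfs_closure and types_of are
assumptions made up for this illustration. Note that in SPARQL, GRAPH ?g
ranges over the named graphs of the dataset; the default graph <C> can be
checked on its own with the same helper.

```python
# Toy model: a graph is a set of (subject, predicate, object) triples.
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"


def rdfs_closure(graph):
    """Add the rdf:type triples that follow from rdfs:domain and rdfs:range.

    Only these two rules are applied; this is not a full RDFS closure.
    """
    closed = set(graph)
    changed = True
    while changed:
        changed = False
        for (prop, rel, cls) in list(closed):
            if rel not in (RDFS_DOMAIN, RDFS_RANGE):
                continue
            for (s, p, o) in list(closed):
                if p != prop:
                    continue
                # domain types the subject, range types the object
                node = s if rel == RDFS_DOMAIN else o
                inferred = (node, RDF_TYPE, cls)
                if inferred not in closed:
                    closed.add(inferred)
                    changed = True
    return closed


def types_of(node, graph):
    """Answer the pattern { node a ?type } against a single graph."""
    return {o for (s, p, o) in rdfs_closure(graph)
            if s == node and p == RDF_TYPE}


graph_a = {(":p", RDFS_RANGE, ":AA")}
graph_b = {(":p", RDFS_DOMAIN, ":BB")}
graph_c = {(":x", ":p", ":y")}

# GRAPH ?g { :y a ?type }: each named graph is matched in isolation.
named = {"<A>": graph_a, "<B>": graph_b}
per_graph = {name: types_of(":y", g) for name, g in named.items()}
```

Under these assumptions per_graph maps both <A> and <B> to an empty set, and
types_of(":y", graph_c) is empty as well, because no single graph holds both
the :x :p :y triple and the range statement.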

[snip]

>> A practical example may then be another type of request which is
>>
>> SELECT ?type
>> FROM NAMED <A>
>> FROM NAMED <B>
>> FROM <C>
>> WHERE {
>>   GRAPH <A> { :y a ?type }
>> }
>>
>> which, essentially, specifies a specific vocabulary for a portion of the
>> query, right? (It may be worth adding this example to the text, too, to
>> make the situation clearer.)
> 
> Again no answer, I would say, because FROM NAMED <B> and FROM <C> do not
> really have an effect now (but they would not give an answer anyway, as
> argued above). The only way you can get your entailment would be
> 
> SELECT ?type
>  FROM <A>
>  FROM <B>
>  FROM <C>
>  WHERE {
>     :y a ?type .
>  }
> Here you merge all data from A, B, and C into a new default graph and
> that gives the answer (?type->:AA).
>

Yeah, this is the effect of the same faulty model in my mind...
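
A minimal Python sketch of the merged case, under the same kind of toy
assumptions: graphs are sets of string triples, the merge is a plain set
union (there are no blank nodes to rename in this example), and only the
rdfs:domain and rdfs:range rules are applied in a single pass. The function
names infer_types and select_types are hypothetical.

```python
# Toy model: graphs are sets of (subject, predicate, object) triples.
def infer_types(graph):
    """Return the rdf:type triples implied by rdfs:domain and rdfs:range.

    Single pass only, which is enough for this example.
    """
    inferred = set()
    for (prop, rel, cls) in graph:
        if rel == "rdfs:domain":
            inferred |= {(s, "rdf:type", cls) for (s, p, o) in graph if p == prop}
        elif rel == "rdfs:range":
            inferred |= {(o, "rdf:type", cls) for (s, p, o) in graph if p == prop}
    return inferred


def select_types(node, default_graph):
    """Answer { node a ?type } against the default graph under entailment."""
    entailed = default_graph | infer_types(default_graph)
    return {o for (s, p, o) in entailed if s == node and p == "rdf:type"}


graph_a = {(":p", "rdfs:range", ":AA")}
graph_b = {(":p", "rdfs:domain", ":BB")}
graph_c = {(":x", ":p", ":y")}

# FROM <A> FROM <B> FROM <C>: the default graph is the merge of all three.
merged = graph_a | graph_b | graph_c

print(select_types(":y", merged))  # {':AA'}
print(select_types(":x", merged))  # {':BB'}
```

Only once the instance triple and the vocabulary triples sit in the same
graph does the range rule have anything to work on.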

>> B.t.w. a practical consequence (if all this is true) is that the user
>> will have to specify explicitly all the vocabularies it uses in terms of
>> FROM or FROM NAMED clauses to get the right entailments. Which is an
>> unfortunate duplication of the PREFIX clauses. I.e., one will have to write
> 
> You might not need prefixes for everything. Prefixes you will only need
> for the things that you mention in the query, e.g., in the above query
> you would only need a prefix for :y. But for the example that you give
> below, you are right. Maybe one can add something like AS pre or
> prefix pre to the FROM clause?
> SELECT *
>   FROM <URI-FOR-DC> ** AS dc ** or ** prefix dc **
>   WHERE {
>     ... something that involves an RDFS entailment with dc:
>   }
> That would be an extension to the query language itself, so I don't
> know whether the group would want to consider such extras.
> 


That is some sort of a merge of the PREFIX and FROM. I am not sure we
should go down that road... We should simply accept that, in the case of
inferences, the PREFIX and FROM clauses might have to be duplicated...


>> PREFIX dc: <URI-FOR-DC>
>> SELECT *
>> FROM <URI-FOR-DC>
>> WHERE {
>>   ... something that involves an RDFS entailment with dc:
>> }
>>
>> My reference to owl:imports in my earlier mails is that this may
>> become easier when using OWL, because one can prepare one RDF file that
>> says
>>
>> <> owl:imports <URI-FOR-DC> .
>> <> owl:imports <URI-FOR-SOMETHING-ELSE> .
>>
>> and make a single FROM in the query on that file; OWL entailment may
>> process the owl:imports statements before computing the entailments.
>> Somewhat simpler for the users.
> 
> It should process the imports before because you have to load all
> axioms from the imports and consider them for entailments in OWL.

Yes, that is what I meant.
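
A minimal Python sketch of that "load the imports first" step, under toy
assumptions: documents is a hypothetical in-memory map from a document URI
to its set of triples (no network access), <vocab-list> stands for the file
with the two owl:imports statements, and the vocabulary contents are
placeholders borrowed from the earlier example, not real Dublin Core
statements.

```python
def load_with_imports(root, documents):
    """Merge the triples of `root` with those of everything it imports,
    following owl:imports transitively."""
    merged, seen, todo = set(), set(), [root]
    while todo:
        uri = todo.pop()
        if uri in seen:
            continue  # already loaded; also guards against import cycles
        seen.add(uri)
        triples = documents.get(uri, set())  # unknown documents count as empty
        merged |= triples
        todo.extend(o for (s, p, o) in triples if p == "owl:imports")
    return merged


# Placeholder documents; the vocabulary triples are made up for illustration.
documents = {
    "<vocab-list>": {
        ("<vocab-list>", "owl:imports", "<URI-FOR-DC>"),
        ("<vocab-list>", "owl:imports", "<URI-FOR-SOMETHING-ELSE>"),
    },
    "<URI-FOR-DC>": {(":p", "rdfs:range", ":AA")},
    "<URI-FOR-SOMETHING-ELSE>": {(":p", "rdfs:domain", ":BB")},
}

# A single FROM <vocab-list> would then see the axioms of both vocabularies.
default_graph = load_with_imports("<vocab-list>", documents)
```

The entailment step would run on default_graph only after this loading
phase, so the imported axioms are available to it.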

Ivan

> 
>> (Note that OWL 2 RL does not define owl:imports as one of its accepted
>> terms, although OWL 2 Full does. I wonder whether this is not a simple
>> extension of OWL 2 RL that we should allow... Not sure...)
> 
> also not sure...
> 
> Birte
> 
>> Cheers
>>
>> Ivan
>>
>>
>>
>> Birte Glimm wrote:
>>> [snip]
>>>>> As I understand it, FROM NAMED can be used to access graphs in the
>>>>> data set of the query processor. You can do merges into a fresh
>>>>> default graph. Even though this might not be the nicest thing, in
>>>>> particular for some entailment regimes, this is something that needs
>>>>> to be addressed in the SPARQL query document. The requirement might
>>>>> come from entailment regimes, but entailment regimes are based on
>>>>> SPARQL and if SPARQL does not define it, then we cannot use it. I
>>>>> personally do not want to raise an issue and a request for that, but
>>>>> if others feel like doing it...
>>>> I must say I am a little bit mixed up here, maybe you can help... We
>>>> discussed the issue of restricting entailments to specific graphs when
>>>> those graphs are defined through the named graph mechanism of SPARQL.
>>>> But I am now mixed up on how exactly the FROM NAMED and the GRAPH
>>>> statements would influence entailment, i.e., when anything is
>>>> restricted. Could you try to summarize this for a better
>>>> understanding? Maybe this is where my confusion comes from... but I am
>>>> a bit lost:-(
>>> I added a section on this into the entailment regimes doc:
>>> http://www.w3.org/2009/sparql/wiki/Design:EntailmentRegimes#Entailment_Regimes_and_Data_Sets
>>> but I have the impression that it will not answer your question.
>>> Basically, triples in one graph of the data set do not have any
>>> influence on any other graph in the data set. For a system supporting
>>> RDFS entailment, for example, you could take the triples from one RDF
>>> document, load it into graph A, build a partial RDFS closure (using
>>> the ter Horst algorithm) and answer queries by using simple entailment
>>> on the partial RDFS closure. Now if you additionally load the triples
>>> from another RDF document into graph B, then this has no influence on
>>> graph A, so even if graph A contains
>>> :a rdf:type :B . (inferred or stated in the originally loaded document)
>>> and the document loaded into graph B contains
>>> :B rdfs:subClassOf :C
>>> you cannot use this to get
>>> :a rdf:type :C .
>>> as a query answer from graph A. The triples in one graph are not
>>> visible in another graph.
>>> I am not quite sure what you mean by "restricting entailments to
>>> specific graphs". Do you have in mind that a query processor provides
>>> a certain data set description, say with some default graph, graph A,
>>> and graph B, and one of the named graphs, say A, is for queries with
>>> RDFS entailment, while the other one (B) is for queries with simple
>>> entailment?
>>> At the moment that would not be possible in my understanding and, in
>>> general, the ways of choosing what entailment regime you want seem
>>> not very flexible (but I might overlook something). Let us assume you
>>> have a query processor that can do simple, RDF, and RDFS entailment
>>> (not too unreasonable I think). As I understand it, that would mean
>>> that you can have three endpoints, one for each entailment regime and
>>> depending on which endpoint I choose when I query, I get one of the
>>> three entailment regimes and I can ask that endpoint via service
>>> descriptions what data sets it has etc. What we cannot do at the
>>> moment (if I understand it correctly) is to mix entailment regimes in
>>> one endpoint, so you cannot say that your query should contain results
>>> from graph A under RDFS entailment unioned/joined with results from
>>> graph B under simple entailment. There is
>>> no way to specify that in the query and there is no way for an
>>> endpoint to communicate that it will use simple entailment for some
>>> data set and RDF(S) for another.
>>> Provided I get that right, I am not sure how much of an issue that is.
>>> I can live with it, but that is my personal opinion.
>>>
>>> For OWL I can see just what you mention above as something that needs
>>> to be addressed, i.e., how can users query for things that are not
>>> entailed, but are stated in the ontology and that are important to
>>> users (annotations most notably, but imports also fall into this
>>> category). If we allow some way of specifying in a query that some
>>> part of the query has to be evaluated under one entailment regime and
>>> other parts of the query under other regimes, that is fine. Then you
>>> can use simple entailment for annotations and OWL or whatever for the
>>> rest. If we do not want to go that way, we could also define OWL
>>> entailment in a way that does not employ OWL semantics to annotation
>>> queries. That is not as nice in my opinion, but it would be a
>>> workaround that does not require changes in other specs.
>>>
>>> Birte
>>>
>>>> [snip]
>>>>>> And what you say is perfectly o.k. in view of the RIF specification.
>>>>>> However: in SPARQL, FROM and FROM NAMED are defined  to specify RDF
>>>>>> datasets. OWL and RDFS are (or can be expressed in) RDF. RIF rules cannot.
>>>>>>
>>>>>> That actually may create problems for OWL, too. There is no problem if the
>>>>>> OWL ontology in the FROM clause is in RDF. But would the spec allow referring
>>>>>> to OWL ontologies in functional and/or Manchester syntax via the FROM or
>>>>>> FROM NAMED clauses?
>>>>> Question to the SPARQL implementors/experts. Can I specify my RDF data
>>>>> in Turtle and query that in accordance with the spec? If not in
>>>>> accordance with the spec, do systems support Turtle input?
>>>>> If yes, then I cannot see why not functional or Manchester syntax.
>>>>> This is obviously not normative. Any system might reject non-RDF/XML
>>>>> input, but many systems might happily take it.
>>>>> If not even Turtle is allowed, are there any plans for doing that as
>>>>> an optional syntax? If not, I guess we have to live with RDF/XML. That
>>>>> would probably be the end for RIF though; for OWL, RDF/XML is normative
>>>>> and any conformant system must support it anyway, so it is not as bad
>>>>> for OWL.
>>>>>
>>>> Hm (again:-). Yes, you are actually right, I am not sure the spec says
>>>> anything. My impression is that the spec is silent at that point and a
>>>> URI to a graph may refer to any format that the processor understands.
>>>> If that is so, we may not have a problem with OWL if the processor
>>>> understands non-RDF/XML formats. Maybe it is worth adding this to a
>>>> possible service description, though.
>>>>
>>>> But it is certainly a problem with RIF. Indeed, Turtle may not be a
>>>> standard format but it is an RDF serialization syntax. In this sense,
>>>> both the OWL 2 functional syntax and the Manchester syntax can be
>>>> considered as RDF serialization syntaxes, because they can be
>>>> converted, in a standard way, to RDF. But a RIF rule set _cannot_:-(
>>>>
>>>> Thanks
>>>>
>>>> Ivan
>>>>
>>>>>> I would expect we should be able to do that, but that might affect the query
>>>>>> language specification.
>>>>> Again, that is up to the general SPARQL/Query spec and whoever wants to
>>>>> raise an issue for that can do so.
>>>>>
>>>>> Birte
>>>>>
>>>>>> I remember Axel and I had some corridor chat at some point that would allow
>>>>>> adding a media type to the FROM (NAMED) clause...
>>>>>>
>>>>>> Ivan
>>>>>>
>>>>>>> Birte
>>>>>>>
>>>>>>>> Ivan
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Ivan Herman, W3C Semantic Web Activity Lead
>>>>>>>> Home: http://www.w3.org/People/Ivan/
>>>>>>>> mobile: +31-641044153
>>>>>>>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>>>>>>>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Dr. Birte Glimm, Room 306
>>>>> Computing Laboratory
>>>>> Parks Road
>>>>> Oxford
>>>>> OX1 3QD
>>>>> United Kingdom
>>>>> +44 (0)1865 283529
>>>>>
>>>>
>>>>
>>>
>>>
>>
> 
> 
> 

-- 

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Monday, 12 October 2009 11:06:20 UTC