- From: Andy Seaborne <andy.seaborne@talis.com>
- Date: Wed, 21 Jul 2010 19:27:10 +0100
- To: SPARQL Working Group <public-rdf-dawg@w3.org>
On 20/07/2010 10:45 PM, Kendall Clark wrote:
> I don't think this distinction is that useful (KR, plain SPARQL); but
> since we have customers & apps in both spaces, I'll say that this
> 'signal inference' thing is not a problem in practice. At least, we
> haven't ever encountered it in 5+ years.
>
> If the SD says 'no inference' at 5pm and you turn on inference at 6pm,
> then update the SD at 5:59pm. Seriously, this would just *never*
> happen in practice in my experience. Changing how the system works in
> this way is a configuration management issue & would be handled as
> such (i.e., an entry in a changelog for the new version, a blog post
> on the SaaS blog explaining the new version, etc).
>
> Of course, our experience is not universal, but I suggest that this is
> more of a corner than a core use case and we shouldn't do very much,
> if anything, at this relatively late date about this issue.
>
> Just update the SD before you change the service that it describes.
> Easy to do, simple to explain, good enough. :>
>
> Our two cents and probably worth at least 1 cent at this point. :>
>
> Cheers,
> Kendall
I'm not sure which message to "reply to", but Kendall's points bring
deployed experience into the picture, which I think is important here as
it calibrates whether there is a problem now or a potential issue later.
I see a difference between the case where the dataset is defined by
the service and the case where it is defined by the query or protocol
(client chosen).
If it is a fixed dataset for the service, the service description can
describe the graphs. It's a property of the graph, not of the access to
the graph via GRAPH nor even an option for the client. It's an offer
the client can choose to accept or not.
The offer can be multi-aspect. Different names can be given to the same
data with different entailment levels (which is especially useful
because the name is capturing the data and the process so you can talk
about "<g1> derived from <g2> by doing process <x>").
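As a rough illustration of giving a second name to a derived view of the same data (a sketch only: the toy triples, the `dataset` dict and the `rdfs_subclass_closure` helper are made up here, and the "process" is reduced to rdfs:subClassOf reasoning rather than any full entailment regime):

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def rdfs_subclass_closure(triples):
    """Return the triples plus those inferred through rdfs:subClassOf."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass = {(s, o) for s, p, o in inferred if p == SUBCLASS}
        new = set()
        for s, p, o in inferred:
            for sub, sup in subclass:
                if p == RDF_TYPE and o == sub:
                    new.add((s, RDF_TYPE, sup))   # instance of sub is instance of sup
                if p == SUBCLASS and o == sub:
                    new.add((s, SUBCLASS, sup))   # subClassOf is transitive
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred

# Hypothetical base data, published under the name <g2>.
g2 = {
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("foaf:Person", SUBCLASS, "foaf:Agent"),
}

# The service offers two names: <g1> is "<g2> with the closure applied".
dataset = {
    "<g2>": g2,
    "<g1>": rdfs_subclass_closure(g2),
}

def agents(graph_name):
    """Subjects explicitly typed foaf:Agent in the named graph."""
    return {s for s, p, o in dataset[graph_name]
            if p == RDF_TYPE and o == "foaf:Agent"}
```

Here the client picks the entailment level simply by picking the name: `agents("<g2>")` sees only the asserted triples, while `agents("<g1>")` also sees the inferred type for ex:alice.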
Sandro's examples use FROM NAMED, where the client is describing the RDF
dataset. It is the FROM and FROM NAMED that matter, not the GRAPH
access to the data, if we give different names to different views of the
same underlying data.
If data is described in FROM / FROM NAMED and loaded [*] then service
description could be applied through sd:feature, which has a domain of
service, but we haven't defined the details. It does let the client have
some control, though not as much as I think is behind Sandro's concerns;
the service could say "I apply OWL-DL to anything I see". (Naming is a
potential problem - but the default graph isn't named.)
In DAWG, FROM / FROM NAMED was just a description of the dataset, and
how the data was obtained was not defined. You could reasonably bind
<http://here/g1> to data from <http://there/g2> with expansion done by
the parser (Steve's point). It is none of SPARQL's business how the
dataset gets formed - it is a declaration. This is a bit black/white and the
situation is more complex - DAWG ducked the issue but it was discussed.
If it's the graph description to be modified, not the access, I would
expect:
    FROM <http://host/g1> USING RDFS NAMED <tag:mylocalname>
This is closer to the examples on
http://www.w3.org/2009/sparql/wiki/Feature:ParameterizedInference
e.g.
    SELECT ?X
    FROM <http://xmlns.com/foaf/spec/index.rdf>
    FROM <http://www.polleres.net/foaf.rdf>
    USING RULESET RDFS
    WHERE { ?X a foaf:Agent. }
The other area I have a problem with is the notion that it's just
entailment. Entailment (or rules) is just one process that can be
applied on loading. I don't see a clear dividing line with
client-supplied rules (i.e. inline premises), graph building, or data
cleaning and mangling stages and ETL. For example, to query the union
of graph G1 and G2 but without the data from G3 (yes - someone has asked
for this recently). It's process description that's needed; inference
is one example but just addressing that without seeing an existing
pain-point on the web worries me.
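To make the G1/G2/G3 request concrete, here is a minimal sketch that treats each graph as a plain set of triples. The `build_graph` helper and the sample triples are invented for illustration, and blank nodes are ignored, so this is not how any particular store does it:

```python
def build_graph(include, exclude=()):
    """Union the `include` graphs, then drop every triple found in an `exclude` graph."""
    merged = set().union(*include)
    for g in exclude:
        merged -= g
    return merged

# Hypothetical contents of the three graphs.
g1 = {("ex:a", "ex:p", "ex:b")}
g2 = {("ex:b", "ex:p", "ex:c"), ("ex:c", "ex:p", "ex:d")}
g3 = {("ex:b", "ex:p", "ex:c")}

# "The union of G1 and G2 but without the data from G3."
result = build_graph([g1, g2], exclude=[g3])
```

The point is that this is a dataset-building step in the same sense that applying RDFS is: both are processes run before the pattern matching starts.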
Andy
[*] The FROM/FROM NAMED clauses are for the description of the dataset and
do not necessarily imply loading from the web. Some systems (TDB; Glitter, I
believe) pick them out of a pool of available graphs but let's just
think of that as a web cache if the names are truly unique and global.
Received on Wednesday, 21 July 2010 18:27:39 UTC