- From: Sandro Hawke <sandro@w3.org>
- Date: Fri, 07 Oct 2011 12:52:13 -0400
- To: Richard Cyganiak <richard@cyganiak.de>
- Cc: RDF Working Group WG <public-rdf-wg@w3.org>
On Fri, 2011-10-07 at 16:57 +0100, Richard Cyganiak wrote:
> On 7 Oct 2011, at 15:07, Sandro Hawke wrote:
> >> The way I see it, the proposal is backwards. We want to specify an
> >> abstract, idealised information space – a collection of RDF graphs.
> >> How to access that information space (dereferencing, SPARQL, N-Quads
> >> dumps, etc) is an implementation detail that's subject to pragmatic
> >> decision, unreliable networks, fads and fashions, and so on. When
> >> specifying the abstract information space, you cannot define it in
> >> terms of its access implementation. Just define the abstract
> >> information space, and leave it to the market to decide on how to
> >> access it.
> >
> > You may well be right about this, but I'd like to see how far we can
> > push it and what we'd get from doing so.
>
> What's your reason for wanting to make this normative rather than just
> a declared good practice?

Because I want systems to be able to rely on it. I want people to be
able to write apps which refer to graphs (really g-boxes) by a single
URI, etc. When those apps are dealing with datasets -- via TriG,
SPARQL, whatever -- I want them to be able to assume that graph name is
still talking about the same g-box.

I agree we need a transition plan -- we can't just make datasets out
there have a different meaning by fiat. And I'm not sure exactly where
the line is between a "good practice" and a "W3C Recommendation". But
if there's something we think folks should be doing, in certain
circumstances, let's specify that behavior, and give people a way to
signal they are doing it, so others can start to build on it.

> I take it that a TriG file would be non-conforming if it contains a
> named graph that doesn't match what you get by dereferencing?
>
> Let's say I have a TriG file <x.trig>. Now let's assume a scenario A
> where it is conforming (the web matches its contents) and a scenario B
> where it's non-conforming (the web doesn't match its contents).
> What observable difference in the behaviour of software would you like
> to see?

If folks were using Web semantics for datasets, and if we can tame the
temporal validity issues, then consumers could use data that came in via
datasets. For instance, if sig.ma fetched that TriG document from
source <t>:

    <u> { <s> <p> <o>. }

then it wouldn't have to dereference <u>. It could just add <s> <p> <o>
tagged for trust/provenance as coming from the combination of sources
<u> and <t>.

This isn't the most compelling use case -- it's just pre-loading a web
cache -- but hopefully it answers your question.

If the site giving sig.ma that TriG document is conforming, then sig.ma
presents its users with the right data; if the site providing the TriG
document has some other notion of what datasets mean, or is buggy or
otherwise non-conforming, it could well result in sig.ma giving its
users bad data (even though everyone is being good).

So, a transition plan might be that we have two media types for TriG,
one for when you're using Web semantics and one for when you're not.
Sig.ma would only consume the datasets like this when Web semantics were
flagged as being used. This is pretty clumsy, but it would technically
work.

> For example, would a TriG parser generate the same RDF dataset in both
> cases or not?
>
> Would a SPARQL processor answer all queries in the same way or not?
>
> Would it entail the same additional triples/quads under the various
> levels of RDF/S and OWL entailment or not?

This issue is sort of higher-level than all that, and doesn't affect
that stuff.

> > I think we can make it a lot more crisp than AWWW.
>
> That sounds like TAG business to me.

I don't think anyone outside the RDF community cares how the names in
named graphs work. The TAG isn't going to solve this for us and isn't
going to mind if we solve it in a sensible way.

> >> The relationship between <u,G> in a named graph shouldn't be
> >> “dereferencing u yields G”.
> >> It should be “owner of u gets to say what's in G”, which already
> >> *is* the case per AWWW, so we don't actually need to say anything
> >> about that when specifying <u,G>.
> >
> > Can you say more about this? I don't understand. That seems even more
> > abstract than dereference.
>
> It says: “Good practice: don't squat in other people's namespaces.”
>
> > In practice, the owner of u gets to
> > control what happens when folks dereference it, but without dereference
> > I'm not sure the world cares who the "owner" is or really gives them any
> > special rights.
>
> That sounds completely wrong to me. In practice, the social convention
> of URI ownership is relied on in many places without assuming (or
> sometimes while forbidding) dereferencing: XML namespaces, URNs,
> microdata vocabulary URIs, …

Okay, I see what you mean. Socially, for human interpretation, yes the
URI owner is granted some rights. I think there is a community of
developers that scoffs at that idea, but that's a big digression.

So, when you said this:

    The relationship between <u,G> in a named graph shouldn't be
    “dereferencing u yields G”. It should be “owner of u gets to say
    what's in G”, which already *is* the case per AWWW, so we don't
    actually need to say anything about that when specifying <u,G>.

were you (1) arguing for a different way to frame Web Semantics for
Datasets or (2) arguing what the Semantics for Datasets in RDF should
be? I first thought it was 2, which seemed like a big change for you,
so now I think it was 1.

   -- Sandro

> Best,
> Richard
Received on Friday, 7 October 2011 16:52:22 UTC
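
The cache pre-loading scenario in the message above (a consumer such as
sig.ma trusting named graphs only when "Web semantics" are flagged) could
be sketched roughly as follows. This is an illustrative sketch, not
anything proposed in the thread: the media type string, the Cache class
and the consume_dataset function are all hypothetical names, and the
dataset is assumed to be already parsed into plain Python structures.

```python
from dataclasses import dataclass, field

# Hypothetical media type meaning "this TriG uses Web semantics".
# The message only suggests that two media types could exist; it names neither.
WEB_SEMANTICS_TRIG = "application/x-trig-web-semantics"


@dataclass
class Cache:
    # graph name -> (triples, provenance sources); layout is an assumption
    entries: dict = field(default_factory=dict)


def consume_dataset(cache, source, media_type, named_graphs):
    """Pre-load the cache from a dataset fetched from `source`.

    `named_graphs` maps a graph name <u> to an iterable of (s, p, o) triples.
    Returns the graph names that still have to be dereferenced.
    """
    to_dereference = []
    for name, triples in named_graphs.items():
        if media_type == WEB_SEMANTICS_TRIG:
            # Web semantics flagged: accept the triples without dereferencing
            # <u>, tagged as coming from the combination of <u> and <t>.
            cache.entries[name] = (frozenset(triples), (name, source))
        else:
            # No flag: the graph name cannot be assumed to match the Web,
            # so the consumer falls back to dereferencing it.
            to_dereference.append(name)
    return to_dereference
```

With the flag present, consume_dataset(cache, "t", WEB_SEMANTICS_TRIG,
{"u": [("s", "p", "o")]}) returns an empty list and stores the triple
under "u" with provenance ("u", "t"); with any other media type the
cache is left untouched and "u" is returned for dereferencing.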