Re: Comments on RDF Spaces document from Ivan Herman on 2012-05-28 (public-rdf-wg@w3.org from May 2012)

From: Ivan Herman <ivan@w3.org>
Date: Mon, 28 May 2012 14:11:04 +0200
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Sandro Hawke <sandro@w3.org>, RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <F69CEFF9-DA0C-4403-A490-FBCAC0C8DB0B@w3.org>
On May 28, 2012, at 10:58 , Richard Cyganiak wrote:

> Hi Ivan,
> 
> On 26 May 2012, at 16:19, Ivan Herman wrote:
>>> I strongly disagree with defining “space” based on the nature or characteristics of the thing identified. Strongly disagree with the whole “container” metaphor. The definitions mean that if I produce some triples by running some NLP on web pages, I'm not allowed to stick that into a SPARQL store using the web page URL as graph name. This is not acceptable to me.
>> 
>> I am not sure that your last objection is actually fair.
> 
> My objection is fair.
> 
> >> The text excludes NLP, but only because extraction with NLP is too vague (the results would depend on the NLP engine).
> 
> Of course the results would depend on the NLP engine, but nevertheless they are a true representation of the meaning that is conveyed by the web page.

True. But the term 'NLP' is too vague. I believe that if we say 'NLP by Zemanta', or 'NLP by OpenCalais' (with appropriate URI-s) then this is perfectly all right (at least to me). And anyway: this is just an example.

> 
>> I believe, taking a different example, that the current "space" notion would allow an HTML5+microdata page being a space (ie, its URI being the identifier of the space), because there is a well defined algorithm that produces RDF triples.
> 
> > That is true but beside the point. The point is that Sandro's text, as I understand it, forbids the use of RDF datasets with NLP, and, as far as I can tell, forbids the use of any non-W3C-recommended method of turning some contents into triples.

Well, that is not the way *I* understand it. The only definition is:

[[[
An RDF space is anything that can reasonably be said to explicitly contain zero or more RDF triples and has an identity distinct from the triples it contains. 
]]]

Everything else is an example, pro or con.

> 
> >> To come back to your text, if the URI of the text made it clear that the NLP extraction is based on, say, Zemanta's tagging engine then, for me, that would be a perfectly o.k. name for a space.
> 
> It is explicitly an example that is *not* an o.k. graph name in Sandro's definition.

The text says:

[[
Natural language text. While it might be possible to extract some of the meaning of the text and express that meaning in RDF triples, those triples are not explicit and in practice might vary from one extractor to the next.
]]

which is the reason I quoted above. And, actually, this is a bit beside the point. I do not read this as excluding NLP in principle, provided it is made clear which extractor is used.

> 
>> In other words: apart from a terminological mismatch, I do not think that the differences are so utterly huge as you seem to say.
> 
> My point is that any definition that explicitly outlaws any currently conforming use of RDF Datasets in SPARQL is not acceptable.
> 

If the text indeed outlawed this, I would agree. But I do not see how the current text outlaws it.


> >> On the other hand: shouldn't we, somewhere in our documents (remember that I look at this document as a 'gathering place') define quads? After all, they *are* widely used, and some sort of relationship to named graphs should be defined somewhere.
> 
> > I don't see why. The only spec that has any reason to mention quads is N-Quads. (Well, JSON-LD may too but it uses a definition that's different from Sandro's.) Other uses of quads are implementation strategies and those don't belong in the specs.

Correct. My question was whether this WG would define NQuads as well or not. If we do define NQuads (and I do not believe this has been decided pro or con) then we have to properly define Quads, and do so in relation to any formalism we have on named graphs. If we decide that NQuads are not to be formally defined by this WG, then indeed this section may become unnecessary.

Ivan

> 
> For example we don't mention prefix maps either, despite the fact that every RDF implementation has something like that in the codebase.
> 
>>>> 3.6 Graph Store
>>> 
>>> Not entirely convinced that we need both the “snapshot” and “mutable” versions of the abstract syntax. Why do you think we do?
>> 
> >> Hm. My reading is that this section describes how the rest relates to what SPARQL 1.1 defines (not being 100% up-to-date with that part of SPARQL 1.1, I cannot judge whether what is in this section is correct or not).
> 
> Sure, this section simply copies what's in SPARQL Update, and is consistent with SPARQL Update, but if graph stores are purely a SPARQL Update thing, then I don't see why we need to talk about it. SPARQL Update already explains the relationship between graph stores and RDF datasets.
> 
> The way I see it: As a rule of thumb, once a concept is used not just in one spec but across a larger number of Semantic Web specs, then it is a candidate for being moved into the core specs. I am not sure whether that's the case for the “graph store” concept. Maybe it is the case, but I'd like to see someone making the case.
> 
> Best,
> Richard


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Monday, 28 May 2012 12:07:21 UTC