RDF Primer Draft - comments from Antoine Isaac on 2014-01-07 (public-rdf-comments@w3.org from January 2014)

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Tue, 7 Jan 2014 23:46:46 +0100
To: <public-rdf-comments@w3.org>
CC: Guus Schreiber <schreiber@cs.vu.nl>, Yves Raimond <yves.raimond@bbc.co.uk>
Message-ID: <52CC83D6.3060608@few.vu.nl>

Dear Yves, Guus, all,

Trying to follow from a distance what the group is doing, I have read the latest editor's draft of the RDF Primer
http://www.w3.org/TR/2013/WD-rdf11-primer-20131217/

I'm not sure how well it fits your schedule, but I thought I'd share some comments, see below. They may overlap or conflict with previous comments from group members... I hope this can still help you.
I find it a really great document now--which means that we should seize the opportunity and make it nearly perfect ;-)

Best regards,

Antoine

----

First a general editorial comment: I like the way notes flag slightly less essential details. But in a text that is quite compact, having this many notes may be counter-productive. Perhaps a couple of them can be integrated in the main text, like the one on IRIs in section 1?

The rest of the comments are in the order of the sections.

- 1, "The Resource Description Framework (RDF) is a framework for describing information about resources in the World Wide Web, such as author and modification time of a Web page or copyright and licensing information of a Web video."
With earlier comments on the group's list still in mind, I must say that having this sentence upfront will lead some to think that RDF is for document annotation. Perhaps it would be good to add a sentence that makes it clear that RDF can also be used for describing real-world entities (persons, etc.).
Along the same lines, the wording 'describing information about resources' makes me a bit uncomfortable: how about 'describing resources' instead, or 'expressing information about resources' (as it is put later in the text)?

- 1: The URI http://www.example.org/bob is potentially confusing. I am personally ok with it, especially as it follows the sort of examples we've seen in the community for years. But in the light of the discussion on named graphs, perhaps it would be clearer if the IRI was picked to intuitively denote a document or information source, rather than a real-world person (that is now identified by http://www.example.org/bob#me).

- Or perhaps this problem comes from the two sentences after the one where Bob's URI is introduced:
"[...]including the fact that he knows Alice, as identified by her IRI. Retrieving Alice's IRI [...]".
I believe "her IRI" in the first sentence is http://example.org/alice#me, while reading the second sentence one could expect http://example.org/alice (having just been presented with http://www.example.org/bob before).
I'm splitting hairs obviously, but I can't help thinking that we often use less-intuitive identifiers that make SemWeb documentation much less easy to read.

- 2:"Providing a standard-compliant way for exchanging data between RDF databases."
-> "Providing a standard-compliant way for exchanging data between databases."
(as a use case, the first sentence reads a bit as if RDF had been motivated by the need to exchange RDF data...)

- 3.2, Note on "RDF is agnostic about what the IRI stands for[...]": I'm not sure the reference "RDF vocabularies are discussed in more detail in Sec. 4." belongs there.

- 3.4.: I'm not fond of the sentence "A blank node indicates an un-named thing." One could create a blank node for Bob, who is a named individual in the real world. One could even give an rdfs:label with Bob's name in the RDF graph with this blank node. I'd rather see the paragraph stick to the term 'unidentified'.
Actually you could replace the quoted sentence by "They can be used to denote resources without explicitly identifying them with an IRI." and remove the current last sentence of 3.4.
By the way, isn't it misleading to write "can be used to denote resources [...]"? Isn't it always the case? Or at least what happens in the vast majority of cases? (sorry, I don't have time to check whether I'm missing something obvious here)
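
To make the 'named but unidentified' point concrete, here is a minimal sketch in Python, with plain tuples standing in for triples. The IRI, Literal and BlankNode classes, the local id "b1" and the use of foaf:knows are my own assumptions for illustration, not something taken from the Primer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class BlankNode:
    # Only meaningful inside one graph; this is not a global identifier.
    local_id: str


RDFS_LABEL = IRI("http://www.w3.org/2000/01/rdf-schema#label")
FOAF_KNOWS = IRI("http://xmlns.com/foaf/0.1/knows")  # assumed property
ALICE = IRI("http://example.org/alice#me")

# Bob is denoted by a blank node, yet the graph still carries his name.
bob = BlankNode("b1")
graph = {
    (bob, RDFS_LABEL, Literal("Bob")),
    (bob, FOAF_KNOWS, ALICE),
}


def names_of(node, triples):
    """Return the rdfs:label values attached to a node, sorted."""
    return sorted(
        o.value
        for s, p, o in triples
        if s == node and p == RDFS_LABEL and isinstance(o, Literal)
    )


def is_identified(node):
    """A node is 'identified' here only if it is an IRI."""
    return isinstance(node, IRI)
```

In this sketch names_of(bob, graph) finds a name for Bob while is_identified(bob) is false, which is why 'unidentified' seems the safer term than 'un-named'.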

- 3.5: I know this section has been discussed, so perhaps my comments will come across as a re-hash, or going against some recent agreement on the text. Sorry if it's the case...
Even though I really want something on named graphs to be said, I really find some points quite hard for a primer:

-- "An RDF dataset may have [...] at most one default graph (i.e. a graph without a name).": do we really need to mention the constraint on the default graph, or even default graphs, in this Primer? I believe that the text could work well without writing about these.

-- "RDF 1.1 does not specify a particular semantics for the relation between the "graph name" and the graph": I know the RDF group has discussed the issue at length, but this sentence sounds a bit like a joke without any further detail on the motivation.
The issue is that the reference to [RDF11-MT, http://www.w3.org/TR/2013/CR-rdf11-mt-20131105/#rdf-datasets] doesn't really work for me: I guess the solution is in the following sentence there: "This allows IRI referring to other kinds of entities, such as persons, to be used in a dataset to identify graphs of information relevant to the entity denoted by the graph name IRI." But I can't parse it and come up with a concrete example (i.e., an example with realistic IRIs involved in realistic triples) that would show me what's at stake.

-- "RDF provides no way to convey this semantic assumption [...] Those readers will need to rely on out-of-band knowledge to interpret the dataset in the intended way.": here "no way" and "out-of-band" read as if it is impossible to convey the assumptions in RDF at all. As you've discussed, it seems possible to devise appropriate vocabularies (even though it's outside of the standard)...

-- could the last note on named graphs and SPARQL be shortened, and/or become part of the main text? (e.g., put in the first paragraph of the section)
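
For what it's worth, the structure described in 3.5 can be sketched very roughly in Python as a mapping from graph names to sets of triples. This is only an illustration under my own assumptions: the None key for the default graph and the triples themselves are invented, and the sketch says nothing about what the group intends.

```python
# A dataset as a plain mapping: graph name (or None for the default
# graph) -> set of (subject, predicate, object) triples.
# All triples below are invented for illustration.
dataset = {
    None: {
        ("http://example.org/bob", "dcterms:publisher", "http://example.org"),
    },
    "http://example.org/bob": {
        ("http://example.org/bob#me", "foaf:knows", "http://example.org/alice#me"),
    },
}


def default_graph(ds):
    """Return the default graph, or an empty set if there is none."""
    return ds.get(None, set())


def named_graphs(ds):
    """Return the (name, graph) pairs for every named graph."""
    return {name: g for name, g in ds.items() if name is not None}
```

Because a mapping has unique keys, there can be at most one entry under None, which matches the 'at most one default graph' wording. Note also that nothing in the structure records how the name http://example.org/bob relates to the graph stored under it: that is exactly the part left to a vocabulary or to out-of-band agreement.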

- 4: I would find it easier if the identifiers of classes and properties in the examples were chosen to reflect their type (i.e., "c", "c1", "c2" for classes, "p" for a property, "i" for an instance) rather than their position in the RDFS triples (currently "s" is used alternately for a class, a property and an instance).
By the way, following your convention, then the second triple should have been "s rdf:type rdf:Property", no?
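
A small Python sketch of what I mean by type-reflecting identifiers. The two pattern lists are my own reconstruction for illustration, not a copy of the table in the draft, and roles() is a hypothetical helper that only covers the handful of RDFS constructs shown.

```python
# Hypothetical pattern list using placeholders that reflect their type.
typed_patterns = [
    ("c", "rdf:type", "rdfs:Class"),
    ("p", "rdf:type", "rdf:Property"),
    ("i", "rdf:type", "c"),
    ("c1", "rdfs:subClassOf", "c2"),
    ("p", "rdfs:domain", "c"),
    ("p", "rdfs:range", "c"),
]

# The same kind of list with purely positional placeholders.
positional_patterns = [
    ("s", "rdf:type", "rdfs:Class"),
    ("s", "rdf:type", "rdf:Property"),
    ("s", "rdf:type", "o"),
]


def roles(patterns):
    """Map each placeholder to the set of roles it plays in the patterns."""
    found = {}

    def add(name, role):
        found.setdefault(name, set()).add(role)

    for s, p, o in patterns:
        if p == "rdf:type" and o == "rdfs:Class":
            add(s, "class")
        elif p == "rdf:type" and o == "rdf:Property":
            add(s, "property")
        elif p == "rdf:type":
            add(s, "instance")
            add(o, "class")
        elif p == "rdfs:subClassOf":
            add(s, "class")
            add(o, "class")
        elif p in ("rdfs:domain", "rdfs:range"):
            add(s, "property")
            add(o, "class")
    return found
```

With the typed placeholders every name ends up with exactly one role, whereas roles(positional_patterns) gives "s" three different roles, which is the source of my confusion.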

- 4: it's a bit confusing to find Wordnet in the list of "vocabularies" there. The elements that Wordnet defines are not directly defined as properties and classes in the RDFS sense, unlike the elements of DC, schema.org and SKOS. Shouldn't it be listed in section 7, with the other databases?

- 5: the graphs are really beautiful, but their graphic convention could perhaps be further homogenized, by putting all (typed) literals within brackets, and "Bob" out of its circle, like Alice and Mona Lisa.

- 5.3: space missing after "see Fig. 2)"

- 6: the triple "ex:Species rdf:type rdfs:Class ." is not really necessary I think. The gist of the example is that ex:Elephant is both an instance and a class, not that ex:Species is a class.
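
A quick Python sketch of why the extra triple is not needed to make the point. The triples are my own assumption about the shape of the example, and ex:dumbo in particular is a hypothetical instance added for illustration.

```python
RDF_TYPE = "rdf:type"

# Assumed shape of the example, without 'ex:Species rdf:type rdfs:Class'.
triples = {
    ("ex:Elephant", RDF_TYPE, "ex:Species"),
    ("ex:dumbo", RDF_TYPE, "ex:Elephant"),  # hypothetical instance
}


def used_as_instance(term, ts):
    """True if the term is the subject of some rdf:type triple."""
    return any(s == term and p == RDF_TYPE for s, p, o in ts)


def used_as_class(term, ts):
    """True if the term is the object of some rdf:type triple."""
    return any(o == term and p == RDF_TYPE for s, p, o in ts)
```

Both checks succeed for ex:Elephant on these two triples alone, so the gist of the example survives without stating that ex:Species is a class.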

- 7. The reference to http://datahub.io/organization/lodcloud could raise problems. The Data Hub's move from 'groups' to 'organizations' and the fact that a dataset can be in only one organization has resulted in many datasets disappearing from their original grouping. I'm afraid the same thing may have happened for the LODCloud group. The LODCloud group still includes RDF datasets and can be used as a source of examples, but I believe it's not representing the most recent LOD Cloud as we know it at http://lod-cloud.net/state/.

Received on Tuesday, 7 January 2014 22:47:19 UTC