Re: comments on 17 December 2013 WD of RDF 1.1 Primer

Bob,

Thanks a lot for this extensive review, very much appreciated! We will 
get back to you soon with a detailed response.

Best,
Guus


On 31-12-13 00:25, Bob DuCharme wrote:
> There's a lot of good stuff in it, but because it's a Primer, I assume
> that its intended audience is people who are new to RDF, and the
> document often assumes too much about the reader's knowledge of
> technical specification vocabulary.
>
> I've divided up my comments into two lists: comments about substance
> followed by picky copyediting suggestions. Suggestions often show a
> quoted phrase from the Primer followed by a suggested revision. For
> example,
>
>    "cypress-tree" cypress tree
>
> is a suggestion to replace "cypress-tree" with "cypress tree".
>
>
> === substantive (to varying degrees)  ===
>
> Section 1. says "The Resource Description Framework (RDF) is a framework
> for describing info about resources in the World Wide Web." 1.1 says
> that "An IRI identifies a web resource" and then references
> http://www.ietf.org/rfc/rfc3987.txt, but I couldn't find anything in
> that RFC about IRIs being limited to the identification of web
> resources. I know that URLs define web resources, but if I assign an IRI
> to the chair I'm sitting in, couldn't I use RDF to state facts about the
> chair's location, manufacturer, etc., without this having anything to do
> with the web? Or am I misunderstanding something? I always thought that
> we could assign IRIs to absolutely anything and then use RDF to describe
> them; limiting its use to web-based resources really limits its power.
>
> 3.1 "Resources typically occur in multiple triples, for example Bob and
> the Mona Lisa painting in the examples above." The Mona Lisa resource
> only occurs in one triple above this sentence, not two, unless you want
> the reader to assume case insensitivity in the sample data, which I
> think is a bad idea. I would capitalize <the Mona Lisa> consistently and
> then explicitly point out how the same resource can appear in the
> subject of one triple and the object of another, which is a new idea at
> this point of the Primer. (A wonderful new idea!) After normalizing the
> capitalization, the sentence might be better off like this: "The same
> resource is often referenced in multiple triples. In the example above,
> Bob is the subject of four triples, and the Mona Lisa is the subject of
> one and the object of another. This ability to have the same resource be
> in the subject position of one triple and the object position of another
> makes it possible to find connections between triples, which is an
> important part of RDF's power. We can therefore visualise triples as..."
>
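
A minimal sketch of the point in the suggested wording above, assuming triples are held as plain Python tuples. The predicate strings and the `linking_resources` helper are hypothetical, chosen only to match the counts mentioned (Bob as the subject of four triples, the Mona Lisa as the subject of one and the object of another); none of this is taken from the Primer itself.

```python
# Triples as (subject, predicate, object) tuples. The wording of each
# triple is illustrative only; the capitalization of "the Mona Lisa" is
# kept consistent so that both mentions denote the same resource.
triples = [
    ("Bob", "is a", "person"),
    ("Bob", "is a friend of", "Alice"),
    ("Bob", "is born on", "the 4th of July 1990"),
    ("Bob", "is interested in", "the Mona Lisa"),
    ("the Mona Lisa", "was created by", "Leonardo da Vinci"),
]


def linking_resources(triples):
    """Return resources that occur as a subject in one triple and as an
    object in another, i.e. the points where triples connect."""
    subjects = {s for s, _, _ in triples}
    objects = {o for _, _, o in triples}
    return subjects & objects


bob_as_subject = sum(1 for s, _, _ in triples if s == "Bob")
links = linking_resources(triples)
```

With this data, `bob_as_subject` is 4 and `links` contains only "the Mona Lisa". If the two mentions were capitalized differently, a plain string comparison like this one would treat them as different resources and find no link.
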
> "The example above... an RDF graph" Move that paragraph before the
> "Resources typically" paragraph, i.e. right after the example itself,
> maybe in one of the green "NOTE" blocks.
>
> In the note that begins
>
>    The RDF Data Model is described in this section in the form of an
> "abstract syntax"
>
> do "encoding" and "concrete RDF syntax" refer to the same thing? If so,
> make that clearer. I think it would be better never to use the word
> "encoding," which people are more likely to associate with things like
> UTF-8 vs. Latin 1, and instead use the term "concrete syntax"
> consistently. The first time the Primer uses the phrase "concrete
> syntax," a parenthesized phrase after it could say something like "(the
> syntax used to represent triples stored in text files)", because as a
> Primer this should provide more hints about the meaning of highly
> technical phrases. These same issues come up in the paragraph of Section
> 5 beginning "Many different concrete syntaxes..."
>
> "three types of RDF data that occur in triples" three types of RDF
> resources that occur in triples
>
> "The notion of IRI is a generalization of URI (Uniform Resource
> Identifier)" To assume that someone who doesn't understand RDF (the
> intended audience of the Primer) understands what URIs are and their
> relationship to URLs is a huge, huge assumption. How about adding,
> after the sentence with this, something like "The URLs (Uniform Resource
> Locators) that people use as web addresses are one form of URI, with an
> important difference: URIs are not necessarily locators that provide the
> address of a resource; they are often merely identifiers that provide a
> unique ID for a given resource. IRIs are a generalization of this
> because..."
>
> 3.2 "RDF is agnostic about what the IRI stands for" Unlike section 5.1
> ("in this example foaf:Person stands for
> <http://xmlns.com/foaf/0.1/Person>") I think that "stands for" is not
> appropriate here. (After all, IRI stands for "Internationalized Resource
> Identifier.") "Represents" or "identifies" would be better.
>
> 3.4 I don't think algebra variables are a very good analogy here. Those
> are named things that may not have values, and blank nodes are unnamed
> things that do have values.
>
> Section 3.4 overall is a little too brief and abstract for an RDF
> neophyte. Blank nodes are a difficult concept for people who are new to
> RDF. Either don't cover them in the Primer or cover them a bit more. For
> example, this section would greatly benefit from a new diagram similar
> to the one in Figure 1 that includes the cypress tree.
>
> Also: 'Resources such as the unidentified cypress tree are called "blank
> nodes" in RDF.' The resource (the tree, in this case) is not called a
> blank node. How about this: 'Resources without identifiers such as the
> painting's cypress tree can be represented by "blank nodes" in RDF.'
>
> 3.5 "does not specify a particular semantics" That's normative
> spec-speak, not primer-speak, and should be reworded to be clearer to
> beginners. A bit later, the "i.e." parenthetical remark after "RDF
> provides no way to convey this semantic assumption" provides a good
> model of connecting this high-level talk of semantics to the actual data
> being discussed.
>
> Section 1 said that "For example retrieving http://www.example.org/bob
> could provide data about Bob," leading me to believe that this URI
> represented the resource Bob. In the section on named graphs, the same
> URI represents a named graph, not a person. I understand that this
> doesn't invalidate the "For example" sentence--if it's the name of a
> graph, retrieving it could still "provide data about Bob"--but I think
> this can still confuse the RDF beginner, and recommend that the examples
> in the section on named graphs use new IRIs that have not appeared in
> the Primer before.
>
> "In the example default (unnamed) graph below we see two triples that
> have a graph name as subject:" Insert a sentence before this about why
> someone would want to do this, e.g. "When you can reference a graph with
> an IRI, you can create triples that provide metadata about that graph."
>
> "subsets of triples" doesn't make sense. "subsets of a dataset [ or
> collection] of triples"?
>
> 4. "For example, one can state that the IRI ex:friendOf can be used as a
> property" the idea of this being an IRI will come as a complete surprise
> to the reader, because the use of prefixes hasn't been discussed at all
> yet. (Is a qname considered an IRI?) The original RDF Primer at
> http://www.w3.org/TR/rdf-primer/ has a good paragraph beginning "The
> full triples notation requires" that introduces this well. However it's
> done, as a Primer this should explain any new syntax, such as the use of
> namespace prefixes, before using that syntax.
>
> "domain respectively range restrictions" domain and range restrictions,
> respectively (The sentence with this is another example of assuming a
> pre-existing, strong understanding of the relevant technical vocabulary
> by the reader; the Primer really should have a few more sentences to
> explain the use of rdfs:domain and rdfs:range, which is always a
> difficult point with RDF beginners.)
>
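
A rough sketch of what rdfs:domain and rdfs:range do, since they are easy to mistake for validation rules. In RDF Schema they license inferences: a triple that uses a property lets you conclude the type of its subject (domain) and of its object (range). The class name `ex:Person`, the resources `ex:bob` and `ex:alice`, and the `infer_types` helper are assumptions made for illustration; terms are plain strings rather than real IRIs, and literals are not treated specially.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"


def infer_types(schema, data):
    """Return the rdf:type triples implied by domain/range statements.

    schema: triples of the form (property, rdfs:domain | rdfs:range, class)
    data:   ordinary (subject, predicate, object) triples
    """
    inferred = set()
    for prop, constraint, cls in schema:
        for s, p, o in data:
            if p != prop:
                continue
            if constraint == RDFS_DOMAIN:
                # The subject of any triple using `prop` is an instance of cls.
                inferred.add((s, RDF_TYPE, cls))
            elif constraint == RDFS_RANGE:
                # The object of any triple using `prop` is an instance of cls.
                inferred.add((o, RDF_TYPE, cls))
    return inferred


# Hypothetical vocabulary and data.
schema = [
    ("ex:friendOf", RDFS_DOMAIN, "ex:Person"),
    ("ex:friendOf", RDFS_RANGE, "ex:Person"),
]
data = [("ex:bob", "ex:friendOf", "ex:alice")]

inferred = infer_types(schema, data)
```

Here `inferred` holds two triples, one typing `ex:bob` and one typing `ex:alice` as `ex:Person`, even though the data never said so directly.
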
> After Example 2 add something like this, because the idea of (and value
> of!) properties as subjects or objects in triples has not been covered
> at all up to this point and often comes as a surprise to people with an
> object-oriented background: "Note that, while <is a friend of> is a
> property typically used as the predicate of a triple (as it was in
> Example 1), properties like this are themselves resources that can be
> described by triples or provide values in the descriptions of other
> resources. In this example, <is a friend of> is the subject of triples
> that assign type, domain, and range values to it, and it's the object of
> a triple that describes something about the <is a good friend of>
> property."
>
> "RDFa (for HTML embedding)" I always think it's a shame that people
> think that RDFa is only for use with HTML. It can be very useful with
> other kinds of XML as well; see
> http://www.devx.com/semantic/Article/42543 . I would love to see the
> several references to this say "for HTML and XML embedding."
>
> Section 5.1 is more like a quick reference of Turtle syntax than a
> Primer, because it covers so much so quickly. Readers who are new to RDF
> (the intended audience of this document) will find it confusing. A brief
> introduction to N-Triples before the Turtle part would make the Turtle
> part much easier to understand, because then the reader will understand
> that the use of angle brackets around full IRIs, quotes around literals,
> and a period after each triple are the most important parts of the
> syntax and that everything else in Turtle is just a syntactical
> convenience.
>
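
To make the N-Triples point concrete, here is a small sketch that writes one triple per line using only the three conventions named above: angle brackets around full IRIs, quotes around literals, and a period at the end. The `IRI` and `Literal` classes and the `to_ntriples_line` function are hypothetical helpers, not part of any RDF library, and the sketch handles only plain string literals with basic escaping (no language tags, datatypes, or blank nodes).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


def _term(term):
    """Serialize one term: <...> for IRIs, "..." for plain literals."""
    if isinstance(term, IRI):
        return f"<{term.value}>"
    escaped = (
        term.value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples_line(subject, predicate, obj):
    """Return a single N-Triples style line for one triple."""
    return f"{_term(subject)} {_term(predicate)} {_term(obj)} ."


# Illustrative data only.
line = to_ntriples_line(
    IRI("http://example.org/bob#me"),
    IRI("http://xmlns.com/foaf/0.1/name"),
    Literal("Bob"),
)
print(line)
```

Everything Turtle adds on top of this (prefixes, `a`, semicolons, commas) abbreviates lines of this shape.
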
> "the predicate-object part of triples with <http://example.org/bob#me>
> as subject"  the predicate-object part of triples that have
> <http://example.org/bob#me> as their subject
>
> "The semicolons at the end of lines 9-11 indicate that the set is not
> yet complete. A period is used to signal the end of a Turtle statement."
> The use of "set" here is confusing. Set of what? I know that it refers
> to predicate-object pairs associated with a common subject, but someone
> new to Turtle might think that it's some specific Turtle construct. I
> think it would be better to say "The semicolons at the end of lines 9-11
> each indicate that the predicate-object pair that follows them is part of
> a new triple that uses the most recent subject shown in the data--in
> this case, <bob#me>."
>
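
A short sketch of the semicolon rule as described above: each predicate-object pair after a semicolon is a separate triple that reuses the most recent subject. The functions `turtle_block` and `expand` are hypothetical, the terms are assumed to be already-serialized Turtle strings, and the `foaf:` prefix is assumed to be declared elsewhere.

```python
def turtle_block(subject, pairs):
    """Write predicate-object pairs under one subject, separated by
    semicolons and terminated by a period."""
    if not pairs:
        raise ValueError("at least one predicate-object pair is required")
    lines = [subject]
    for i, (predicate, obj) in enumerate(pairs):
        terminator = " ." if i == len(pairs) - 1 else " ;"
        lines.append(f"    {predicate} {obj}{terminator}")
    return "\n".join(lines)


def expand(subject, pairs):
    """Write the same triples long-hand: one full statement per pair,
    each repeating the subject."""
    return [f"{subject} {predicate} {obj} ." for predicate, obj in pairs]


# Illustrative terms only.
subject = "<http://example.org/bob#me>"
pairs = [
    ("a", "foaf:Person"),
    ("foaf:knows", "<http://example.org/alice#me>"),
]

compact = turtle_block(subject, pairs)
long_hand = expand(subject, pairs)
```

`compact` and `long_hand` describe the same two triples; the semicolon form simply avoids repeating the subject.
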
> 'The term _:x is a blank node. It represents some unnamed tree depicted
> in the Mona Lisa painting and belonging to the "Cypress" class.'  The
> term _:x is a blank node. It represents an unnamed resource depicted in
> the Mona Lisa painting that is an instance of the "Cypress" class. [It's
> safer to say that it represents a resource, not a tree, and the idea of
> "belonging" here is not quite accurate.]
>
> === copyediting ===
>
> There are several places where "for example" should have a comma after
> it: "For example retrieving", "For example a dataset about paintings",
> "For example 'Léonard de Vinci'",
>
> 3.1 " <subject>  <predicate> <object>" has an extra space after <subject>
>
> - "multiple triples, for example" [em dash not comma]
>
> "allow writing literals" allow writing of literals
>
> "markup webpages": "mark up" should be two words when used as a verb.
> I'd say "web pages" as two words as well.
>
> "Library of Congress published its..." The Library of Congress published
> its
>
> The phrase "Using the Web Ontology Language" would make sense, but
> "Using the OWL" in section 4 does not. Just say "Using OWL."
>
> "a RDF vocabulary" an RDF vocabulary
>
> "the reader is referred to the Turtle document" see the Turtle document
>
> " the reader can find for each RDF syntax corresponding"  the reader can
> find, for each RDF syntax, corresponding
>
> "cypress-tree" cypress tree
>
> "cater for" cater to [although that could just be a British vs. American
> usage thing]
>
> "semantics which is specified in the RDF" semantics which are specified
> in the RDF
>
> "Wikidata, a free, collaborative..." end that bullet point with a period
> like the other bullets in that list.
>
>
> Thanks,
>
> Bob DuCharme
>

Received on Wednesday, 1 January 2014 14:21:02 UTC