Re: comments on 17 December 2013 WD of RDF 1.1 Primer from Bob DuCharme on 2014-01-29 (public-rdf-comments@w3.org from January 2014)

From: Bob DuCharme <bob@snee.com>
Date: Wed, 29 Jan 2014 13:17:38 -0500
To: Guus Schreiber <guus.schreiber@vu.nl>, public-rdf-comments@w3.org
Message-ID: <52E945C2.1000301@snee.com>

Looks good, thanks!

Bob

On 1/28/2014 5:12 PM, Guus Schreiber wrote:
> Bob,
>
> Thanks again for your comments. Responses inline.
>
> On 31-12-13 00:25, Bob DuCharme wrote:
>> There's a lot of good stuff in it, but because it's a Primer, I assume
>> that its intended audience is people who are new to RDF, and the
>> document often assumes too much about the reader's knowledge of
>> technical specification vocabulary.
>>
>> I've divided up my comments into two lists: comments about substance
>> followed by picky copyediting suggestions. Suggestions often show a
>> quoted phrase from the Primer followed by a suggested revision. For
>> example,
>>
>>    "cypress-tree" cypress tree
>>
>> is a suggestion to replace "cypress-tree" with "cypress tree".
>>
>>
>> === substantive (to varying degrees)  ===
>>
>> Section 1. says "The Resource Description Framework (RDF) is a framework
>> for describing info about resources in the World Wide Web." 1.1 says
>> that "An IRI identifies a web resource" and then references
>> http://www.ietf.org/rfc/rfc3987.txt, but I couldn't find anything in
>> that RFC about IRIs being limited to the identification of web
>> resources. I know that URLs define web resources, but if I assign an IRI
>> to the chair I'm sitting in, couldn't I use RDF to state facts about the
>> chair's location, manufacturer, etc., without this having anything to do
>> with the web? Or am I misunderstanding something? I always thought that
>> we could assign IRIs to absolutely anything and then use RDF to describe
>> them; limiting its use to web-based resources really limits its power.
>
> Reformulated as:
>
> [[
>     The Resource Description Framework (RDF) is a framework for
>     expressing information about <strong>resources</strong>. Resources
>     can be anything, including documents, people, physical objects,
>     and abstract concepts.
> ]]
>
>> 3.1 "Resources typically occur in multiple triples, for example Bob and
>> the Mona Lisa painting in the examples above." The Mona Lisa resource
>> only occurs in one triple above this sentence, not two, unless you want
>> the reader to assume case insensitivity in the sample data, which I
>> think is a bad idea. I would capitalize <the Mona Lisa> consistently and
>> then explicitly point out how the same resource can appear in the
>> subject of one triple and the object of another, which is a new idea at
>> this point of the Primer. (A wonderful new idea!) After normalizing the
>> capitalization, the sentence might be better off like this: "The same
>> resource is often referenced in multiple triples. In the example above,
>> Bob is the subject of four triples, and the Mona Lisa is the subject of
>> one and the object of another. This ability to have the same resource be
>> in the subject position of one triple and the object position of another
>> makes it possible to find connections between triples, which is an
>> important part of RDF's power. We can therefore visualise triples as..."
>
> Changed as suggested. Note that the Mona Lisa occurs in two object 
> positions.
>
>> "The example above... an RDF graph" Move that paragraph before the
>> "Resources typically" paragraph, i.e. right after the example itself,
>> maybe in one of the green "NOTE" blocks.
>
> Changed as suggested.
>
>> In the note that begins
>>
>>    The RDF Data Model is described in this section in the form of an
>> "abstract syntax"
>>
>> do "encoding" and "concrete RDF syntax" refer to the same thing? If so,
>> make that clearer. I think it would be better off to never use the word
>> "encoding," which people are more likely to associate with things like
>> UTF-8 vs. Latin 1, and instead use the term "concrete syntax"
>> consistently. The first time the Primer uses the phrase "concrete
>> syntax," a parenthesized phrase after it could say something like "(the
>> syntax used to represent triples stored in text files)", because as a
>> Primer this should provide more hints about the meaning of highly
>> technical phrases. These same issues come up in the paragraph of Section
>> 5 beginning "Many different concrete syntaxes..."
>
> Changed as suggested.
>
>> "three types of RDF data that occur in triples" three types of RDF
>> resources that occur in triples
>>
>> "The notion of IRI is a generalization of URI (Uniform Resource
>> Identifier)" To assume that someone who doesn't understand RDF (the
>> intended audience of the Primer) understands what URIs are and their
>> relationship to URLs is a huge, huge  assumption. How about adding,
>> after the sentence with this, something like "The URLs (Uniform Resource
>> Locators) that people use as web addresses are one form of URI, with an
>> important difference: URIs are not necessarily locators that provide the
>> address of a resource; they are often merely identifiers that provide a
>> unique ID for a given resource. IRIs are a generalization of this
>> because..."
>
> Changed as suggested, with slightly different wording:
>
> [[
>     The URLs (Uniform Resource Locators) that
>     people use as Web addresses are one form of IRI. Other forms of IRI
>     provide an identifier for a resource without implying its location
>     or how to access it.
> ]]
>
>>
>> 3.2 "RDF is agnostic about what the IRI stands for" Unlike section 5.1
>> ("in this example foaf:Person stands for
>> <http://xmlns.com/foaf/0.1/Person>") I think that "stands for" is not
>> appropriate here. (After all, IRI stands for "Internationalized
>> Resource Identifier.") "Represents" or "identifies" would be better.
>
> Changed to "represents".
>
>> 3.4 I don't think algebra variables are a very good analogy here. Those
>> are named things that may not have values, and blank nodes are unnamed
>> things that do have values.
>
> Hmm, I think the notion of variable actually comes close to the 
> intuition about blank nodes. I prefer to leave it in, unless we have a 
> better analogy.
>
>> Section 3.4 overall is a little too brief and abstract for an RDF
>> neophyte. Blank nodes are a difficult concept for people who are new to
>> RDF. Either don't cover them in the Primer or cover them a bit more. For
>> example, this section would greatly benefit from a new diagram similar
>> to the one in Figure 1 that includes the cypress tree.
>
> I've added this as a potential todo to the document.
>
>> Also: 'Resources such as the unidentified cypress tree are called "blank
>> nodes" in RDF.' The resource (the tree, in this case) is not called a
>> blank node. How about this: 'Resources without identifiers such as the
>> painting's cypress tree can be represented by "blank nodes" in RDF.'
>
> Changed as suggested.
>
>> 3.5 "does not specify a particular semantics" That's normative
>> spec-speak, not primer-speak, and should be reworded to be clearer to
>> beginners. A bit later, the "i.e." parenthetical remark after "RDF
>> provides no way to convey this semantic assumption" provides a good
>> model of connecting this high-level talk of semantics to the actual data
>> being discussed.
>
> Sentence dropped. Indeed, the remarks later make this point in a
> clearer way.
>
>> Section 1 said that "For example retrieving http://www.example.org/bob
>> could provide data about Bob," leading me to believe that this URI
>> represented the resource Bob. In the section on named graphs, the same
>> URI represents a named graph, not a person. I understand that this
>> doesn't invalidate the "For example" sentence--if it's the name of a
>> graph, retrieving it could still "provide data about Bob"--but I think
>> this can still confuse the RDF beginner, and recommend that the examples
>> in the section on named graphs use new IRIs that have not appeared in
>> the Primer before.
>
> Another reviewer suggested changing the IRI in Sec. 1 to
> http://www.example.org/bob/Bob#me. I assume this also addresses your
> remark. Changed accordingly.
>
>> "In the example default (unnamed) graph below we see two triples that
>> have a graph name as subject:" Insert a sentence before this about why
>> someone would want to do this, e.g. "When you can reference a graph with
>> an IRI, you can create triples that provide metadata about that graph."
>
> The current text is a compromise, as the RDF group didn't define 
> semantics for triples in which graph names occur. Therefore, the 
> explanation about what the example triples stand for is at the end of 
> the paragraph following the example. I hope you think this is clear 
> enough.
>
>> "subsets of triples" doesn't make sense. "subsets of a dataset [ or
>> collection] of triples"?
>
> Changed to "subsets of a collection of triples"
>
>> 4. "For example, one can state that the IRI ex:friendOf can be used as a
>> property" the idea of this being an IRI will come as a complete surprise
>> to the reader, because the use of prefixes hasn't been discussed at all
>> yet. (Is a qname considered an IRI?) The original RDF Primer at
>> http://www.w3.org/TR/rdf-primer/ has a good paragraph beginning "The
>> full triples notation requires" that introduces this well. However it's
>> done, as a Primer this should explain any new syntax, such as the use of
>> namespace prefixes, before using that syntax.
>
> Oops, that was unintended. We prefer not to introduce qnames here. 
> Changed to: "http://www.example.org/friendOf".
>
>> "domain respectively range restrictions" domain and range restrictions,
>> respectively (The sentence with this is another example of assuming a
>> pre-existing, strong understanding of the relevant technical vocabulary
>> by the reader; the Primer really should have a few more sentences to
>> explain the use of rdfs:domain and rdfs:range, which is always a
>> difficult point with RDF beginners.)
>
> Included a sentence linking to the earlier friendOf/Person example.
>
>> After Example 2 add something like this, because the idea of (and value
>> of!) properties as subjects or objects in triples has not been covered
>> at all up to this point and often comes as a surprise to people with an
>> object-oriented background: "Note that, while <is a friend of> is a
>> property typically used as the predicate of a triple (as it was in
>> Example 1), properties like this are themselves resources that can be
>> described by triples or provide values in the descriptions of other
>> resources. In this example, <is a friend of> is the subject of triples
>> that assign type, domain, and range values to it, and it's the object of
>> a triple that describes something about the <is a good friend of>
>> property."
>
> Added as suggested.
>
>> "RDFa (for HTML embedding)" I always think it's a shame that people
>> think that RDFa is only for use with HTML. It can be very useful with
>> other kinds of XML as well; see
>> http://www.devx.com/semantic/Article/42543 . I would love to see the
>> several references to this say "for HTML and XML embedding."
>
> Included.
>
>> Section 5.1 is more like a quick reference of Turtle syntax than a
>> Primer, because it covers so much so quickly. Readers who are new to RDF
>> (the intended audience of this document) will find it confusing. A brief
>> introduction to N-Triples before the Turtle part would make the Turtle
>> part much easier to understand, because then the reader will understand
>> that the use of angle brackets around full IRIs, quotes around literals,
>> and a period after each triple are the most important parts of the
>> syntax and that everything else in Turtle is just a syntactical
>> convenience.
>
> Added an N-Triples-conformant example to the beginning of the section,
> plus text to explain this basic form. Also placed a note right after 
> the graph figure to point readers to the N-Triples examples.
>
>> "the predicate-object part of triples with <http://example.org/bob#me>
>> as subject"  the predicate-object part of triples that have
>> <http://example.org/bob#me> as their subject
>
> Changed.
>
>> "The semicolons at the end of lines 9-11 indicate that the set is not
>> yet complete. A period is used to signal the end of a Turtle statement."
>> The use of "set" here is confusing. Set of what? I know that it refers
>> to predicate-object pairs associated with a common subject, but someone
>> new to Turtle might think that it's some specific Turtle construct. I
>> think it would be better to say "The semicolons at the end of lines 9-11
>> each indicate that the predicate-object pair that follows them is part of
>> a new triple that uses the most recent subject shown in the data--in
>> this case, <bob#me>."
>
> Changed as suggested.
>
>> 'The term _:x is a blank node. It represents some unnamed tree depicted
>> in the Mona Lisa painting and belonging to the "Cypress" class.'  The
>> term _:x is a blank node. It represents an unnamed resource depicted in
>> the Mona Lisa painting that is an instance of the "Cypress" class. [It's
>> safer to say that it represents a resource, not a tree, and the idea of
>> "belonging" here is not quite accurate.]
>
> Changed as suggested.
>
>
>> === copyediting ===
>>
>> There are several places where "for example" should have a comma after
>> it: "For example retrieving", "For example a dataset about paintings",
>> "For example 'Léonard de Vinci'",
>>
>> 3.1 " <subject>  <predicate> <object>" has an extra space after 
>> <subject>
>>
>> - "multiple triples, for example" [em dash not comma]
>>
>> "allow writing literals" allow writing of literals
>>
>> "markup webpages": "mark up" should be two words when used as a verb.
>> I'd say "web pages" as two words as well.
>>
>> "Library of Congress published its..." The Library of Congress published
>> its
>>
>> The phrase "Using the Web Ontology Language" would make sense, but
>> "Using the OWL" in section 4 does not. Just say "Using OWL."
>>
>> "a RDF vocabulary" an RDF vocabulary
>>
>> "the reader is referred to the Turtle document" see the Turtle document
>>
>> " the reader can find for each RDF syntax corresponding"  the reader can
>> find, for each RDF syntax, corresponding
>>
>> "cypress-tree" cypress tree
>>
>> "cater for" cater to [although that could just be a British vs. American
>> usage thing]
>
> Apparently it is "cater for" if you provide something like food and 
> "cater to" if you satisfy some desire. I guess an application is not a 
> food, so changed to "cater to" :).
>
>>
>> "semantics which is specified in the RDF" semantics which are specified
>> in the RDF
>
> "Semantics" is nowadays (and in this sentence) often used as a 
> singular. Possibly a form of language pollution, but making it plural 
> wouldn't work here.
>
>> "Wikidata, a free, collaborative..." end that bullet point with a period
>> like the other bullets in that list.
>
> Thanks, these comments were *extremely* helpful. You can check the 
> changes in the new editor's draft [1]. Note that this draft is likely
> to change over the next couple of days, as we are also including 
> comments from other people.
>
> Regards,
> Guus Schreiber
>
> [1]
>
>>
>>
>> Thanks,
>>
>> Bob DuCharme
>>
Received on Wednesday, 29 January 2014 18:17:16 UTC
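
The thread's point about N-Triples (angle brackets around full IRIs, quotes around literals, and a period after each triple) can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not text from the Primer: the helper names nt_term and nt_triple are hypothetical, the foaf:name triple is an invented example, and only plain string literals are handled (no language tags, datatypes, or blank nodes).

```python
# Hypothetical helpers sketching the basic N-Triples line format.
# Assumption: terms are either full IRIs or plain string literals.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"  # used for an invented example triple


def nt_term(value, is_iri=True):
    """Format one term: IRIs go in angle brackets, literals in double quotes."""
    if is_iri:
        return f"<{value}>"
    # Escape backslashes first, then double quotes, so the literal stays valid.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def nt_triple(subject, predicate, obj, obj_is_iri=True):
    """Format a single triple as one line ending in a period."""
    return f"{nt_term(subject)} {nt_term(predicate)} {nt_term(obj, obj_is_iri)} ."


bob = "http://example.org/bob#me"
print(nt_triple(bob, RDF_TYPE, FOAF_PERSON))
print(nt_triple(bob, FOAF_NAME, "Bob", obj_is_iri=False))
```

Each printed line is a complete triple with the subject written out in full. Turtle's semicolons and prefixes, as discussed above, are conveniences that avoid repeating that subject and the long IRIs.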