- From: Bob DuCharme <bob@snee.com>
- Date: Mon, 30 Dec 2013 18:25:33 -0500
- To: public-rdf-comments@w3.org
There's a lot of good stuff in the Primer, but because it is a primer, I assume
that its intended audience is people who are new to RDF, and the
document often assumes too much about the reader's knowledge of
technical specification vocabulary.
I've divided up my comments into two lists: comments about substance
followed by picky copyediting suggestions. Suggestions often show a
quoted phrase from the Primer followed by a suggested revision. For
example,
"cypress-tree" cypress tree
is a suggestion to replace "cypress-tree" with "cypress tree".
=== substantive (to varying degrees) ===
Section 1. says "The Resource Description Framework (RDF) is a framework
for describing info about resources in the World Wide Web." 1.1 says
that "An IRI identifies a web resource" and then references
http://www.ietf.org/rfc/rfc3987.txt, but I couldn't find anything in
that RFC about IRIs being limited to the identification of web
resources. I know that URLs define web resources, but if I assign an IRI
to the chair I'm sitting in, couldn't I use RDF to state facts about the
chair's location, manufacturer, etc., without this having anything to do
with the web? Or am I misunderstanding something? I always thought that
we could assign IRIs to absolutely anything and then use RDF to describe
them; limiting its use to web-based resources really limits its power.
3.1 "Resources typically occur in multiple triples, for example Bob and
the Mona Lisa painting in the examples above." The Mona Lisa resource
only occurs in one triple above this sentence, not two, unless you want
the reader to assume case insensitivity in the sample data, which I
think is a bad idea. I would capitalize <the Mona Lisa> consistently and
then explicitly point out how the same resource can appear in the
subject of one triple and the object of another, which is a new idea at
this point of the Primer. (A wonderful new idea!) After normalizing the
capitalization, the sentence might be better off like this: "The same
resource is often referenced in multiple triples. In the example above,
Bob is the subject of four triples, and the Mona Lisa is the subject of
one and the object of another. This ability to have the same resource be
in the subject position of one triple and the object position of another
makes it possible to find connections between triples, which is an
important part of RDF's power. We can therefore visualise triples as..."
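To make the subject/object point concrete, here is a minimal Python sketch that counts how often each resource appears in each position. The triples are illustrative stand-ins modeled on the description above (Bob as the subject of four triples, the Mona Lisa as the subject of one and the object of another); their exact wording is an assumption, not a copy of the Primer's example.

```python
from collections import Counter

# Illustrative triples only; the exact wording is assumed, not quoted
# from the Primer. Each triple is (subject, predicate, object).
triples = [
    ("Bob", "is a", "person"),
    ("Bob", "is a friend of", "Alice"),
    ("Bob", "is born on", "the 4th of July 1990"),
    ("Bob", "is interested in", "the Mona Lisa"),
    ("the Mona Lisa", "was created by", "Leonardo da Vinci"),
]

# Count appearances in the subject position and in the object position.
subject_counts = Counter(s for s, _, _ in triples)
object_counts = Counter(o for _, _, o in triples)

# Resources that are the subject of one triple and the object of another
# are the ones that connect triples to each other.
linking = sorted(set(subject_counts) & set(object_counts))

print(subject_counts["Bob"])   # how many triples have Bob as subject
print(linking)                 # resources that link triples together
```

The consistent capitalization matters here: with plain string comparison, "the Mona Lisa" and "the mona lisa" would count as two different resources.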
"The example above... an RDF graph" Move that paragraph before the
"Resources typically" paragraph, i.e. right after the example itself,
maybe in one of the green "NOTE" blocks.
In the note that begins
The RDF Data Model is described in this section in the form of an
"abstract syntax"
do "encoding" and "concrete RDF syntax" refer to the same thing? If so,
make that clearer. I think it would be better off to never use the word
"encoding," which people are more likely to associate with things like
UTF-8 vs. Latin 1, and instead use the term "concrete syntax"
consistently. The first time the Primer uses the phrase "concrete
syntax," a parenthesized phrase after it could say something like "(the
syntax used to represent triples stored in text files)", because as a
Primer this should provide more hints about the meaning of highly
technical phrases. These same issues come up in the paragraph of Section
5 beginning "Many different concrete syntaxes..."
"three types of RDF data that occur in triples" three types of RDF
resources that occur in triples
"The notion of IRI is a generalization of URI (Uniform Resource
Identifier)" To assume that someone who doesn't understand RDF (the
intended audience of the Primer) understands what URIs are and their
relationship to URLs is a huge, huge assumption. How about adding,
after the sentence with this, something like "The URLs (Uniform Resource
Locators) that people use as web addresses are one form of URI, with an
important difference: URIs are not necessarily locators that provide the
address of a resource; they are often merely identifiers that provide a
unique ID for a given resource. IRIs are a generalization of this
because..."
3.2 "RDF is agnostic about what the IRI stands for" Unlike section 5.1
("in this example foaf:Person stands for
<http://xmlns.com/foaf/0.1/Person>") I think that "stands for" is not
appropriate here. (After all, IRI stands for "Internationalized Resource
Identifier.") "Represents" or "identifies" would be better.
3.4 I don't think algebra variables are a very good analogy here. Those
are named things that may not have values, and blank nodes are unnamed
things that do have values.
Section 3.4 overall is a little too brief and abstract for an RDF
neophyte. Blank nodes are a difficult concept for people who are new to
RDF. Either don't cover them in the Primer or cover them a bit more. For
example, this section would greatly benefit from a new diagram similar
to the one in Figure 1 that includes the cypress tree.
Also: 'Resources such as the unidentified cypress tree are called "blank
nodes" in RDF.' The resource (the tree, in this case) is not called a
blank node. How about this: 'Resources without identifiers such as the
painting's cypress tree can be represented by "blank nodes" in RDF.'
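The distinction between the resource and the blank node that represents it can be sketched in a few lines of Python. This is only an illustration: the example.org IRIs and the "depicts" property are hypothetical, and the `_:x` label follows the Turtle convention quoted later in these comments.

```python
# Hypothetical IRIs, chosen for illustration only.
MONA_LISA = "http://example.org/mona-lisa"
DEPICTS = "http://example.org/depicts"
CYPRESS = "http://example.org/Cypress"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def is_blank(term: str) -> bool:
    """Return True for a blank node label (written _:label), not an IRI."""
    return term.startswith("_:")


# The tree itself has no identifier; the blank node _:x stands in for it
# so that two triples can still talk about the same thing.
triples = [
    (MONA_LISA, DEPICTS, "_:x"),
    ("_:x", RDF_TYPE, CYPRESS),
]

blank_nodes = {term for triple in triples for term in triple if is_blank(term)}
print(blank_nodes)
```

The label `_:x` only has meaning inside this one collection of triples; it is not a name that other data could use to refer to the tree.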
3.5 "does not specify a particular semantics" That's normative
spec-speak, not primer-speak, and should be reworded to be clearer to
beginners. A bit later, the "i.e." parenthetical remark after "RDF
provides no way to convey this semantic assumption" provides a good
model of connecting this high-level talk of semantics to the actual data
being discussed.
Section 1 said that "For example retrieving http://www.example.org/bob
could provide data about Bob," leading me to believe that this URI
represented the resource Bob. In the section on named graphs, the same
URI represents a named graph, not a person. I understand that this
doesn't invalidate the "For example" sentence--if it's the name of a
graph, retrieving it could still "provide data about Bob"--but I think
this can still confuse the RDF beginner, and recommend that the examples
in the section on named graphs use new IRIs that have not appeared in
the Primer before.
"In the example default (unnamed) graph below we see two triples that
have a graph name as subject:" Insert a sentence before this about why
someone would want to do this, e.g. "When you can reference a graph with
an IRI, you can create triples that provide metadata about that graph."
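A rough Python sketch of that idea follows: a dataset modeled as a mapping from graph name to a set of triples, with the default (unnamed) graph stored under None. The graph name and the example.org IRIs are hypothetical and deliberately new, in line with the recommendation above; the two Dublin Core property IRIs are real, but their use here is only illustrative.

```python
from typing import Dict, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical graph name and resource IRIs, used for illustration only.
GRAPH = "http://example.org/graphs/bob-data"
BOB = "http://example.org/people/bob#me"
ALICE = "http://example.org/people/alice#me"

FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"
DCT_PUBLISHER = "http://purl.org/dc/terms/publisher"
DCT_RIGHTS = "http://purl.org/dc/terms/rights"

# The key None holds the default (unnamed) graph.
dataset: Dict[Optional[str], Set[Triple]] = {
    None: {
        # Two triples whose subject is the name of a graph: metadata.
        (GRAPH, DCT_PUBLISHER, "http://example.org"),
        (GRAPH, DCT_RIGHTS, "http://creativecommons.org/licenses/by/3.0/"),
    },
    GRAPH: {
        (BOB, FOAF_KNOWS, ALICE),
    },
}


def metadata_about(graph_name: str) -> Set[Tuple[str, str]]:
    """Collect predicate-object pairs in the default graph that describe a named graph."""
    return {(p, o) for s, p, o in dataset[None] if s == graph_name}


print(metadata_about(GRAPH))
```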
"subsets of triples" doesn't make sense. How about "subsets of a dataset
[or collection] of triples"?
4. "For example, one can state that the IRI ex:friendOf can be used as a
property" The idea of this being an IRI will come as a complete surprise
to the reader, because the use of prefixes hasn't been discussed at all
yet. (Is a qname considered an IRI?) The original RDF Primer at
http://www.w3.org/TR/rdf-primer/ has a good paragraph beginning "The
full triples notation requires" that introduces this well. However it's
done, as a Primer this should explain any new syntax, such as the use of
namespace prefixes, before using that syntax.
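For readers of these comments, the prefix mechanism amounts to string concatenation, as in this small Python sketch. The foaf namespace is the one quoted from section 5.1 above; the namespace bound to ex: is an assumption made for the example.

```python
from typing import Dict

PREFIXES: Dict[str, str] = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "ex": "http://example.org/",  # assumed namespace for the ex: prefix
}


def expand(prefixed_name: str, prefixes: Dict[str, str] = PREFIXES) -> str:
    """Turn a prefixed name such as foaf:Person into the full IRI it abbreviates."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"no namespace declared for {prefixed_name!r}")
    return prefixes[prefix] + local


print(expand("foaf:Person"))
print(expand("ex:friendOf"))
```

Seen this way, ex:friendOf is shorthand for an IRI rather than an IRI in its own right, which is why the prefix declaration needs to be explained before the shorthand is used.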
"domain respectively range restrictions" domain and range restrictions,
respectively (The sentence with this is another example of assuming a
pre-existing, strong understanding of the relevant technical vocabulary
by the reader; the Primer really should have a few more sentences to
explain the use of rdfs:domain and rdfs:range, which is always a
difficult point with RDF beginners.)
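One way to show beginners what rdfs:domain and rdfs:range do is a sketch like the following, which applies the two RDFS rules directly: if a property has domain C, the subject of any triple using that property is an instance of C, and if it has range C, the object is. The ex: names and the sample data are assumptions, and this is a toy illustration rather than a full RDFS reasoner.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"

# Assumed vocabulary statements and data, written with prefixed names.
schema = [
    ("ex:friendOf", RDFS_DOMAIN, "ex:Person"),
    ("ex:friendOf", RDFS_RANGE, "ex:Person"),
]
data = [("ex:bob", "ex:friendOf", "ex:alice")]


def infer_types(schema: Iterable[Triple], data: Iterable[Triple]) -> Set[Triple]:
    """Derive rdf:type triples from domain and range statements."""
    domains: Dict[str, Set[str]] = {}
    ranges: Dict[str, Set[str]] = {}
    for prop, pred, cls in schema:
        if pred == RDFS_DOMAIN:
            domains.setdefault(prop, set()).add(cls)
        elif pred == RDFS_RANGE:
            ranges.setdefault(prop, set()).add(cls)

    inferred: Set[Triple] = set()
    for s, p, o in data:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))  # domain types the subject
        for cls in ranges.get(p, ()):
            inferred.add((o, RDF_TYPE, cls))  # range types the object
    return inferred


print(sorted(infer_types(schema, data)))
```

The point that tends to surprise newcomers is visible here: domain and range do not reject data, they add conclusions about it.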
After Example 2 add something like this, because the idea of (and value
of!) properties as subjects or objects in triples has not been covered
at all up to this point and often comes as a surprise to people with an
object-oriented background: "Note that, while <is a friend of> is a
property typically used as the predicate of a triple (as it was in
Example 1), properties like this are themselves resources that can be
described by triples or provide values in the descriptions of other
resources. In this example, <is a friend of> is the subject of triples
that assign type, domain, and range values to it, and it's the object of
a triple that describes something about the <is a good friend of>
property."
"RDFa (for HTML embedding)" I always think it's a shame that people
think that RDFa is only for use with HTML. It can be very useful with
other kinds of XML as well; see
http://www.devx.com/semantic/Article/42543 . I would love to see the
several references to this say "for HTML and XML embedding."
Section 5.1 is more like a quick reference of Turtle syntax than a
Primer, because it covers so much so quickly. Readers who are new to RDF
(the intended audience of this document) will find it confusing. A brief
introduction to N-Triples before the Turtle part would make the Turtle
part much easier to understand, because then the reader will understand
that the use of angle brackets around full IRIs, quotes around literals,
and a period after each triple are the most important parts of the
syntax and that everything else in Turtle is just a syntactical convenience.
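The three N-Triples conventions named above are simple enough to show in a short Python sketch. The function names are hypothetical, blank nodes, datatypes, and language tags are left out, and only the most common escapes are handled, so this is an illustration of the line format rather than a complete serializer.

```python
def escape_literal(text: str) -> str:
    """Escape backslashes, double quotes, and line breaks inside a literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples_line(subject: str, predicate: str, obj: str,
                     obj_is_literal: bool = False) -> str:
    """Write one triple: <IRI> <IRI> <IRI> or "literal", then a period."""
    obj_text = f'"{escape_literal(obj)}"' if obj_is_literal else f"<{obj}>"
    return f"<{subject}> <{predicate}> {obj_text} ."


print(to_ntriples_line(
    "http://example.org/bob#me",
    "http://xmlns.com/foaf/0.1/knows",
    "http://example.org/alice#me",
))
print(to_ntriples_line(
    "http://example.org/bob#me",
    "http://xmlns.com/foaf/0.1/name",
    "Bob",
    obj_is_literal=True,
))
```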
"the predicate-object part of triples with <http://example.org/bob#me>
as subject" the predicate-object part of triples that have
<http://example.org/bob#me> as their subject
"The semicolons at the end of lines 9-11 indicate that the set is not
yet complete. A period is used to signal the end of a Turtle statement."
The use of "set" here is confusing. Set of what? I know that it refers
to predicate-object pairs associated with a common subject, but someone
new to Turtle might think that it's some specific Turtle construct. I
think it would be better to say "The semicolons at the end of lines 9-11
each indicate that the predicate-object pair that follows them is part of
a new triple that uses the most recent subject shown in the data--in
this case, <bob#me>."
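The suggested wording can be checked against a small Python sketch of what the semicolon shorthand expands to. The function name and the sample predicate-object pairs are assumptions; the subject is the one from the Primer's Turtle example.

```python
from typing import List, Tuple


def expand_pairs(subject: str,
                 pairs: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Each predicate-object pair becomes a full triple that reuses the subject."""
    return [(subject, predicate, obj) for predicate, obj in pairs]


# Assumed predicate-object pairs, as they might follow the subject in Turtle.
triples = expand_pairs(
    "<http://example.org/bob#me>",
    [
        ("a", "foaf:Person"),
        ("foaf:knows", "<http://example.org/alice#me>"),
        ("foaf:topic_interest", "wd:Q12418"),
    ],
)

for triple in triples:
    print(" ".join(triple), ".")
```

Three pairs after one subject yield three separate triples, which is all the semicolon means.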
'The term _:x is a blank node. It represents some unnamed tree depicted
in the Mona Lisa painting and belonging to the "Cypress" class.' The
term _:x is a blank node. It represents an unnamed resource depicted in
the Mona Lisa painting that is an instance of the "Cypress" class. [It's
safer to say that it represents a resource, not a tree, and the idea of
"belonging" here is not quite accurate.]
=== copyediting ===
There are several places where "for example" should have a comma after
it: "For example retrieving", "For example a dataset about paintings",
"For example 'Léonard de Vinci'".
3.1 " <subject> <predicate> <object>" has an extra space after <subject>
"multiple triples, for example" [em dash not comma]
"allow writing literals" allow writing of literals
"markup webpages": "mark up" should be two words when used as a verb.
I'd say "web pages" as two words as well.
"Library of Congress published its..." The Library of Congress published its
The phrase "Using the Web Ontology Language" would make sense, but
"Using the OWL" in section 4 does not. Just say "Using OWL."
"a RDF vocabulary" an RDF vocabulary
"the reader is referred to the Turtle document" see the Turtle document
"the reader can find for each RDF syntax corresponding" the reader can
find, for each RDF syntax, corresponding
"cypress-tree" cypress tree
"cater for" cater to [although that could just be a British vs. American
usage thing]
"semantics which is specified in the RDF" semantics which are specified
in the RDF
"Wikidata, a free, collaborative..." end that bullet point with a period
like the other bullets in that list.
Thanks,
Bob DuCharme
Received on Monday, 30 December 2013 23:25:40 UTC