- From: Bob DuCharme <bob@snee.com>
- Date: Mon, 30 Dec 2013 18:25:33 -0500
- To: public-rdf-comments@w3.org
There's a lot of good stuff in it, but because it's a Primer, I assume that its intended audience is people who are new to RDF, and the document often assumes too much about the reader's knowledge of technical specification vocabulary.

I've divided up my comments into two lists: comments about substance followed by picky copyediting suggestions. Suggestions often show a quoted phrase from the Primer followed by a suggested revision. For example,

"cypress-tree"
cypress tree

is a suggestion to replace "cypress-tree" with "cypress tree".

=== substantive (to varying degrees) ===

Section 1. says "The Resource Description Framework (RDF) is a framework for describing info about resources in the World Wide Web." 1.1 says that "An IRI identifies a web resource" and then references http://www.ietf.org/rfc/rfc3987.txt, but I couldn't find anything in that RFC about IRIs being limited to the identification of web resources. I know that URLs define web resources, but if I assign an IRI to the chair I'm sitting in, couldn't I use RDF to state facts about the chair's location, manufacturer, etc., without this having anything to do with the web? Or am I misunderstanding something? I always thought that we could assign IRIs to absolutely anything and then use RDF to describe them; limiting its use to web-based resources really limits its power.

3.1

"Resources typically occur in multiple triples, for example Bob and the Mona Lisa painting in the examples above."

The Mona Lisa resource only occurs in one triple above this sentence, not two, unless you want the reader to assume case insensitivity in the sample data, which I think is a bad idea. I would capitalize <the Mona Lisa> consistently and then explicitly point out how the same resource can appear in the subject of one triple and the object of another, which is a new idea at this point of the Primer. (A wonderful new idea!)
After normalizing the capitalization, the sentence might be better off like this: "The same resource is often referenced in multiple triples. In the example above, Bob is the subject of four triples, and the Mona Lisa is the subject of one and the object of another. This ability to have the same resource be in the subject position of one triple and the object position of another makes it possible to find connections between triples, which is an important part of RDF's power. We can therefore visualise triples as..."

"The example above... an RDF graph"

Move that paragraph before the "Resources typically" paragraph, i.e. right after the example itself, maybe in one of the green "NOTE" blocks.

In the note that begins

The RDF Data Model is described in this section in the form of an "abstract syntax"

do "encoding" and "concrete RDF syntax" refer to the same thing? If so, make that clearer. I think it would be better off to never use the word "encoding," which people are more likely to associate with things like UTF-8 vs. Latin 1, and instead use the term "concrete syntax" consistently. The first time the Primer uses the phrase "concrete syntax," a parenthesized phrase after it could say something like "(the syntax used to represent triples stored in text files)", because as a Primer this should provide more hints about the meaning of highly technical phrases. These same issues come up in the paragraph of Section 5 beginning "Many different concrete syntaxes..."

"three types of RDF data that occur in triples"
three types of RDF resources that occur in triples

"The notion of IRI is a generalization of URI (Uniform Resource Identifier)"

To assume that someone who doesn't understand RDF (the intended audience of the Primer) understands what URIs are and their relationship to URLs is a huge, huge assumption.
How about adding, after the sentence with this, something like "The URLs (Uniform Resource Locators) that people use as web addresses are one form of URI, with an important difference: URIs are not necessarily locators that provide the address of a resource; they are often merely identifiers that provide a unique ID for a given resource. IRIs are a generalization of this because..."

3.2

"RDF is agnostic about what the IRI stands for"

Unlike section 5.1 ("in this example foaf:Person stands for <http://xmlns.com/foaf/0.1/Person>") I think that "stands for" is not appropriate here. (After all, IRI stands for "Internationalized Resource Identifier.") "Represents" or "identifies" would be better.

3.4

I don't think algebra variables are a very good analogy here. Those are named things that may not have values, and blank nodes are unnamed things that do have values.

Section 3.4 overall is a little too brief and abstract for an RDF neophyte. Blank nodes are a difficult concept for people who are new to RDF. Either don't cover them in the Primer or cover them a bit more. For example, this section would greatly benefit from a new diagram similar to the one in Figure 1 that includes the cypress tree.

Also:

'Resources such as the unidentified cypress tree are called "blank nodes" in RDF.'

The resource (the tree, in this case) is not called a blank node. How about this:

'Resources without identifiers such as the painting's cypress tree can be represented by "blank nodes" in RDF.'

3.5

"does not specify a particular semantics"

That's normative spec-speak, not primer-speak, and should be reworded to be clearer to beginners. A bit later, the "i.e." parenthetical remark after "RDF provides no way to convey this semantic assumption" provides a good model of connecting this high-level talk of semantics to the actual data being discussed.
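
To make the blank node point from 3.4 a bit more concrete, here is a minimal Python sketch (not taken from the Primer) that models triples as plain tuples. The BlankNode class, the helper names, and the http://example.org/ IRIs are all assumptions made up for illustration. The only point is that the cypress tree has no IRI of its own yet still has a definite description, which is where the algebra-variable analogy falls short.

```python
from itertools import count

# Hypothetical mini-model: a triple is a (subject, predicate, object) tuple.
# Named resources are IRI strings; a blank node is an object with no IRI.
_labels = count(1)


class BlankNode:
    """A resource that is present in the graph but has no IRI."""

    def __init__(self):
        # The label is only a local handle for printing, not an identifier.
        self.label = "_:b%d" % next(_labels)

    def __repr__(self):
        return self.label


EX = "http://example.org/"  # assumed example namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
MONA_LISA = EX + "MonaLisa"

tree = BlankNode()  # the unnamed cypress tree

triples = [
    (MONA_LISA, EX + "depicts", tree),   # blank node in object position
    (tree, RDF_TYPE, EX + "Cypress"),    # same blank node in subject position
]


def is_named(resource):
    """True if the resource is identified by an IRI string."""
    return isinstance(resource, str)


def describe(node, graph):
    """Return the (predicate, object) pairs of triples with node as subject."""
    return [(p, o) for s, p, o in graph if s == node]


for s, p, o in triples:
    print(s, p, o)
```

Here describe(tree, triples) returns the tree's type even though is_named(tree) is false, so the node is unnamed but far from valueless.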
Section 1 said that "For example retrieving http://www.example.org/bob could provide data about Bob," leading me to believe that this URI represented the resource Bob. In the section on named graphs, the same URI represents a named graph, not a person. I understand that this doesn't invalidate the "For example" sentence--if it's the name of a graph, retrieving it could still "provide data about Bob"--but I think this can still confuse the RDF beginner, and recommend that the examples in the section on named graphs use new IRIs that have not appeared in the Primer before.

"In the example default (unnamed) graph below we see two triples that have a graph name as subject:"

Insert a sentence before this about why someone would want to do this, e.g. "When you can reference a graph with an IRI, you can create triples that provide metadata about that graph."

"subsets of triples" doesn't make sense. "subsets of a dataset [or collection] of triples"?

4.

"For example, one can state that the IRI ex:friendOf can be used as a property"

the idea of this being an IRI will come as a complete surprise to the reader, because the use of prefixes hasn't been discussed at all yet. (Is a qname considered an IRI?) The original RDF Primer at http://www.w3.org/TR/rdf-primer/ has a good paragraph beginning "The full triples notation requires" that introduces this well. However it's done, as a Primer this should explain any new syntax, such as the use of namespace prefixes, before using that syntax.

"domain respectively range restrictions"
domain and range restrictions, respectively

(The sentence with this is another example of assuming a pre-existing, strong understanding of the relevant technical vocabulary by the reader; the Primer really should have a few more sentences to explain the use of rdfs:domain and rdfs:range, which is always a difficult point with RDF beginners.)

After Example 2 add something like this, because the idea of (and value of!)
properties as subjects or objects in triples has not been covered at all up to this point and often comes as a surprise to people with an object-oriented background: "Note that, while <is a friend of> is a property typically used as the predicate of a triple (as it was in Example 1), properties like this are themselves resources that can be described by triples or provide values in the descriptions of other resources. In this example, <is a friend of> is the subject of triples that assign type, domain, and range values to it, and it's the object of a triple that describes something about the <is a good friend of> property."

"RDFa (for HTML embedding)"

I always think it's a shame that people think that RDFa is only for use with HTML. It can be very useful with other kinds of XML as well; see http://www.devx.com/semantic/Article/42543 . I would love to see the several references to this say "for HTML and XML embedding."

Section 5.1 is more like a quick reference of Turtle syntax than a Primer, because it covers so much so quickly. Readers who are new to RDF (the intended audience of this document) will find it confusing. A brief introduction to N-Triples before the Turtle part would make the Turtle part much easier to understand, because then the reader will understand that the use of angle brackets around full IRIs, quotes around literals, and a period after each triple are the most important parts of the syntax and that everything else in Turtle is just a syntactical convenience.

"the predicate-object part of triples with <http://example.org/bob#me> as subject"
the predicate-object part of triples that have <http://example.org/bob#me> as their subject

"The semicolons at the end of lines 9-11 indicate that the set is not yet complete. A period is used to signal the end of a Turtle statement."

The use of "set" here is confusing. Set of what?
I know that it refers to predicate-object pairs associated with a common subject, but someone new to Turtle might think that it's some specific Turtle construct. I think it would be better to say "The semicolons at the end of lines 9-11 each indicate that the predicate-object pair that follows them is part of a new triple that uses the most recent subject shown in the data--in this case, <bob#me>."

'The term _:x is a blank node. It represents some unnamed tree depicted in the Mona Lisa painting and belonging to the "Cypress" class.'
The term _:x is a blank node. It represents an unnamed resource depicted in the Mona Lisa painting that is an instance of the "Cypress" class.

[It's safer to say that it represents a resource, not a tree, and the idea of "belonging" here is not quite accurate.]

=== copyediting ===

There are several places where "for example" should have a comma after it: "For example retrieving", "For example a dataset about paintings", "For example 'Léonard de Vinci'".

3.1 " <subject> <predicate> <object>" has an extra space after <subject>

"multiple triples, for example" [em dash not comma]

"allow writing literals"
allow writing of literals

"markup webpages": "mark up" should be two words when used as a verb. I'd say "web pages" as two words as well.

"Library of Congress published its..."
The Library of Congress published its

The phrase "Using the Web Ontology Language" would make sense, but "Using the OWL" in section 4 does not. Just say "Using OWL."

"a RDF vocabulary"
an RDF vocabulary

"the reader is referred to the Turtle document"
see the Turtle document

"the reader can find for each RDF syntax corresponding"
the reader can find, for each RDF syntax, corresponding

"cypress-tree"
cypress tree

"cater for"
cater to [although that could just be a British vs. American usage thing]

"semantics which is specified in the RDF"
semantics which are specified in the RDF

"Wikidata, a free, collaborative..."
end that bullet point with a period like the other bullets in that list.

Thanks,

Bob DuCharme
Received on Monday, 30 December 2013 23:25:40 UTC