RE: RDF Primer Typo

From: Markus Lanthaler <markus.lanthaler@gmx.net>
Date: Wed, 19 Feb 2014 13:07:06 +0100
To: "'Public RDF comments list'" <public-rdf-comments@w3.org>
Cc: "'Yves Raimond'" <yves.raimond@bbc.co.uk>, "'RDF WG'" <public-rdf-wg@w3.org>
Message-ID: <00e101cf2d6b$1d3b9b70$57b2d250$@lanthaler@gmx.net>

You might also have another look at all titles. I think the capitalization is inconsistent. Some examples:

Why *u*se RDF?
RDF Data Model  --> this should probably be "RDF's Data Model"
Blank *n*odes
Multiple *g*raphs
RDF *V*ocabularies
Writing RDF *g*raphs
Semantics of RDF *G*raphs
RDF *D*ata
More *I*nformation
Normative *r*eferences

Btw, does it make sense to distinguish between normative and informative references in an (informative) note? I don't think so.

A few more remarks/suggestions since I'm already looking at the document and think it will be one of the most important documents to attract new users:

I wouldn't start the introduction with a note which more or less repeats the abstract. I would instead add the reference to RDF11-NEW after the first "real" sentence.

Nitpicking but I don't like it when headers directly follow each other such as 3. RDF Data Model -> 3.1 Triples. You could add a simple sentence at the beginning of section 3 saying that RDF's underlying data model is a (directed) graph.

Section 3.1
Here are examples of RDF triples <Example> The example above does *not* constitute actual RDF syntax. I would find it better if the first sentence said so already and we got rid of the note afterwards. Maybe something like "Triples allow us to express information such as:" and then set the title of the example to "Information expressed in the form of triples (pseudo-code, not actual RDF)" or something like that.

Concrete RDF syntax is introduced later in Sec. 5. -> syntaxes are!?

The reference to SPARQL is probably too early here. I doubt a reader will wonder how to query the data at this stage. I'd just remove it. It's already included again later on.

I noticed that you define quite a number of terms but never reference them. I was for example wondering if resource was already defined in section 3.1 and had to manually go back to the introduction to find that out. Especially for newcomers I think it would be very helpful to explicitly cross-reference all these terms.

Section 3.2
I don't think it's necessary to highlight the IRIs so much by marking them as examples. Just add them to the text. Since they are blue and underlined, they are visible enough. If you want them to stick out even more, format them as <code> similar to how you did with dbpedia.org at the end of that section.

Section 3.3
Literals are basic values that are not IRIs -> neither are blank nodes!
Shouldn't "datatype" and "language tag" be definitions instead of just being formatted in italic?
Does the note in that section really belong in the primer? I don't care much but probably it should be removed. It's in RDF11-NEW.

Section 3.4
... without bothering to use an identifier. Hmm... I think it would be better to say *global* identifier because we have blank node identifiers in concrete syntaxes.
"subject and object position" and similar things in previous sections shouldn't be formatted like a definition. Just transform "subject" and "object" to references, that suffices IMHO.

Section 3.5
"RDF provides a mechanism to group RDF statements in multiple graphs and associate each graph with an IRI" -> ... and associate graphs with an IRI (i.e., remove "each"; default graph) ??
"Multiple graphs in an RDF document constitute an RDF dataset" -> "A collection of multiple RDF graphs constitute an RDF dataset"
Why does "RDF dataset" link to Concepts whereas no other definition does?
For example, the statements in the first example -> in "Example 1"??
I would suggest somehow including http://example.org/bob in example 2
The IRI associated with the graph is called the "graph name" -> graph name should be formatted as definition
Shall we really say that the graphs are *identified* with those IRIs??

Section 4
I would find it better to spell the variables like C, P etc out. So Class, Property, Instance or <a class>. Overall I have to say though that the table doesn't add much value and we wouldn't lose much by just removing it.
Instead, the example (in prose) from the second paragraph could be included in triple form (basically example 5, but using the same IRIs as in the prose)

List of vocabularies: Add Open Graph Protocol (http://ogp.me/), millions more people will know it than Dublin Core and SKOS. Thus, move schema.org and OGP to the beginning of that short list.

"the more vocabulary ITIs are reused" -> s/ITIs/IRIs/
s/so-called netwrok effect/so-called network effect/

Section 5.1
Could we make the line numbers gray or something (I found the color #999 works quite nicely in some of my other docs) to make it clearer that these are not part of the syntax?

Section 5.1.2
"Turtle introduces a number of syntax shortcuts" -> syntactic shortcuts??
Really? "support for namespaces"? Support for namespace prefixes, right?
I would remove the subsection "Representation of blank nodes", doesn't add much and isn't explained for the other syntaxes. You already say that "This section gives by no means a full account of the Turtle syntax".

Section 5.1.3
"The two triples specified on lines 30-32 are not part of any named graph" -> 27-29

Overall, I'm wondering if it really makes sense to explain the "Turtle family of RDF languages" in so much detail. The only questions the reader will ask are: why are there four, and how do I know which one to choose? These questions are not answered at all IMO. I would like to shrink this section considerably.

The three pages (when printed) could probably easily be reduced to 1, max 1.5 pages without losing much. The only difference between N-Triples and N-Quads is the fourth component. That could be said in one sentence and be shown with a single example (default graph). Basically the same is true for TriG and Turtle.
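
To make the "fourth component" point concrete, here is a minimal Python sketch (not text proposed for the primer). The helper names to_ntriples and to_nquads are made up for illustration, and the sketch only handles IRIs, no literals or blank nodes. The IRIs are the ones used in the examples further down.

```python
def to_ntriples(s, p, o):
    """Serialize one triple of IRIs as an N-Triples line."""
    return f"<{s}> <{p}> <{o}> ."


def to_nquads(s, p, o, graph=None):
    """Serialize one statement as an N-Quads line.

    graph=None means the default graph; the line is then identical
    to the N-Triples line. Otherwise the graph name is added as a
    fourth component.
    """
    if graph is None:
        return to_ntriples(s, p, o)
    return f"<{s}> <{p}> <{o}> <{graph}> ."


bob = "http://example.org/bob#me"
knows = "http://xmlns.com/foaf/0.1/knows"
alice = "http://example.org/alice#me"

print(to_ntriples(bob, knows, alice))
print(to_nquads(bob, knows, alice))                          # default graph
print(to_nquads(bob, knows, alice, "http://example.org/bob"))  # named graph
```

The first two lines printed are the same, which is exactly why one sentence and one example would be enough to explain the relationship.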

Section 5.2
"JSON-LD also provides a way to serialize RDF datasets through the use of the @graph keyword." -> can be removed, this is already clear from the very first sentence
"Each JSON object corresponds to an RDF resource" -> JSON object *in the example above* (this is not generally true)
I don't really like the keyword aliasing in the example, but if you alias it, please use "url" instead of "uri". Also, please include the full absolute IRIs instead of bob#me, alice#me, and wd:Q12418. Rename "born" to "birthdate", "friends" to "knows" and get rid of the array (also for interest) and type-coerce "creator" to @id as well. So, in the end, the example should look as follows:

01    {
02      "@context": "example-context.json",
03      "url": "http://example.org/bob#me",
04      "type": "Person",
05      "birthdate": "1990-07-04",
06      "knows": "http://example.org/alice#me",
07      "interest": {
08        "url": "http://www.wikidata.org/entity/Q12418",
09        "title": "Mona Lisa",
10        "subject_of": "http://data.europeana.eu/item/04802/243FA8618938F4117025F17A8B813C5F9AA4D619",
11        "creator": "http://dbpedia.org/resource/Leonardo_da_Vinci"
12      }
13    }

Unless someone thinks otherwise, I would prefer it this way though (to make it more recognizable as JSON-LD and to show why there's a @type: @id in the context; keyword aliasing is really an advanced feature IMO):

01    {
02      "@context": "example-context.json",
03      "@id": "http://example.org/bob#me",
04      "@type": "Person",
05      "birthdate": "1990-07-04",
06      "knows": "http://example.org/alice#me",
07      "interest": {
08        "@id": "http://www.wikidata.org/entity/Q12418",
09        "title": "Mona Lisa",
10        "subject_of": "http://data.europeana.eu/item/04802/243FA8618938F4117025F17A8B813C5F9AA4D619",
11        "creator": "http://dbpedia.org/resource/Leonardo_da_Vinci"
12      }
13    }

"The @context key on line 2 points to a JSON document" -> to a *JSON-LD context document*

Since the context is so crucial for all of this, I think it is important to include it directly in this section. It would also be good to mention that it can be directly embedded into the document instead.
Here's an optimized version of the context (without the keyword aliases):

01    {
02      "@context": {
03        "foaf": "http://xmlns.com/foaf/0.1/",
04        "Person": "foaf:Person",
05        "interest": "foaf:topic_interest",
06        "knows": {
07          "@id": "foaf:knows",
08          "@type": "@id"
09        },
10        "birthdate": {
11          "@id": "http://schema.org/birthDate",
12          "@type": "http://www.w3.org/2001/XMLSchema#date"
13        },
14        "dcterms": "http://purl.org/dc/terms/",
15        "title": "dcterms:title",
16        "creator": "dcterms:creator",
17        "subject_of": {
18          "@reverse": "dcterms:subject",
19          "@type": "@id"
20        }
21      }
22    }
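
As a quick sanity check of what this context does, here is a rough Python sketch that resolves each term to a full IRI. It is a deliberate simplification and not the JSON-LD expansion algorithm from the spec (a real processor does much more); the function names expand_iri and describe_term are made up for illustration.

```python
# The context from above, written as a Python dict.
CONTEXT = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "Person": "foaf:Person",
    "interest": "foaf:topic_interest",
    "knows": {"@id": "foaf:knows", "@type": "@id"},
    "birthdate": {
        "@id": "http://schema.org/birthDate",
        "@type": "http://www.w3.org/2001/XMLSchema#date",
    },
    "dcterms": "http://purl.org/dc/terms/",
    "title": "dcterms:title",
    "creator": "dcterms:creator",
    "subject_of": {"@reverse": "dcterms:subject", "@type": "@id"},
}


def expand_iri(value, ctx):
    """Expand a compact IRI such as 'foaf:knows' using a prefix from ctx.

    Absolute IRIs ('http://...') and keywords ('@id') are returned as-is.
    """
    prefix, sep, suffix = value.partition(":")
    if sep and not suffix.startswith("//") and isinstance(ctx.get(prefix), str):
        return ctx[prefix] + suffix
    return value


def describe_term(term, ctx):
    """Return the IRI a term maps to, whether it is a reverse property,
    and the type coercion (if any) applied to its values."""
    definition = ctx[term]
    if isinstance(definition, str):
        return {"iri": expand_iri(definition, ctx), "reverse": False, "type": None}
    reverse = "@reverse" in definition
    iri = definition["@reverse"] if reverse else definition["@id"]
    coercion = definition.get("@type")
    return {
        "iri": expand_iri(iri, ctx),
        "reverse": reverse,
        "type": expand_iri(coercion, ctx) if coercion else None,
    }


for term in ("Person", "knows", "birthdate", "subject_of"):
    print(term, describe_term(term, CONTEXT))
```

The output for "knows" shows the "@type": "@id" coercion, which is why the string on line 06 of the example is read as an IRI and not as a plain literal. For "subject_of" the reverse flag is set, meaning the Europeana item is the subject of the dcterms:subject triple, not the Mona Lisa.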

Section 7
A large amount of RDF data is available as part of the Linked Data [LINKED-DATA] cloud. -> the reference doesn't help here. Either remove it or link to http://lod-cloud.net/

The paragraph and example of sameAs don't belong here IMHO. Either move them to the previous section (Semantics) or simply remove them.

Section 8
Could be removed completely IMO.

Appendix B
Could be removed as well or factored into Acknowledgements

Appendix C
I don't really know what to do with this. Since most of the material has been moved to section 5 it might make sense to move RDFa and RDF/XML there as well. Or is there a specific reason why they have been banished to the appendix? :-) I think we can get rid of the JSON-LD examples there. The JSON-LD spec explains this already (well enough I think).

Finally, just like the figures, I think all examples should have a title.

I understand that these are a lot of changes and that we are running out of time but I find it *very* important to make this document as good as we can (it's quite good already btw.). If you need help, just tell me. I'm more than happy to help editing the document.

Thanks for the great work,

Markus Lanthaler
