A few comments on Primer (esp re Semantics) from Sandro Hawke on 2014-01-31 (public-rdf-wg@w3.org from January 2014)

From: Sandro Hawke <sandro@w3.org>
Date: Fri, 31 Jan 2014 15:32:25 -0500
CC: RDF WG <public-rdf-wg@w3.org>
Message-ID: <52EC0859.7000007@w3.org>
I recently had a chance to read through the Primer, and mostly it's 
great but there were a few things that bugged me.    Hopefully they're 
not too hard to fix.

1.   The use of the word "informative" in the first paragraph is a 
problem.   I don't think most people have any idea that in 
standards-speak "informative" has a different meaning than in normal 
English.  So to most people, that bit will just sound kind of dumb.   (I 
think it's a bad idea to ever use that word when we have a perfectly 
good alternative in "non-normative", but it's particularly problematic 
in the beginning of a primer.)

I was thinking something like, "This document is a companion to a set of 
W3C standards, which are listed at the end of this introduction.  This 
document itself is not a standard, though."

2.  With my naive reader hat on, I was still feeling pretty confused at 
the end of 3.5, badly wanting a diagram.   Maybe move the one from later 
up to this point?   Not a show-stopper.

3.  Typo in 5.1, "habe"

4.  In 5.2 I think it's important to introduce N-Triples by saying 
it's a subset of Turtle.   That's the most important thing about it.

5.  In 5.2 I think we have a chance to push back against the biggest 
problem in RDF deployment.    Under RDF/XML I suggest:

  delete:  RDF/XML was the only normative syntax for RDF when RDF 1.0 
was published in 2004.

  add: When RDF was originally developed in the late 1990s, this was its 
only syntax, and some people still call this syntax "RDF". In 2001, a 
precursor to Turtle called "N3" was proposed, and gradually the other 
syntaxes listed here have been adopted and standardized.

The main point is that for many years, all the way back to 1997 (I 
think, 1999 at least), it wasn't so much the "only normative syntax", it 
was the ONLY syntax.    .rdf files are RDF/XML. Professionals in this 
field still call RDF/XML "RDF".    We need to help newcomers understand 
this happens and what it means when it does.

6.  This is the hard one.   I was eagerly reading the document up to 
section 6. Semantics, just thinking like a programmer, and nodding in 
agreement as everything up to this point made perfect sense. Then I got 
hit with this stuff about "formal model-theoretic semantics" and 
"truth-preserving conditions", and it suddenly just seemed like 
handwaving and obscure "semantics" stuff I'd never care about.

I think this is a great place to explain to the RDF community WHY there 
are formal semantics and who might want to read rdf11-mt.   As the text 
is now I'm afraid it just feeds the feeling that rdf-mt is gobbledegook 
no one needs to pay attention to, unless they're working on a PhD.

Here's a strawman to show the kind of text I think we need:

    An overarching goal in the use of RDF is to be able to automatically
    merge useful information from multiple sources to form a larger
    collection that is still coherent and useful.   As a starting point
    for this merging, all the information is conveyed in the same simple
    style, subject-predicate-object triples, as described above.    To
    keep the information coherent, however, we need more than just a
    standard syntax; we also need agreement about the semantics of these
    triples.

    By this point in the Primer, the reader is likely to have an
    intuitive grasp of the semantics of RDF.  (1) The IRIs used to name
    the subject, predicate, and object are "global" in scope, naming the
    same thing each time they are used.  (2) Each triple is "true"
    exactly when the predicate relation actually exists between the
    subject and the object.  (3)  An RDF graph is "true" exactly when
    all the triples in it are "true".    These notions, and others, are
    specified with mathematical precision in the RDF Semantics document
    [RDF11-MT
    <https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-primer/index.html#bib-RDF11-MT>].

    One of the benefits of RDF having these declarative semantics is
    that systems can make logical inferences.  That is, given a certain
    set of input triples which they accept as true, systems can in some
    circumstances deduce that other triples must, logically, also be
    true. We say the first set of triples "entails" the additional
    triples. These systems, called Reasoners, can also sometimes deduce
    that the given input triples contradict each other.

    Given the flexibility of RDF, where new vocabularies can be created
    when people want to use new concepts, there are many different kinds
    of reasoning one might want to do.  When a specific kind of
    reasoning seems to be useful in many different applications, it can
    be documented as an "entailment regime". Several entailment regimes
    are specified in RDF Semantics.  For a technical description of
    some other entailment regimes and how to use them with SPARQL, see
    SPARQL 1.1 Entailment Regimes
    http://www.w3.org/TR/sparql11-entailment/ .   Note that some
    entailment regimes are fairly easy to implement and reasoning can be
    done quickly, while others require very sophisticated techniques to
    implement efficiently.  Some entailment regimes have been proven to
    be intractable, but they might still be useful for small data sets.

    ... then go into the rdfs:domain example ...

I'm not attached to any of that wording -- I hope someone else can do 
better -- but hopefully you see how I'm trying to convey things people 
really need to know to operate in the RDF space without making a lot of 
assumptions about what they already know.   I think we have to do 
something like that.
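
To make the merging and entailment ideas in the strawman a bit more 
concrete, here is a rough Python sketch. It is only an illustration, not 
text proposed for the Primer: the example.org IRIs and the helper names 
(merge, apply_domain_rule) are made up, graphs are modelled as plain sets 
of tuples, and blank nodes are assumed not to occur. The one inference 
shown is the rdfs:domain pattern (rule rdfs2 in RDF Semantics), applied 
in a single pass.

```python
# Triples are (subject, predicate, object) tuples of IRI strings.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"
EX = "http://example.org/"  # hypothetical namespace, for illustration only


def merge(*graphs):
    """Merge graphs by set union (assumes no blank nodes are involved)."""
    return set().union(*graphs)


def apply_domain_rule(graph):
    """Return the graph plus triples entailed by rdfs:domain, in one pass.

    If (P rdfs:domain C) and (S P O) are both present,
    then (S rdf:type C) is added.
    """
    domains = {(s, o) for (s, p, o) in graph if p == RDFS_DOMAIN}
    inferred = {
        (s, RDF_TYPE, cls)
        for (s, p, o) in graph
        for (prop, cls) in domains
        if p == prop
    }
    return graph | inferred


# Two independent sources: one carries data, the other a vocabulary statement.
source_a = {(EX + "bob", EX + "knows", EX + "alice")}
source_b = {(EX + "knows", RDFS_DOMAIN, EX + "Person")}

merged = merge(source_a, source_b)
closure = apply_domain_rule(merged)
new_triples = closure - merged
print(new_triples)
```

Because IRIs are global in scope, the two sources can simply be unioned, 
and the triples from the first source then entail that ex:bob has type 
ex:Person even though neither source says so directly.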

With these changes, the document will be perfect.    :-)     Keep up the 
good work.

       -- Sandro
Received on Friday, 31 January 2014 20:32:26 UTC