Comments on 17 December 2013 WD of RDF 1.1 Primer from Thomas Baker on 2014-01-08 (public-rdf-wg@w3.org from January 2014)

From: Thomas Baker <tom@tombaker.org>
Date: Tue, 7 Jan 2014 23:19:21 -0500
To: RDF Working Group <public-rdf-wg@w3.org>
Message-ID: <20140108041921.GA75062@julius.local>
The Primer [1] is taking shape nicely!  

Bob DuCharme and Antoine Isaac have already raised a lot of excellent points
[2,3].  The comments below are divided into comments of substance, comments
specifically about the NOTEs, and copyediting suggestions.

I agree with Antoine that we should take the opportunity to help make it
perfect...! :-)

Tom

[1] https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-primer/index.html#
[2] http://lists.w3.org/Archives/Public/public-rdf-comments/2013Dec/0124.html
[3] http://lists.w3.org/Archives/Public/public-rdf-comments/2014Jan/0006.html

======================================================================
Comments of substance

--  "The Resource Description Framework (RDF) is a framework for describing
    information about resources..."

    This definition uses the same three words as what it defines -- Resource,
    Description, and Framework.  How about:

        The Resource Description Framework (RDF) is a language for expressing
        information about things.

--  "...about resources in the World Wide Web"

    Do we want to project the message that RDF is really just about describing
    Web pages and videos?

    Paragraph three starts with "In particular, RDF can be used to publish and
    interlink data on the Web" -- and IMO paragraph three is the right place to
    make this connection.
    
    I suggest dropping the second half of the first sentence and substituting
    it with a sentence or two to the effect that RDF is a language for data
    which uses Web addresses as globally defined names for things and leverages
    those global names to enable data to be connected across a multitude of
    distributed, independently maintained data sources.  I could propose more
    polished wording if desired.

--  "framework" -- or "language"?

    I suspect the average reader will have no concept of "framework".  But if
    RDF were called a "language" in the first paragraph, it could also be
    called a language further down.  Specifically:

        RDF allows us to make statements about resources
    
    could be 

        RDF provides a language for making statements about resources

--  "An RDF statement represents a relationship between two resources."

    As the text goes on to say that the subject and object _represent_ the two
    resources being related and the predicate _represents_ the nature of their
    relationship, it seems more precise to say:

        An RDF statement states a relationship between two resources.

--  "Resources typically occur in multiple triples"

    This wording seems problematic because further down on that page, the text
    lists IRIs, literals, and blank nodes as things that "occur in triples".
    How about:

        Resources, such as "Bob" and "The Mona Lisa", are typically the subject
        or object of multiple triples.

--  "Informally speaking, RDF allows us to make statements of the form:"
    
    What is informal here is not the fact that RDF allows us to make three-part
    statements, but the informal syntax used to present them.  How about:

        RDF allows us to make three-part statements such as the following
        (expressed here, for readability, in pseudocode):

    Maybe there is a better word for it than pseudocode, though I think it
    sort of works.

--  Section 3.2 on IRIs in triples

    The text _says_ that IRIs can appear in all three positions, but it only
    provides example IRIs for a subject (The Mona Lisa) and an object
    (Leonardo).  Perhaps the section could go one step further and introduce
    the IRI for the property foaf:topic_interest (without waiting for Section
    5.1).  Then it could show the triple:

        <http://www.wikidata.org/entity/Q12418>
            <http://xmlns.com/foaf/0.1/topic_interest>
                <http://dbpedia.org/resource/Leonardo_da_Vinci>

--  Section 3.3 on Literals

    The use of literals could then be illustrated with:

        <http://www.wikidata.org/entity/Q12418>
            <http://purl.org/dc/terms/title>
                "Mona Lisa"
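
    A minimal sketch, in Python, of how the two example triples above could
    be written out one per line in N-Triples style.  The helper names (IRI,
    Literal, to_ntriples) are hypothetical, and the sketch assumes only IRIs
    and plain string literals (no datatypes or language tags):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


def to_ntriples(subject, predicate, obj):
    """Serialize one triple as a single N-Triples-style line."""
    def term(t):
        if isinstance(t, IRI):
            return f"<{t.value}>"
        # Simplification: escape only backslashes and double quotes.
        escaped = t.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"{term(subject)} {term(predicate)} {term(obj)} ."


mona_lisa = IRI("http://www.wikidata.org/entity/Q12418")

# IRIs in all three positions (the Section 3.2 example).
iri_line = to_ntriples(
    mona_lisa,
    IRI("http://xmlns.com/foaf/0.1/topic_interest"),
    IRI("http://dbpedia.org/resource/Leonardo_da_Vinci"),
)

# A literal in the object position (the Section 3.3 example).
literal_line = to_ntriples(
    mona_lisa,
    IRI("http://purl.org/dc/terms/title"),
    Literal("Mona Lisa"),
)

print(iri_line)
print(literal_line)
```

    Keeping IRIs and literals as distinct types mirrors the point of these
    two sections: the same position in a triple is written differently
    depending on the kind of term it holds.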

--  Section 3.4 on Blank Nodes

    I agree with Bob that this section is too brief and should either be
    dropped (please not!) or expanded, perhaps with a diagram.  

    If RDF were called a language from the start, then blank nodes could 
    be explained by analogy to subordinate clauses.  For example:

        Bob is interested in something which has the title "The Mona Lisa".

    Trying to express this in the pseudocode of section 3 seems inadequate:

        <Bob> <is interested in> <X>
        <X> <has the title> <The Mona Lisa>

    However, if the use of IRIs and literals in triples has just been
    illustrated in the previous two sections, one could posit, for the sake of
    argument, that one does not know a URI for "Mona Lisa" and say:
        
        <http://example.org/bob#me>
            <http://xmlns.com/foaf/0.1/topic_interest>
                _:blank_node_id1

        _:blank_node_id1
            <http://purl.org/dc/terms/title>
                "Mona Lisa"
            
    The accompanying diagram could be an adaptation of Figure 2.
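
    The "subordinate clause" reading can also be sketched in a few lines of
    Python.  This is only an illustration under simplifying assumptions:
    triples are plain tuples of strings, blank nodes are strings starting
    with "_:" (the prefix used in Turtle and N-Triples), and the function
    names are hypothetical.

```python
from itertools import count

BOB = "http://example.org/bob#me"
TOPIC_INTEREST = "http://xmlns.com/foaf/0.1/topic_interest"
TITLE = "http://purl.org/dc/terms/title"

_ids = count(1)


def fresh_blank_node():
    """Return a label that is only meaningful inside this graph."""
    return f"_:b{next(_ids)}"


def interest_with_title(person, title):
    """'person is interested in something which has the title title'."""
    something = fresh_blank_node()
    return [
        (person, TOPIC_INTEREST, something),
        (something, TITLE, title),
    ]


def titles_of_interests(graph, person):
    """Follow the blank node from the first triple to the second."""
    interests = {o for s, p, o in graph if s == person and p == TOPIC_INTEREST}
    return sorted(o for s, p, o in graph if s in interests and p == TITLE)


graph = interest_with_title(BOB, "Mona Lisa")
print(titles_of_interests(graph, BOB))
```

    The blank node carries no name of its own; it only links the two
    triples, which is the role "something which" plays in the sentence.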

--  "For both classes and properties one can create subtype hierarchies".

    Read in the context of the previous sentence ("The relation between an
    instance and its class is modelled through the type property"), this could
    be taken to mean that classes and properties can be sub-classed.  How
    about:

        One can create hierarchies of classes and sub-classes or of
        properties and sub-properties.

    Also (s/modelled/stated/):

        The relation between an instance and its class is stated using the 
        type property.
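
    To show what a class hierarchy buys the reader, here is a rough Python
    sketch of how types stated with the type property propagate up a chain
    of sub-classes.  Triples are plain tuples, the ex: names are invented
    for the example, and the function names are hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def superclasses(graph, cls):
    """All classes reachable from cls by following rdfs:subClassOf."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS_OF and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def inferred_types(graph, resource):
    """Stated types of resource plus every superclass of those types."""
    stated = {o for s, p, o in graph if s == resource and p == RDF_TYPE}
    result = set(stated)
    for cls in stated:
        result |= superclasses(graph, cls)
    return result


graph = [
    ("ex:monaLisa", RDF_TYPE, "ex:Painting"),
    ("ex:Painting", SUBCLASS_OF, "ex:Artwork"),
    ("ex:Artwork", SUBCLASS_OF, "ex:CreativeWork"),
]

print(sorted(inferred_types(graph, "ex:monaLisa")))
```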

--  "Type restrictions on the subjects and objects of particular triples can be
    defined through domain respectively range restrictions"

    This could be read as meaning that domain and range can be used to
    "restrict" values in a closed-world sense.  Also, domains and ranges are
    not defined for "particular triples" but for properties.  Maybe something 
    like (to be improved):

        The types of resources associated with a given property in the context
        of statements can be specified with a domain (for subjects) and range
        (for objects).

    It might be worth drawing this out a bit by emphasizing that domains and
    ranges are about making inferencing possible, if it could be done briefly
    and illustrated with a nice example.
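
    One possible shape for such an example, sketched in Python under the
    same simplifying assumptions (triples as tuples of strings; the ex:
    property and classes are invented, and infer_types is a hypothetical
    name).  The point it tries to make is that a domain or range never
    rejects a statement; it only licenses new type statements.

```python
RDF_TYPE = "rdf:type"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"


def infer_types(graph):
    """Derive type triples from domain and range declarations."""
    domains = [(s, o) for s, p, o in graph if p == DOMAIN]
    ranges = [(s, o) for s, p, o in graph if p == RANGE]
    inferred = set()
    for s, p, o in graph:
        for prop, cls in domains:
            if p == prop:
                # The subject of the property is inferred to be in the domain.
                inferred.add((s, RDF_TYPE, cls))
        for prop, cls in ranges:
            if p == prop:
                # The object of the property is inferred to be in the range.
                inferred.add((o, RDF_TYPE, cls))
    # Report only triples that were not already stated.
    return inferred - set(graph)


graph = [
    ("ex:isInterestedIn", DOMAIN, "ex:Person"),
    ("ex:isInterestedIn", RANGE, "ex:Topic"),
    ("ex:bob", "ex:isInterestedIn", "ex:monaLisa"),
]

for triple in sorted(infer_types(graph)):
    print(triple)
```

    Nothing in the input says that ex:bob is a person or that ex:monaLisa
    is a topic; both facts follow from the property declarations alone.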

======================================================================
The use of NOTEs
   
    The NOTE blocks make good points but at the cost of interrupting the flow
    of the text.  Calling out NOTEs as separate blocks has the effect of
    drawing attention to the sort of detail I'd expect to find in footnotes.
    Taking them note by note:

    "This primer is..."
   
        Maybe put in a separate, unnumbered section before the Introduction 
        called "About this document"?

    "An IRI is..." 

        The notion that RDF uses IRIs as names for things is so fundamental
        that it should be introduced in the first paragraph or two.  That
        explanation could already state the relationship of IRIs to URIs and
        URLs (as per section 3.2 and Bob's comments thereon).  Such an expanded
        explanation would replace this NOTE.

    "The RDF Data Model..."

        That the RDF Data Model is expressed with an abstract syntax which is
        independent of a particular [concrete syntax] is also a really key
        point.  The notions "abstract syntax" and "concrete syntax" could
        perhaps be defined in the Introduction by re-casting the list of
        normative specifications as a list of things provided by the suite of
        specs, e.g.:

            The normative specifications of RDF define:

            * The RDF Data Model, with an abstract syntax independent of any
              particular concrete syntax ("RDF Concepts and Abstract Syntax")
              [RDF11-CONCEPTS]

            * Formal model-theoretic semantics ("RDF Semantics") [RDF11-MT]

            * Several compatible concrete syntaxes -- different ways to
              record RDF data in files for processing by applications:

              ** Turtle...
              ** JSON-LD...
              ...

            * A data-modeling vocabulary, RDF Schema [RDF11-SCHEMA].

    "RDF is agnostic..."

        Drop as a NOTE and fold into the explanation of IRIs in the Introduction.

    "The RDF data model assigns the special datatype rdf:langString..."
    "The 2004 version of RDF contained the notion of a 'plain literal'..."

        Drop as NOTEs -- IMO these points are too detailed for the Primer.

    "The IRI associated with the graph..."
    "RDF provides no way to convey this semantic assumption..."
    "Multiple graphs are a recent extension of the RDF data model..."

        Drop as separate NOTEs and fold into the paragraph which starts with
        "RDF 1.1 doesn't prescribe any specific semantics for datasets".

    "The syntactic form... is in a prefix notation..."

        Drop as a separate NOTE and fold into the explanation of IRIs in the
        Introduction.

    The remaining notes could similarly be folded into the text.  If the
    content of the notes is too important to drop, perhaps the notes could all
    be collected at the end as end notes.

======================================================================
Copyediting suggestions

--  "Web" and "web": Both are used. I suggest "Web", but either way, usage should be consistent.

--  s/standard-compliant/standards-compliant/  (see http://en.wikipedia.org/wiki/Standards-compliant)

--  "The format of these statements is simple.  It always has the following form:"

    The use of "format" and "form" seems inconsistent, and the second sentence
    could perhaps simply be dropped, leaving just:

        The form of these statements is simple:

--  "visualise": does W3C still officially prefer American spelling ("visualize")?

--  "domain respectively range restrictions": This is an odd use of "respectively".  "Or"?

-- 
Tom Baker <tom@tombaker.org>
Received on Wednesday, 8 January 2014 04:19:55 UTC