RDF/XML Syntax Revised WD comments from Dave Beckett on 2001-12-14 (w3c-rdfcore-wg@w3.org from December 2001)

From: Dave Beckett <dave.beckett@bristol.ac.uk>
Date: Fri, 14 Dec 2001 14:33:40 +0000
To: w3c-rdfcore-wg@w3.org
Message-ID: <30331.1008340420@tatooine.ilrt.bris.ac.uk>
In reply to my message
  http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Dec/0075.html

Summary:
  Various typos (Dan, Jos, Patrick): accepted, details below
  Reification (Jos): staying until RDF Core WG decides otherwise
  General comments (Patrick): see below, some potential ACTIONs here

Typo changes are in CVS version 1.134  at
  http://ilrt.org/discovery/2001/07/rdf-syntax-grammar/

(CVS http://cvs.ilrt.org/cvsweb/redland/rdfcore/syntax/index.html )


At this time I've got reviews from

  Dan Brickley
  http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Dec/0077.html

  Jos De Roo
  http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Dec/0095.html

  Patrick Stickler
  http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Dec/0096.html


I've replied to each comment below, where it isn't a simple typo.

Dave

----

General comments

  (Jos)
  I would propose to drop reification!
  so proposal to drop its sentences in 5.5, 5.9, 5.10, 5.11, 5.12 and 5.14
  as well as "5.26 Reification Rules"

I can't do this yet until we make a decision on reification.  Until
then, the rules to generate the right N-Triples should stay.  What
they *mean* isn't for this document (and isn't yet in the Model Theory).


  (Patrick)
  1. Would it be good to add somewhere in section 2 para 2 a footnote or 
  reference to some other section which warns users of RDF about the
  potential for unintended URI collisions based on direct concatenation
  of namespace to local name (loosing the partition) so that folks are
  aware of that issue and can avoid it when crafting qnames. Section 6 
  hints at some of these issues, but should they be more explicit? 
  Particularly with regards to possible URI collisions? 

A section describing the RDF/XML URI/qname generation rule would be a
good idea.  I could also put the ID value to URI rule there.  A note
that there may be several namespace URI/local name combinations for
the same URI would be a good idea.

Potential ACTION for this draft: Add a short section about this

  (Patrick)
  2. Section 2 para 3, it would probably help users understand the striping
  structure of the serialization if there was a very simple graphic.

Yes, I intended to create examples with pictures based on Dan's
document but had no time.  Something along the lines of a picture

  [this document] -> [has a title] -> "RDF/XML Syntax Revised"
                  -> [has an author] -> [] -> [has a family name] -> "Beckett"
                                           -> [has a home page] -> [...]

and then expand the example to show the striping, and talk about the
blank node in the graph, generate the XML

starting from the stripy
  <rdf:Description>
    <dc:creator>
      <rdf:Description>
        <ex:homePage rdf:resource="http:/.../" />
      </rdf:Description>
   </dc:creator>>
  <rdf:Description>

and filling out to:

  <?xml version="1.0"?>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:ex="http://example.org/stuff/1.0/">
    <rdf:Description rdf:about="">
      <dc:title>RDF/XML Syntax Revised</dc:title>
      <dc:creator>
	<rdf:Description>
	  <ex:fullName>Dave Beckett</ex:fullName>
	  <ex:homePage rdf:resource="http:/.../" />
	</rdf:Description>
     </dc:creator>>
    <rdf:Description>
  </rdf:RDF>

or something.

Actually I've just noticed that the RDF/XML syntax document contains
no XML!  This is funny.

Potential ACTION for this draft: Add at least 1 XML example


  (Patrick)
  3. Section 2 para 5, you may want to change "property value is a string"
  to something like "property value is represented as a string" to avoid
  any incompatabilities with the eventual data typing clarifications. I
  think we all are at least in agreement that values denoted by serialized 
  literals are represented as a string, whether they are or are not a string.

How about when a "property value is a literal string"

  (Patrick)
  6. Section 3.6 para 1, you need to define what a "bNode" is in relation
  to a bnodeID. Since "bNodes" are not in the orig specs, folks will not
  know what they are (unless they've been lurking about the mailing lists).

  Question: Should we have a shared glossary of terms for all RDF specs? If
  so, we could then simply reference terms in general documents and not have
  to repeatedly define them or risk a loss of reader understanding if not
  defined.

I only every use "blank nodes" when talking about the graph (refering
to the model theory) or "bnodeIDs" when talking about emitting
N-Triple _:foo things.  This could be aded to the section about URIs
and IDs, I suggest above

Potential ACTION: Add this to the URIs/IDs section

  (Patrick)
  7. Section 5.5 bullet list 1, since you define the namespace for RDF and
  the terms within that namespace sans any prefix, I think it is reasonable
  to omit the prefixes everywhere those terms are referenced, since 'rdf:'
  is not mandated by the spec and (it appears that) it is not necessary to
  differentiate those terms from other namespaces. It is also probably
  misleading to state conditions in terms of undefined prefixes. If the
  prefix is manditory in those examples, then there should be a clause 
  somewhere saying that in this document, the prefix 'rdf:' will be used
  to refer to the RDF namespace.

The RDF namespace is defined in 4.4 and the (XML) prefix you mention
is not defined in there.  The notation I'm using is just so that in
the document we don't confuse, say rdf:subject (the term subject from
the RDF namespace) and the bare word subject and so on for other
words.

I don't understand: "It is also probably misleading to state
conditions in terms of undefined prefixes."  It is defined.

However I have added a new paragraph to 4.4 to define this
rdf:... notation for the entire document, although it was also
introduced in the table in 4.2.

  (Patrick)
  8. Section 5.12, the generation of "local" identifiers for bNodes seems 
  to take a per-instance scope view, yet these productions are for producing
  an RDF graph which may have content from multiple instances. Are we
  speaking only of generating a single graph from a single instance, which
  must later be merged into a larger graph constituting the combined 
  knowledge space? If so, then we're OK with this scope, but it perhaps
  should be clarified that what we are generating is an instance specific
  graph and merging issues are not herein addressed. If however we are
  talking about a graph that should be mergeable with any other RDF graph
  without any considerations beyond tidying of nodes with equivalent URI
  labels, then we have the issue of creating bNode identifiers which are
  unique for that broader scope, which may in fact span more than one
  system. In such a case, would it be too presumptuous to recommend
  (though not mandate) GUIDs or UUIDs as bnodeID values?

This document defines how to turn an RDF/XML document into a sequence
of N-Triples, forming an N-Triples document with local identifiers
for the blank nodes.  The scope of the blank node identifiers is the
N-Triples document, which generates an RDF graph.  I shall try to
explain further what these local (blank node) identifiers are and
their use.

The rest of your comments are more about the RDF graph and hence the
model theory.

We previously decided (at the last F2F) to not use globally generated
IDs for graph nodes with no URI; but create these blank node
identifiers, which are not URIs.


  (Patrick)
  10. Section 5.27, some discussion (possibly) should be provided for
  scenarios where there is a mingling of rdf:li and rdf:_n in the actual
  serialization. Should li-counter consider the explicit _n values?
  Should it ensure resume counting after the explicit _n value? Does
  the parser have to ensure no collisions or does it create a partial
  ordering? 

The resolution of this issue and the consequent test cases already
define this.  The rdf:li generation rule takes no notice of explit
rdf:_n value and works as I described it.

See http://www.w3.org/2000/03/rdf-tracking/#decisions
and http://www.w3.org/2000/10/rdf-tests/rdfcore/rdf-containers-syntax-vs-schema/

This *may* generate duplicate N-Triples in the serialisation, but
these have the same interpretation in the graph.

As far as I know we are also neutral on containers (nodes) that have
duplicate rdf:_n values, since somebody may find that useful.


  (Patrick)
  11. Section 6, should a recommended algorithm for generating qnames
  on re-serialization be provided? One simple means of achieving 
  reasonable round-tripping of at least qnames is to provide the
  RDF/XML writer with a set of namespace declarations to use and
  have it select for each URI the namespace which has the maximal 
  intersection with the URI and use that for deriving the qname. 

Lots of heuristics are possible including the above.  If that was
recommended, it also would be a good idea to recommend recording
these namespace URIs/prefixes during parsing too.  Maybe we could
improve this for the next draft


  (Patrick)
  12. Appendix A.2 rdfms-qname-uri-mapping, this issue also includes
  the potential for collision, not just "wrong" URIs.

This issue isn't decided; those words are just a suggestion.

  (Patrick)
  13. Appendix A.2 rdfms-difference-between-ID-and-about, should the 
  decisions about this be more explicit in the normative sections?

I was wondering about making the rdf:ID value to URI conversion more
explicit and might pull that out.


Typos: all accepted

  (Dan)
  "The RDF Model Theory ([RDF-MODEL]) is a graph consisting of nodes "

  ...the Model Theory itself isn't a graph; it's a theory. So we kind of
  want to say 'the RDF model is a', except we don't seem to use the word
  'model' that way anymore. Same goes for 'the RDF data model', I think.

  "The RDF Model Theory" [RDF-MODEL] defines RDF as a graph..." perhaps?

  (Jos)
  in "3 Data Model" last line
  ... part of the node ro computed from the string-value of contained nodes.
		       ^^or

  (Jos)
  in "3.6 Identifier Node" 2nd sentence
  ... These nodes are created by giving two values for the for the ...
							   ^^^^^^^

  (Jos)
  in "5.9 Production resourcePropertyElt" the N-Triples statement
  ... e.parent.subject.string-value <e.URI> n.subject.string-value; .

  (Patrick)
  4. Section 2 para 6, you may wish to change "replacing the rdf:Description 
  element" with "replacing the rdf:Description element name" as the element
  itself corresponds to an rdf:Description element and only the name change
  signifies the typing.

  (Patrick)
  5. Section 3, list item "Element Information Item", you're missing a
  square bracket after "local name".
                                                                  ^
  (Patrick)
  9. Section 5.26, the names subject, predicate, and object are not included
  in the listed terms included in the RDF namespace given in section 4.4, 
  where it presents an "exhaustive" list of terms.
Received on Friday, 14 December 2001 09:33:43 UTC