Re: RDF/XML Syntax Revised WD for review from Jeremy Carroll on 2001-12-16 (w3c-rdfcore-wg@w3.org from December 2001)

From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
Date: Sun, 16 Dec 2001 19:37:08 +0100
To: <w3c-rdfcore-wg@w3.org>
Message-ID: <MABBLGKMPIJFCKFGDBEPEEIHCAAA.jjc@hplb.hpl.hp.com>
My comments on the body of the text (not the appendices)


First - great work, a lot done over last time.


I'll section my comments as BUGS (7), TYPOS (2), PHRASING (5), SUGGESTIONS
(4), CLARITY (1).

I'll use ***** to separate issues.


I would suggest attending only to the TYPOS in this release. (My view is that
fixing the bugs is likely to introduce new, unknown ones.) That said, the
easier bugs are 1, 3, 4 and 7. Bug 2 could best be addressed for this release
by noting that the processing of bagID is incomplete in this WD. Bugs 5 and 6
are not sufficiently serious to require attention, even if the bar is set
higher than the Connolly level.


BUGS
(things where I think you have made a mistake, not merely where we differ)

*********
1.

Section 5.5 "if there is a propertyAttr attribute [...] with a.URI =
rdf:li". This is illegal, as previously decided.

http://www.w3.org/2000/10/rdf-tests/rdfcore/rdf-containers-syntax-vs-schema/
error001.rdf

*********
2.
bagID Section 5.5 "Then for all statements generated above (except the
previous statement)"
 A trivial point and a non-trivial one.

2.1
  I prefer "(except the immediately previous statement)"

2.2
  They are not reified with node n.
  The rules as written would do the following:

<rdf:Description rdf:bagID="foo" eg:a="a" eg:b="b" />

==>

_:a <eg:a> "a" .
_:a <eg:b> "b" .
<#foo> <rdf:type> <rdf:Bag> .
<#foo> <rdf:subject> _:a .
<#foo> <rdf:predicate> <eg:a> .
<#foo> <rdf:object> "a" .
<#foo> <rdf:type> <rdf:Statement> .
<#foo> <rdf:subject> _:a .
<#foo> <rdf:predicate> <eg:b> .
<#foo> <rdf:object> "b" .
<#foo> <rdf:type> <rdf:Statement> .

   The following are the underlying errors:
   + the node n which you pass to 5.26 is used as the identifier for the
     reification quad, but you have created it as a bag in which to collect
     the various reification quads.
   + you have not specified anywhere that rdf:_NNN arcs are used to add the
     reifications to the bag.
   + you appear not to have decided whether the following example creates
     one reification or two:
    <rdf:Description rdf:bagID="foo">
      <rdf:value rdf:ID="bar" />
    </rdf:Description>

   Approaches I have considered for correctly specifying this typically use:
   + a bagLiCounter that behaves similarly to your liCounter
   + a more complex passing of information up or down the parse tree.
      in ARP
         the current subject has a property, the bag of reifications, which
         defaults to null. Whenever a triple is created the bag is
         considered, and the code branches if it is non-null, specifically
         catching the case above, where the reification that it was going to
         produce anyway is added to the Bag.
      in Snail
         the reification rules are blocked when there is a bagID, and the
         bagID is explicitly processed first. propertyElt productions are
         then marked with a reification attribute, corresponding to rdf:ID.
      in another approach (not documented)
         every property element production produces its reification; if a
         label is provided with an rdf:ID attribute then it is used. The
         nodeElement production then discards the reifications which have
         blank node roots, unless there is a bagID attribute.

Your treatment of bagID on emptyPropertyElt is similarly defective.
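
A minimal Python sketch of one possible reading of the bagLiCounter approach
from 2.2. It is not the WD's rule set and not ARP's or Snail's code. The
function name describe_with_bag, the tuple representation of triples and the
_:nN blank node labels are all assumptions made for illustration. Each
property statement gets its own reification node, and the bag points at those
nodes with rdf:_1, rdf:_2, and so on.

```python
from itertools import count


def describe_with_bag(bag_id, property_attrs):
    """Return triples for <rdf:Description rdf:bagID=... attr=.../>.

    property_attrs is a list of (predicate, literal) pairs.
    Hypothetical helper, for illustration only.
    """
    blank = count(1)
    subject = f"_:n{next(blank)}"
    bag = f"#{bag_id}"
    triples = [(bag, "rdf:type", "rdf:Bag")]
    bag_li_counter = 1  # same role as liCounter, but counts bag members
    for predicate, literal in property_attrs:
        triples.append((subject, predicate, literal))
        # A fresh node per reification quad, so quads never share a node
        # with each other or with the bag itself.
        r = f"_:n{next(blank)}"
        triples += [
            (r, "rdf:type", "rdf:Statement"),
            (r, "rdf:subject", subject),
            (r, "rdf:predicate", predicate),
            (r, "rdf:object", literal),
            (bag, f"rdf:_{bag_li_counter}", r),
        ]
        bag_li_counter += 1
    return triples


triples = describe_with_bag("foo", [("eg:a", '"a"'), ("eg:b", '"b"')])
for t in triples:
    print(" ".join(t), ".")
```

With this shape #foo is only ever a Bag: it carries no rdf:subject,
rdf:predicate or rdf:object arcs of its own.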
***********
3.

5.9 resourcePropertyElt

In the rule:
  "nodeElement" ==> "ws* nodeElement ws*"

************
4.

5.12 parseTypeResourcePropertyElt
Your text
"start_element(URI=rdf:Description,
    subject=n,
    attributes=set(bagIdAttr=a)
c
end_element()"

==>
Better text
"start_element(URI=rdf:Description,
    subject=n,
    attributes=set()
c
end_element() "

This has nothing to do with bagID.


************
5.

5.14 emptyPropertyElt
The text doesn't cover the case with bagID and optional rdf:ID.
e.g.
  <rdf:Description rdf:bagID="foo" />
or
  <rdf:Description rdf:bagID="foo" rdf:ID="bar" />
If you want to follow M&S para 232 then the rdf:ID is a reification :-).
M&S doesn't specify what the bagID means in this case either. ARP takes it
to introduce a new node which is an empty Bag.

*************
6.

5.15 idAttr
5.17 bagIDAttr
No statement about unicity of ID. Any such statement needs to be clear that
both come from the same space.


************
7.

5.18 propertyAttr
rdf:li should be in the list of exclusions.
Personally I would also exclude:
  rdf:subject rdf:predicate
  rdf:Seq rdf:Bag rdf:Alt rdf:Statement rdf:Property




TYPOS
*****

8.
"precisely called an directed" => "precisely called a directed"

9.
"properties and form for skipping" => "properties and for skipping"

PHRASING, ISSUES
****************
(aka minor points with no suggestions)

10.
I don't like Dan's "defines RDF as a graph" - I think this is a discussion
topic at some point. For me, and I think for Pat, the graph is the primary
syntax for RDF. Certainly the current text will do for now.

************
11.
"turns sequences of Node, Arc," => "turns paths in the graph of the form
Node, Arc,"  (I prefer the more technical graph theoretic language - not
important).


***********
12.
Section "4.3 Notation Forms" does not print well.


*************
13.
Do we need to say anything about dialects such as RSS or PRISM that do not
require the full RDF/XML syntax or a triple-based approach to processing
RDF? I.e. at some level any processing that does not violate the model
theory is cool!

**************
14.

In section 5.1 "the grammar may be entered several times", ARP currently
remembers the ID state and prohibits the reuse of an ID across reinvocations
in the same document. What do you think? I have no axe here.

e.g. ARP gives an error with all of the following:


<foo xmlns:rdf="...">
   <rdf:RDF>
     <rdf:Description rdf:ID="a"/>
   </rdf:RDF>
   <rdf:RDF>
     <rdf:Description rdf:ID="a"/>
   </rdf:RDF>
</foo>


and


<foo xmlns:rdf="...">
   <rdf:RDF>
     <rdf:Description rdf:ID="a"/>
   </rdf:RDF>
   <rdf:RDF>
     <rdf:Description rdf:bagID="a"/>
   </rdf:RDF>
</foo>



and


<foo xmlns:rdf="...">
   <rdf:RDF>
     <rdf:Description rdf:ID="a"/>
   </rdf:RDF>
   <rdf:RDF>
     <rdf:Description>
       <rdf:value rdf:ID="a"/>
     </rdf:Description>
   </rdf:RDF>
</foo>
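
The check described in item 14 could look roughly like the Python sketch
below. It is not ARP's code. The function name find_reused_ids is
hypothetical, the RDF namespace URI is written out only so that the snippet
parses, and xml:base is ignored. A single set is shared by rdf:ID and
rdf:bagID across every rdf:RDF element in the document, which is the "same
space" reading from item 6.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ID_ATTRS = ("{%s}ID" % RDF_NS, "{%s}bagID" % RDF_NS)


def find_reused_ids(xml_text):
    """Return ID values that occur more than once in the whole document."""
    seen = set()
    reused = []
    for element in ET.fromstring(xml_text).iter():
        for attr in ID_ATTRS:
            value = element.get(attr)
            if value is None:
                continue
            if value in seen:
                reused.append(value)
            seen.add(value)
    return reused


doc = """<foo xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
   <rdf:RDF>
     <rdf:Description rdf:ID="a"/>
   </rdf:RDF>
   <rdf:RDF>
     <rdf:Description rdf:bagID="a"/>
   </rdf:RDF>
</foo>"""
print(find_reused_ids(doc))  # ['a']
```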



SUGGESTIONS
***********

15.
Every use of production "ws" is currently "ws*".
Since "S" already is 1 or more space characters, these uses of "ws" could be
"ws?". My preference would be to define "ws" as "S?" and then use "ws"
undecorated in the relevant productions (propertyEltList etc.)
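
A quick way to convince oneself of the equivalence in item 15, as a Python
sketch. It assumes that "ws" is currently defined as the XML production S
(one or more of space, tab, CR, LF); the variable names are illustrative.

```python
import re

S = r"[ \t\r\n]+"                     # XML production S
ws_star = re.compile(rf"(?:{S})*")    # current usage: "ws*" with ws = S
ws_opt = re.compile(rf"(?:{S})?")     # proposed: ws defined as "S?"

samples = ["", " ", "\n\t ", "  \r\n", "x", " x", "x "]
# Both patterns should accept exactly the same strings.
agree = all(
    bool(ws_star.fullmatch(s)) == bool(ws_opt.fullmatch(s)) for s in samples
)
print(agree)  # True
```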

****************

16.
I think the document would benefit from an introduction to the processing
model used in the triple production rules e.g.
[[[
This document describes the relationship between an RDF/XML document
and an N-triple document by describing a particular processing of the
XML Infoset. The N-triple document then naturally describes an RDF graph,
(as in Model Theory). This particular processing model is illustrative and
non-normative; any other processing model resulting in the same RDF graph
may be used.

In particular:
+ triples may be generated in any order
+ duplicates may be eliminated at any point
+ there is no requirement on RDF processors to support N-triples in any way

The processing model converts the XML Infoset into a closely related set
of Information Items, and then processes these Information Items using
declarative "grammar" rules and procedural rules generating triples that
are added to the N-triple output.
]]]
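
The two freedoms about ordering and duplicates can be pictured by treating
the output as a set of triples. A small Python sketch, restricted to triples
without blank nodes (comparing graphs that contain blank nodes needs more
than set equality); the name graph_of and the sample triples are
illustrative assumptions.

```python
def graph_of(triples):
    """Collapse a sequence of triples into an order-free, duplicate-free set."""
    return frozenset(triples)


run_a = [
    ("<eg:s>", "<eg:p>", '"a"'),
    ("<eg:s>", "<eg:q>", '"b"'),
]
# Same triples in a different order, with one duplicate.
run_b = [
    ("<eg:s>", "<eg:q>", '"b"'),
    ("<eg:s>", "<eg:p>", '"a"'),
    ("<eg:s>", "<eg:q>", '"b"'),
]
same_graph = graph_of(run_a) == graph_of(run_b)
print(same_graph)  # True
```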


************

17.
The phrase "generate a local identifier" occurs a few times. It may be
clearer to add a subsection at the end of section 5 which clarifies that
these identifiers are new on each invocation and unique with file scope.
This would behave similarly to the List Expansion Rules or the Reification
Rules as a function in the processing model.
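
Such a function could be as small as the following Python sketch. The class
name LocalIdGenerator and the "_:genid" prefix are assumptions; one generator
instance is meant to live for one file, so identifiers are new on each call
and unique within that file.

```python
from itertools import count


class LocalIdGenerator:
    """Hands out a fresh local identifier on every call."""

    def __init__(self, prefix="_:genid"):
        self._prefix = prefix
        self._counter = count(1)

    def __call__(self):
        return f"{self._prefix}{next(self._counter)}"


new_id = LocalIdGenerator()   # one instance per file
first, second = new_id(), new_id()
print(first, second)  # _:genid1 _:genid2
```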


****************

18.
How about dropping parseTypeOtherPropertyElt?
In M&S the corresponding section reads to me as "here is how to handle bad
input". As a rule we are trying to drop any such suggestions.

CLARITY
*******

19.
I believe you have implied that comments and processing instructions get
stripped before computing the string-value of a Text Node. This is, however,
unclear.



Jeremy
Received on Sunday, 16 December 2001 13:28:58 UTC