review of current draft of 'RDFa syntax' from Diego Berrueta on 2008-01-29 (public-swd-wg@w3.org from January 2008)

From: Diego Berrueta <diego.berrueta@fundacionctic.org>
Date: Tue, 29 Jan 2008 21:11:38 +0100
To: SWD WG <public-swd-wg@w3.org>, public-rdf-in-xhtml-tf@w3.org
CC: ben@adida.net
Message-ID: <479F887A.80802@fundacionctic.org>

Hi all,

Please find below my review of the current Editor's Draft (25/Jan/2008)
of the "RDFa Syntax and Processing" [1]:

[1] http://www.w3.org/MarkUp/2008/ED-rdfa-syntax-20080125/

************

General comment:

The document is in very good shape. I found lots of improvements since
my last review. The document is packed with examples, which are very
welcome. Although I tried to spot any potential source of issues, I
think the most useful feedback will come from early implementors. I
couldn't review Appendix B because I'm not an expert on DTDs.

************

Substantial comments:

* Section 5.5, paragraph starting with "Processing begins with...". In
addition to elements containing one or more RDFa attributes, rules must
be applied _also_ to elements containing XML namespace declarations and
xml:lang attributes, even if they don't contain any RDFa attribute.
Otherwise, if some elements are skipped, rules 2 and 3 may fail to be
fired. My suggestion here is to simply say that every element should be
processed.

* Section 5.5: I can't find any rule to set a value for "parent bnode"
and "parent object". Rules 4 and 5 read the value of these variables,
and Rule 6 clears their value, but it seems these variables are never
assigned a value. Maybe assignment should be done before Rule 10.

* Section 5.5, Rule 6, first subrule: after completing the incomplete
triples, the "list of incomplete triples" must be cleared.

* Section 5.5, Rule 9, subrule for plain literals: "... a string created
by concatenating the text content of each of the child elements of the
current element in document order ...". I think this is inconsistent
with the example that contains "<strong>Einstein</strong>" at section
6.3.1.3. My proposal is to replace "child" with "descendant" in the
quoted sentence. In XPath, descendant:: is the transitive closure of
child::, and that's probably closer to the intended meaning of the rule.

* Section 5.5, Rule 9, subrule for typed literals: "... a string created
by concatenating the inner content of each of the child elements in
turn...". I do not fully understand the meaning of this sentence. What
is the precise meaning of "inner" in terms of a DOM tree? In my opinion,
"inner" only makes sense if you consider the XML serialization of the
document (i.e.: the substring between the opening tag and the closing
tag). However, the rules are DOM-driven, not serialization-driven.

* Section 5.5, Rule 9 (and also the last two paragraphs of Section
6.3.1.3): this is not a comment, but a question: must the parser descend
recursively when a non-XML literal has been created by concatenating
text nodes? I couldn't find a test case for this. In other words, I'm
not sure what the expected outcome of parsing the following mark-up
should be:

<p about="http://dbpedia.org/resource/Albert_Einstein">
  <span property="foaf:name" datatype="">
     Albert
     <strong property="foaf:familyName">Einstein</strong>
  </span>
</p>

* Section 5.5, Rule 10: "... after processing the child elements, the
context can be restored". I propose to change the sentence to: "...
after processing *each* child element, the context can be restored".
With the current wording, I understand that the context is restored only
once, after *all* child elements are processed, and therefore *every*
child element shares the same context (which is not desirable).

* Section 9.2: the datatype of @instanceof should be CURIEs (note the
plural).
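
To make the Rule 9 plain-literal comment above concrete, here is a
minimal Python sketch. It is not taken from the draft: the fragment is a
hypothetical one modelled on the "<strong>Einstein</strong>" example,
the function names are my own, and the "child" variant reflects only one
possible reading of the quoted sentence (text nodes that are direct
children of the current element).

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, modelled on the example in section 6.3.1.3.
FRAGMENT = '<span property="foaf:name">Albert <strong>Einstein</strong></span>'


def child_text(element):
    """Concatenate only the text nodes that are direct children of `element`."""
    parts = [element.text or ""]
    for child in element:
        # In ElementTree, text that follows a child element (and still
        # belongs to the parent) is stored in that child's `tail`.
        parts.append(child.tail or "")
    return "".join(parts)


def descendant_text(element):
    """Concatenate every text node below `element`, in document order."""
    return "".join(element.itertext())


span = ET.fromstring(FRAGMENT)
print(repr(child_text(span)))       # text directly inside <span> only
print(repr(descendant_text(span)))  # also includes the text inside <strong>
```

Under the "child" reading the family name is lost, while the
"descendant" reading yields the full name, which is what the example in
the draft appears to expect.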

************

Editorial comments / suggestions:

* One of my concerns is that the document is not clear enough with
respect to a question which, from my experience, is a FAQ: "can I use
RDFa with arbitrary XML documents?". Although Appendix A makes a fair
effort to clarify this point, I find the second paragraph of
Section 3.9 a bit confusing: "The aim of RDFa is to allow a single RDF
graph to be carried in an XML document of any type, although this
specification deals specifically with RDFa in XHTML". Please consider
adding a sentence to unambiguously (Yes/No) settle this question, or
make a reference to Appendix A.

* Section "Status of this Document": Incomplete sentence in the third
paragraph: "These include..."

* Section 2.1, description of "@href": the phrase "also an object, but a
resource" may be confusing (is it an object, a resource, or maybe
both?). Please consider a different wording, such as "a resource
object", in order to highlight that it is *a resource acting as an
object in a sentence*.

* Section 3.6. In the example, only predicate URIs are abbreviated,
while subjects and objects are not. Please consider abbreviating
subjects and objects as well. Alternatively, add a note to indicate that the
example illustrates how to abbreviate *some* URIs (those acting as
predicates in the example), and that Turtle allows the remaining ones to
be abbreviated as well.

* Section 5.1, third paragraph: two kinds of rules are identified here:
those which are "host language-specific", and those which are part of
RDFa. I couldn't find more references to these two kinds in the
document, so the question is: which rules are of the first kind and
which ones are of the second kind? How are they different? How are they
relevant to the processing of an RDFa document?

* Section 5.2: if I understand correctly, the value of "base" never
changes during the process of RDFa parsing. This sets it apart from
the rest of the components of the evaluation context. I
suggest adding a sentence to note that "base" is an invariant.

* Section 5.4.2: what happens if the prefix is empty? A reference to
section 7 might be useful here.

* Section 5.4.2, third step: instead of "Combine", I propose
"Concatenate" ("combine" might be a bit ambiguous).

* Section 5.4.4: unmatched end bracket at the end of the second paragraph.

* Section 9.2: the values of @property, @rel and @rev must match
"reserved word | CURIEs". Therefore, it is impossible to use more than
one reserved word, or to mix reserved words and CURIEs, although this
might not be obvious to some readers. My suggestion is to add a
sentence making this restriction explicit.

* Section 9.4, description of "cite", second example: s/@property/@rel
(@property makes no sense here because section 9.4 describes values for
@rev/@rel).

* I'm not sure about this one, but I suspect that the @about attribute
of the two examples in Section 9.4 should be a @resource (otherwise,
what is the object of the triple?). Additionally, please consider
adding a box with the outcome (in RDF) of these examples.

* Please consider using two different background colors for the boxes
which contain XHTML+RDFa and the ones that contain RDF triples (Turtle).
I think it would improve readability.

* Appendix C.1: some references have brackets, but others don't.

* Appendix C.2: the references are not alphabetically ordered.
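
The Section 9.2 restriction mentioned above ("reserved word | CURIEs")
can be sketched as a small Python check. This is only an illustration of
my reading of the grammar: the set of reserved words is a small assumed
subset, the CURIE pattern is deliberately simplified, and the function
name is hypothetical.

```python
import re

# Assumption: a small illustrative subset of reserved words,
# not the complete list from the draft.
RESERVED_WORDS = {"cite", "license", "next"}

# Assumption: a simplified CURIE shape (prefix:reference).
# The grammar in the draft is richer than this.
CURIE = re.compile(r"^[A-Za-z_][\w.-]*:\S+$")


def matches_reserved_or_curies(value):
    """Return True if `value` is one reserved word or a list of CURIEs."""
    tokens = value.split()
    # First alternative: exactly one reserved word.
    if len(tokens) == 1 and tokens[0] in RESERVED_WORDS:
        return True
    # Second alternative: one or more whitespace-separated CURIEs.
    return bool(tokens) and all(CURIE.match(token) for token in tokens)


print(matches_reserved_or_curies("cite"))
print(matches_reserved_or_curies("foaf:name dc:title"))
print(matches_reserved_or_curies("cite next"))
print(matches_reserved_or_curies("cite foaf:name"))
```

With this reading, a single reserved word or a list of CURIEs is
accepted, whereas two reserved words, or a reserved word mixed with a
CURIE, are rejected.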

-- 
Diego Berrueta
R&D Department  -  CTIC Foundation
E-mail: diego.berrueta@fundacionctic.org
Phone: +34 984 29 12 12
Parque Cientifico Tecnologico Gijon-Asturias-Spain
http://www.fundacionctic.org
Received on Tuesday, 29 January 2008 20:12:14 UTC