Response to Ivan's Comments on RDFa Syntax Last Call-ready Draft (Feb 17th, 2008)

Ivan Herman sent in some comments and issues with the RDFa Syntax Last
Call-ready Draft (Feb 17th, 2008). What follows are responses to his
comments and pointers to the diff-marked draft where changes were made
as a result of his issues.

Ivan Herman wrote:
> Sorry, this is a minor editorial and highly personal/taste issue: the
> colour combination in 3.3 and elsewhere in the Turtle examples (the very
> dark blue with red highlights) is particularly difficult to read (at
> least for me). Personally, I would prefer a yellowish/khaki background...

Sorry Ivan, we're going to keep the color combination as is for right
now. We will change it during last call if the color combination bothers

> -------------
> 4.3. says now (to my surprise):
> [[[
> Since XHTML+RDFa is based upon XHTML Modularization [XHTMLMOD], and
> since XHTML Modularization requires that whitespace is preserved,
> conforming processors must preserve whitespace in both [plain literal]s
> and [XML literals]. However, it may be the case that the architecture in
> which a processor operates does not make all whitespace available. It is
> therefore advisable for authors who would like to make their documents
> consumable across different processors, to remove any unnecessary
> whitespace in their mark-up.
> ]]]
> Isn't this against what we decided for plain literals (or maybe I
> misunderstand something here)? I thought that whitespace preservation
> was restricted to XMLLiterals, but plain literals get canonicalized.
> And I do believe that this is necessary: various authoring tools (Amaya,
> GoLive) or tools like tidy, do break/wrap lines sometimes in a fairly
> difficult-to-control way.... Also, my (limited) RDFa authoring
> experience is that this may lead to problems. Indeed, I realized that I
> often do an additional structuring of my HTML code along RDFa lines, so
> to say. Ie, I write
>   <span about="#A" property="bla:bla">
>      the text here, mainly when it is long and spans over several
> paragraphs; new lines may also appear
>   </span>
> to very clearly delimit the RDFa structures. This rule would force me to
> write this in a particular way:
>   <span about="#A" property="bla:bla">the text here, mainly when it is
> long and spans over several paragraphs; new lines may also appear</span>
> otherwise I would get extra whitespaces in the RDF literal which I do
> not want!
> (Note that test cases may have to be adapted to this, if the group
> decides to keep it as it is)

We ask that whitespace is preserved because the host language, XHTML
(XML, really), requires this. We made the decision that it is the
application's job to normalize whitespace if that is necessary, not the
parser's job.
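
As a rough illustration of what application-side normalization could
look like, here is a minimal Python sketch. The function name
normalize_literal is hypothetical and does not come from the Syntax
document; it assumes the application wants every run of whitespace
collapsed to a single space and the ends trimmed.

```python
def normalize_literal(text: str) -> str:
    """Collapse each run of whitespace to one space and trim both ends."""
    # str.split() with no arguments splits on any run of whitespace and
    # discards leading/trailing whitespace, so re-joining with a single
    # space yields the normalized form.
    return " ".join(text.split())


# A plain literal as a whitespace-preserving parser might hand it over,
# following the indented <span> layout from Ivan's example.
raw = "\n     the text here, mainly when it is long\n  "
print(repr(normalize_literal(raw)))
# prints 'the text here, mainly when it is long'
```

An application that needs the literal exactly as authored simply skips
this call, which is why the choice is left out of the parser.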

There is a discussion with a resolution available in our minutes, here:

Search for "whitespace normalization"

There has also been text added to the Syntax document to outline the
importance of paying attention to whitespace normalization in your
application code:

> --------------
> Note/comment on our (perennial:-) issues on @rel/@rev values...
> The current text makes it very clear that @rev="blabla" means a full URI
> _with the possible values_, and those values are listed in 9.4. Ie, if
> "blabla" is not predefined in 9.4, it is dropped. That is what we wanted.
> However... if I read the CURIE spec in spec 7, and I want to give a
> mechanical interpretation of @rel=":blabla", well, the only thing it
> says is that the default @prefix mapping is
> so that value should be mapped onto:
> The text does _not_ say that the processor should check, in this case,
> that the value of #blabla is valid for the @rel value! I don't mind
> that, this gives us a leeway if, by any chance, the possible values
> evolve at some point later, but I just wanted to point this out.

Noted, thank you.

> --------------
> I also found out that @property has now a bunch of predefined values. I
> thought we would not do that, or at least the group was oscillating
> between yes and no. I do not have strong opinions on that; actually,
> predefined @property values do make some sense (well, although that
> would make sense for a @name attribute, @property is not an XHTML
> attribute).
> However. _If_ the group decides to keep the predefined @property
> attributes, then:
> - the same description should be provided on the mapping of @property
> CURIE values as for @rel/@rev
> - we should add test cases along the same lines...

We have removed the predefined values for @property, as they were a
holdover from an older draft of the document. Note that these values
don't exist in the current draft (the old section 9.3 discussed
reserved words for @property):

Note that this is a different reply than I had given you before. When
Mark read my e-mail, he said that his recollection was the same as
yours. After digging around in the minutes and reasoning things through,
we were able to find the correct answer.

> =================== Processing step comments =============
> Processing step #7 and #8, this is purely editorial: the white line
> between the two blocks was a bit misleading. It took me a certain time
> to realize that the first sentence "Predicates for the [current object
> resource] can be set by using one or both of the @rel and @rev
> attributes." refer to _both_ boxes, the first giving details on @rel
> and the second on @rev... The first sentence should probably be
> separated visually.

Fixed, please see Step #7 and Step #8:

> -------------
> Step 7 seems to be wrong. Isn't it so that those triples should be
> generated with [new subject] and _not_ [current subject]?

You are correct. The change has been made.

See Step #7:

> In fact, the only reason of sending the values of [current subject]
> 'down' the recursion is for the completion of incomplete triples. I
> wonder whether it may not make it simpler to say in step 8 that the
> 'data structure' of incomplete triples are built by adding the value
> of [new subject] as a possible subject/object in the data structure,
> and this is sent down (possibly across several steps) to the
> descendent. I have the feeling that it would simplify, eg, step 10.

While it may simplify that step, there has been quite a bit of thought,
implementation, and review put behind the current set of processing
rules. We are very reluctant to change them for the sake of a potential
optimization. At the moment, the only items that must be stored are the
predicate and the direction. Those two pieces of information, along with
'parent subject' and 'parent object', give you everything else you need
to complete the triples.

We have not made the potential optimization.
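
To make the point about stored state concrete, here is a small Python
sketch of completing incomplete triples. It is only an illustration
under stated assumptions: the names complete_triples, Incomplete, and
the "forward"/"reverse" direction strings are hypothetical, and it
assumes the resource found on the descendant element is what closes
each triple against the parent subject.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]
Incomplete = Tuple[str, str]  # (predicate, direction); nothing else is stored


def complete_triples(
    incomplete: List[Incomplete], parent_subject: str, new_subject: str
) -> List[Triple]:
    """Finish each stored (predicate, direction) pair once a resource is known."""
    triples: List[Triple] = []
    for predicate, direction in incomplete:
        if direction == "forward":
            # Came from @rel: the parent subject points at the new resource.
            triples.append((parent_subject, predicate, new_subject))
        elif direction == "reverse":
            # Came from @rev: the new resource points back at the parent subject.
            triples.append((new_subject, predicate, parent_subject))
        else:
            raise ValueError(f"unknown direction: {direction!r}")
    return triples


pending = [("a:b", "forward"), ("x:y", "reverse")]
for triple in complete_triples(pending, "#a", "_:b0"):
    print(triple)
```

Nothing about the eventual subject or object lives in the stored pair
itself, which is the property we did not want to disturb.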

> --------------
> I wonder whether step 10 does not go wrong when going through an
> element that has absolutely no RDFa attribute. Indeed, incomplete
> triples will be lost: in the setting of the new context that is set to
> the [local list of incomplete triples] which is initialized to nil. On
> the other hand,
> <div about="#a" rel="a:b">
>   <div>
>     <span rel="c:d" resource="#c">
> _should_ generate the
> <#a> a:b [ c:d <#c> ]
> triples.
> In my coding experience it might simplify the description if we
> cleanly separated the case when there is an element without any RDFa
> attribute.
> The evaluation context is simply forwarded and that is it. It may
> simplify the description elsewhere, too

You are correct, Ivan. Changes have been made to address this bug in the
processing rules.

Note that there is now a [skip element flag] defined in the processing
rules. If this flag is true, then the context that was passed into the
step is copied and passed down after the local language and local list
of URI mappings are copied.
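
A minimal Python sketch of that forwarding behaviour follows. The
EvaluationContext class, its field names, and the function
context_for_skipped_element are all hypothetical stand-ins for the
structures in the processing rules, so treat this as one possible
reading rather than the normative text.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Optional, Tuple


@dataclass
class EvaluationContext:
    parent_subject: Optional[str] = None
    parent_object: Optional[str] = None
    incomplete_triples: List[Tuple[str, str]] = field(default_factory=list)
    language: Optional[str] = None
    uri_mappings: Dict[str, str] = field(default_factory=dict)


def context_for_skipped_element(
    ctx: EvaluationContext,
    local_language: Optional[str],
    local_uri_mappings: Dict[str, str],
) -> EvaluationContext:
    """Build the child context for an element carrying no RDFa attributes."""
    # Everything is forwarded unchanged, including the pending incomplete
    # triples, except the language and URI mappings, which the element may
    # still have changed (for example via xml:lang or xmlns declarations).
    return replace(
        ctx,
        incomplete_triples=list(ctx.incomplete_triples),
        language=local_language,
        uri_mappings=dict(local_uri_mappings),
    )


outer = EvaluationContext(
    parent_subject="#a", parent_object="#a",
    incomplete_triples=[("a:b", "forward")],
)
inner = context_for_skipped_element(outer, "en", {"c": "http://example.org/c#"})
print(inner.incomplete_triples)
```

With this in place, the pending a:b triple from the outer div in Ivan's
example survives the attribute-less inner div instead of being reset.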

See Step #1, #4, #11, and #12:

> -------------
> As for step 10: I think that this step, as described, should stay
> regardless of whether we keep the management of return value (see the
> separate small discussion of this morning).

Done! :)

> -------------
> The fact whether [new subject] and [parent object] is set or whether
> it remains null affects the whole chain of processing steps. I may
> have misunderstood something here, but it looks to me as if these
> values were _almost never_ null, except for the <html> element!
> Indeed,
>  - Steps 4 and 5 say that on the <head> and <body> elements the value
>    is (possibly) set to @about="" (unless set otherwise)
>  - Steps 4 and 5 say that it is set on the value of [parent object] as
>    a fallback
>  - Step 10, prior to recursion, sets the [parent object] to either
>    [current object resource] or [new subject]
> Ie, the way I see it: on <html> [parent object] is indeed nil, [new
> object] will also stay nil; then the <head> and <body> elements are
> managed and, in both cases, [new subject] _will_ be set to a non-nil
> value. As a consequence, by virtue of Step 10, this will be the case
> for all the descendents.
> What this tells me is that the processing steps could be greatly
> simplified by taking that into account. We could actually say that
> <html> also behaves like <head> and <body> in terms of using @about=""
> and, from that point on, we always have a value. Look, eg, the setting
> of the [parent object] in step 10...
> I wonder about the setting of [current subject] in step 10 in this
> respect, though...

You are probably correct. Mark took a look at this but didn't see
anywhere he could make a simplification without endangering the
integrity of the processing rules. We have made all optimizations that
we feel comfortable with and are currently not comfortable with an
optimization related to [new subject] and [parent object].

> -------------
> End of step 5 says:
> "Note that final value of the [current object resource] will either be
> null (from initialization), a full URI, or a bnode."
> This statement is not correct in that branch. The rules listed there
> clearly say that a BNode is set to [current object resource] as a
> fallback, ie, [current object resource] is never null there.
> As a result, the handling of incomplete triples becomes false, too,
> because, in the presence of @rel/@rev, there is no branch leading to
> new incomplete triples...
> I try to copy here the approach I took in my code, and translate it,
> so to say, to the processing steps here. Maybe this helps
> - [current object resource] is initialized to null
> - in step 4 (ie, when there is no @rel/@rev) [current object resource]
> is set to the value of [new subject]
> - in step 5, there is _no_ BNode fall back to set [current object
>   resource]
> - before step 10, if [current object resource] is null, _then_ a new
>   BNode is generated and used to set [current object resource]
> - in step 10, [parent object] is set to [current object resource] (note
>   that, by virtue of the step above, the value might be [new subject],
>   which is all right, this is the case when no @rel/@rev are present)

You're correct Ivan. Mark has fixed this bug in the processing rules.
[current object resource] can now be null again until it hits Step #8,
thus triggering the proper generation of the list of incomplete triples.
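
The branch in question can be sketched in Python roughly as follows.
This is an illustration only: handle_rel_rev and the "forward"/"reverse"
labels are hypothetical names, and the sketch assumes that a null
[current object resource] is represented as None.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]
Incomplete = Tuple[str, str]  # (predicate, direction)


def handle_rel_rev(
    new_subject: str,
    current_object_resource: Optional[str],
    rels: List[str],
    revs: List[str],
) -> Tuple[List[Triple], List[Incomplete]]:
    """Emit triples when an object is known, else record incomplete triples."""
    triples: List[Triple] = []
    incomplete: List[Incomplete] = []
    if current_object_resource is not None:
        # An object resource was found on this element: emit directly.
        triples += [(new_subject, p, current_object_resource) for p in rels]
        triples += [(current_object_resource, p, new_subject) for p in revs]
    else:
        # No object yet: remember predicate and direction for a descendant
        # to complete. This branch is unreachable if a bnode fallback has
        # already filled in the object, which was the bug Ivan described.
        incomplete += [(p, "forward") for p in rels]
        incomplete += [(p, "reverse") for p in revs]
    return triples, incomplete


print(handle_rel_rev("#a", None, ["a:b"], []))
print(handle_rel_rev("#a", "#c", ["a:b"], ["x:y"]))
```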

Please note Step #5 and Step #8:

-- manu

Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: RDFa Basics in 8 minutes (video)

Received on Monday, 18 February 2008 22:02:35 UTC