- From: Mark Birbeck <mark.birbeck@x-port.net>
- Date: Sun, 6 Jan 2008 21:30:45 +0000
- To: "Manu Sporny" <msporny@digitalbazaar.com>
- Cc: "RDFa mailing list" <public-rdf-in-xhtml-tf@w3.org>
Hi Manu,
Thanks for an excellent review.
> I spent a good deal of time looking at the processing model and found a
> couple of minor issues with it.
>
> Issue #1:
> The biggest one was in how the "new subject"/"current subject" is set
> and used for completing incomplete triples (hanging rels). I think there
> are some nasty side-effects in the way "new subject" is set and used to
> initialize "current subject". Most notably, it looks like this:
>
> <div resource="#betty" rel="foaf:knows">
> <span resource="#fred"></span>
> </div>
>
> generates the following triple:
>
> <#betty> <foaf:knows> <#fred> .
No, this still generates:
<> <foaf:knows> <#betty> .
as usual. The key to whether an attribute is a subject or an object
(or both) is its relationship to other resources. This is unavoidable
if we are to do proper chaining.
Start with the simplest example:
<div about="#betty" rel="foaf:knows" resource="#fred"></div>
Now say that we want to do some chaining:
<div about="#betty" rel="foaf:knows" resource="#fred">
<div rel="foaf:knows" resource="#manu"></div>
</div>
Since we want to be able to support all sorts of cut-and-paste
permutations, we allow 'incomplete triples', so that the
following syntax has the same meaning:
<div about="#betty" rel="foaf:knows">
<div about="#fred" rel="foaf:knows" resource="#manu"></div>
</div>
That's nice, because if we go back to the beginning:
<div about="#betty" rel="foaf:knows" resource="#fred"></div>
we could have simply wrapped this with another statement and all would
have been well:
<div about="#manu" rel="foaf:knows">
<div about="#betty" rel="foaf:knows" resource="#fred"></div>
</div>
Powerful cut-and-paste features, in my view.
Now, let's make the hierarchy clearer:
<div about="#betty" rel="foaf:knows">
<div resource="#fred"></div>
</div>
Again, no problem there. But what if I cut-and-paste a relationship
between Fred and you:
<div about="#betty" rel="foaf:knows">
<div resource="#fred">
<div rel="foaf:knows" resource="#manu"></div>
</div>
</div>
Since the whole point of chaining is that a resource can be both a
subject and object at certain times, there is no reason that this
should not be parsed, and the middle @resource is both an object and a
subject. But thanks to the power of cut-and-paste, we're left with the
possibility that the author may remove the outer statement concerning
Betty, which would leave:
<div resource="#fred">
<div rel="foaf:knows" resource="#manu"></div>
</div>
In my view that should still be valid. (And in fact if it isn't, the
whole chaining thing falls down.)
Note that the key to the whole thing is whether @rel or @rev is present
on the same element as the attribute in question: if it is, @resource is
an object; if it isn't, @resource can act as a subject.
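To make the chaining rules above concrete, here is a rough sketch in Python. It is only an illustration of the behaviour described in this email, not the wording of the processing model: the function name triples() is made up, and it only looks at @about, @rel and @resource (no @rev, @href, @src or @property).

```python
import xml.etree.ElementTree as ET


def triples(markup, base="<>"):
    """Return (subject, predicate, object) tuples for a fragment of markup.

    Hypothetical helper: a simplified reading of the chaining rules,
    covering only @about, @rel and @resource.
    """
    out = []

    def walk(el, subject, pending):
        attrs = el.attrib
        rel = attrs.get("rel")
        about = attrs.get("about")
        resource = attrs.get("resource")

        # With @rel on the same element, @resource is an object, so only
        # @about can set the subject; without @rel, @resource can too.
        new_subject = about if rel is not None else (about or resource)

        if new_subject is not None:
            subject = f"<{new_subject}>"
            # Complete any incomplete triples left by an ancestor.
            for s, p in pending:
                out.append((s, p, subject))
            pending = []

        if rel is not None:
            if resource is not None:
                obj = f"<{resource}>"
                out.append((subject, rel, obj))
                subject = obj  # chaining: the object is the subject below
                pending = []
            else:
                pending = [(subject, rel)]  # hanging rel, completed lower down

        for child in el:
            walk(child, subject, pending)

    walk(ET.fromstring(markup), base, [])
    return out


if __name__ == "__main__":
    print(triples(
        '<div resource="#betty" rel="foaf:knows">'
        '<span resource="#fred"></span></div>'
    ))
```

Under these assumptions, the markup from Issue #1 yields only the triple with <> as subject and <#betty> as object, and the three-level Betty/Fred/Manu example yields the same two triples as the flat version.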
> Issue #2:
> The "recurse" flag is disabled after a triple is generated. This makes
> the parser stop entirely when the first triple is generated - which
> isn't what we want...
It's not quite as bad as you say, but you are right that there is a
flaw in the logic--thanks for spotting it. :)
My thinking was that since it doesn't make any difference if you
unconditionally switch off recursion in all of the following
situations:
<div property="dc:title">
E = mc<sup>2</sup>: The Most Urgent Problem of Our Time
</div>
<div property="dc:title" datatype="rdf:XMLLiteral">King Lear</div>
<div property="dc:title">Macbeth</div>
then we might as well turn off [recurse] whenever there is a
[property] value. Unfortunately, recursion should NOT be turned off if
the object literal was obtained via @content, so I'll fix that,
thanks.
> Issue #3:
> @instanceof generates a new bnode, even if @about is present? That's how
> I interpreted the processing rules (see HTML file).
Do you mean because I haven't stressed that only one of the rules
applies? If so, I think that's the same point that Ivan raised, and
should be resolved now.
> -----------------------------------------------------------------------
> There were several detail-oriented things that confused me in Section 7,
> the CURIE specification (see HTML file).
> -----------------------------------------------------------------------
> The dbpedia namespace should have a more verbose namespace abbreviation,
> some might confuse p: with a property on the <p> HTML tag in the examples.
>
> p: http://dbpedia.org/property/
>
> Perhaps the following should be used:
>
> dbp: http://dbpedia.org/property/
>
> Or:
>
> ped: http://dbpedia.org/property/
> dbpedia: http://dbpedia.org/property/
Fair point. I've gone for 'dbp'.
> There is a lot of material about RDF on the web, and a growing range of tools
> that will support RDFa...
Done.
> Some open technical issues are also identified with the same markup. These
> include an open issue on the interpretation of @instanceof when @about is
> present, and These include the handling of some unprefixed @rel and @rev
> values, whether @src sets the subject, and whether @instanceof can apply
> to @resource.
The unprefixed @rel/@rev issue has been resolved, and I thought @src
had too. I've added the @resource one that you mention, though.
> [It's also a gigantic pain in the ass to author RDF/XML by hand...]
:) Would you like a comment to that effect added? Although I obviously
think that XHTML+RDFa is easier to code than RDF/XML, I hadn't put
that in since it seems like a value-laden observation. What do others
think?
> @src [ This is currently under debate, @src might be used to set the subject -
> should be marked via an editor's note ]
I thought this was resolved, although Ivan's view seemed to be that
@src is no longer an object, which is different to how I perceived it.
> <html
> xmlns="http://www.w3.org/1999/xhtml"
> xmlns:bib="http://example.org/"
> [This should be xmlns:biblio=http://example.org/biblio/0.1/ to match the URL
> provided later in the document]
You and Ivan both have eagle-eyes! Thanks.
> ...and the RDF Sytax Document [RDF-SYNTAX].
Done.
> URIs are most commonly used to identify web pages, but RDF makes use of them
> as a way to provide unique identifiers for concepts. For example, we could identify
> the subject of all of our statements (the first part of each triple) by using the
> DBPedia [?ref] [Where's this reference?] URI for Albert Einstein, instead of the
> ambiguous string 'Albert':
Added.
> Here 'p:' has been mapped to the URI for DBPedia, and 'foaf:' has been mapped
> to the URI for the 'Friend of a Friend' taxonomy.
> [p: is too short and could be confusing - use dbp: instead]
Done.
> There MUST be a DOCTYPE declaration in the document prior to the root element.
> If present, the public identifier included in the DOCTYPE declaration must reference
> the DTD found in Appendix B - XHTML+RDFa Document Type Definition using its
> Public Identifier. The system identifier may [Should this be MAY] be modified
> appropriately. [Is the DOCTYPE strictly required, I thought we discussed that it
> SHOULD be there, not MUST be there... what if someone wants to cut/paste a
> snipped of RDFa into their HTML document?]
This is a tricky one. To be fully XHTML conformant the DOCTYPE is
needed, but that doesn't mean that some processor couldn't make use of
a document that doesn't contain the DOCTYPE. This may need further
discussion though, if it is confusing.
> For further information on using media types with XHTML family markup languages,
> see the informative note [XHTMLMIME]. [Just curious - why are we using this
> instread of "application/xhtml+rdfa"? Isn't xhtml+xml sort of redundant? I'm guessing
> that it's because browsers wouldn't recognize xhtml+rdfa?]
From the XHTML 2 Working Group side of things, XHTML now includes
RDFa, so there would be no distinction.
> A conforming RDFa Processor MUST make available to a consuming application a
> single RDF [graph] containing all possible triples generated by using the rules in the
> Processing Model section. This is the 'default [graph]'. [Should this be [default
> graph]?]
I don't think so; 'graph' is a defined term in RDF, which is what I'm
trying to highlight here. However, I don't think there is a notion of
'default graph', except perhaps in SPARQL.
> if @instanceof is present and @about is not present, then [new subject] is set to be
> a [bnode];
As per Ivan's comments I've tried to clarify that in this block the
first matching rule applies.
> by using @resource, if present. ... [I thought @resource cannot set the subject on
> the current element, which is what effectively happens in the next step. I thought
> @resource could only set the RDF object, as stated earlier in the document in
> Section 2.1. Should this be called [chained object]?]
I agree that the wording in 2.1 needs tightening up a little, but I
believe this rule to be the only way to be consistent with chaining.
> if [new subject] was set to a non-null value in the previous step, it is now used to:
>
> complete any incomplete triples;
> furnish a new value for [current subject]. [Doesn't this mean that this:
> <div resource="#betty" rel="foaf:knows"><div resource="#jack"></div></div>
> would generate: <#betty> <foaf:knows> <#jack> . -- We don't want that, do we?]
No. The key is the presence of @rel or @rev. (See also the notes at
the top of this email.)
> ... If [direction] is 'forward' then the following triple is generated: [There should be
> a clear distinction between [new subject] and [chained object] -- otherwise we end
> up with the resource generating triples, as shown above.]
See notes at the top.
> subject the [current subject] predicate the predicate from the iterated incomplete
> triple object [new subject]
> If [direction] is 'reverse' not 'forward' then this is the triple generated:
I've been going backwards and forwards between using:
[forwards] == true
and:
[direction] == 'forwards'
Either way, I generally prefer to have one 'true' condition, and then
a negation to express the opposite condition.
> Once all 'incomplete triples' have been resolved, [current subject] is set to [new
> subject]. [This is problematic, see comments about [chained object] above...]
See above.
> that is not present, @src is used [It is currently under debate as to whether @src
> should set the subject or the object] . If none of these are present but @rel or
> @rev is present, then [current object resource] is set to null.
Well...my point is that it could set both, depending on its position.
> ...a string created by concatenating the text nodes and inner content of each of
> the child elements in turn, of the [current element]. The final string includes the
> datatype, as described in [RDFCONCEPTS].
Good point. I need to do a bit more on this anyway, though, since we
agreed that we're going to use the XPath wording.
> Once object resolution is complete, the processor will have two objects, one
> a resource and the other a literal
Done.
> ...Once the triple has been created, the [recurse] flag is set to false. [If the recurse
> flag is set to false at this point, no other triples will be generated from child elements,
> correct? This isn't what we want, is it? I thought we wanted to disable the "recurse"
> flag only when [XML Literal] was the datatype of the current object?]
See notes above.
> If the [recurse] flag is true, all elements that are children of the [current element] are
> processed using the rules described here [But the recurse flag is never true at this
> point, it is always false after a triple is generated!].
Only if there was an object literal, so it's not _quite_ as bad as you
think. But you are right that some 'correct' use-cases won't get
processed properly with my current rules, such as the nested <div> in
this example:
<div
about="#manu"
property="foaf:name" content="Manu"
rel="foaf:knows"
>
<div about="#mark" />
</div>
I.e., the presence of @property inhibits further processing, even
though the object literal is provided by @content, rather than the
element content. (The same goes for the use of @datatype="".)
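The fix I have in mind could be sketched roughly as follows, in Python. The name should_recurse() is hypothetical, and this is my reading of the intended behaviour for @property and @content only, not text from the processing rules; it says nothing about the @datatype case.

```python
def should_recurse(attrs):
    """Decide whether the children of an element still need processing.

    Hypothetical helper: attrs is a dict of the element's attributes.
    """
    if "property" not in attrs:
        return True  # no object literal on this element, keep descending
    # A literal supplied by @content leaves the child elements untouched,
    # so they must still be processed. A literal taken from the element's
    # own content has already used the children up.
    return "content" in attrs


if __name__ == "__main__":
    outer = {
        "about": "#manu",
        "property": "foaf:name",
        "content": "Manu",
        "rel": "foaf:knows",
    }
    print(should_recurse(outer))
    print(should_recurse({"property": "dc:title"}))
```

With that change the nested <div about="#mark" /> above would be reached, while the three dc:title examples earlier would still stop at the element carrying @property.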
> dbp: http://dbpedia.org/property/
Done.
> @instanceof is unique in that it sets both a predicate and an object at the same
> time, and inline content might set an object if @content is not present, but
> @property is [I thought @instanceof only applied to @about?].
This seems to be a big source of confusion. In my motivation for
@instanceof to behave in the way I proposed, I was trying to argue
that it should apply to the subject of a triple, and then separately
we'd have rules that set the subject. So yes, @about can set the
subject, but so could @src, @resource or @href when they occur on
their own. (This is why I tried to establish the chaining rules before
clarifying the behaviour of @instanceof in the previous long debates.)
> <p about="#bbq" instanceof="cal:Vevent">[Should @instanceof be highlighted
> in red here too?]
Done.
> As described in the previous two sections, @about will always take precedence and
> mark a new subject, but if no @about value is available then @instanceof will do the
> same job, although using an implied identifier, or bnode. [This is a bit confusing... do
> you mean "using an implied identifier, which is a bnode", here? If so, couldn't you
> just say "using a bnode"?]
How about "i.e., a bnode"? I'd like to keep the word 'implied' because
I'm trying to draw attention to the commonality between setting an
identifier _explicitly_ and setting one _implicitly_.
> In this situation, all statements that are 'contained' by the object resource
> representing Germany (the value in @resource) will have the same subject, making
> it easy for authors to add additional statements: [While I agree with allowing the
> author to do this, I don't see how we prevent the following from happening: <div
> resource="#betty" rel="foaf:knows"><span resource="#phil"></span></div> -- and
> I'm pretty sure we never talked about allowing something like this to happen...
> maybe I wasn't there for that discussion?]
There is no problem here; @resource will be an object when @rel is
present, so all you'll get is this:
<> foaf:knows <#betty> .
Also, the inner @resource will have no effect.
> Note also that the same principle described here applies to @src and @href.
> [So doesn't this mean we can also do: <div href="#betty" rel="foaf:knows"><span
> href="#phil"></span></div> and it would generate <#betty> <foaf:knows> <#phil>
> . ?]
No. It would generate:
<> foaf:knows <#betty> .
as usual. The key is the presence of @rel or @rev.
> In this example there is This example starts with one incomplete triple:
I'm not quite sure what you're getting at here. I've left it unchanged
for now, but feel free to explain further.
> For example, when @instanceof creates a new bnode (as described above), that will
> be used to complete any incomplete triples' [Is that trailing ' gramatically correct - I
> don't know...].
Eagle-eyes... :) Actually the problem is a missing apostrophe in front
of the word 'incomplete', but well spotted.
Thanks again for a very thorough review, Manu, it's much appreciated.
Regards,
Mark
--
Mark Birbeck, formsPlayer
mark.birbeck@formsPlayer.com | +44 (0) 20 7689 9232
http://www.formsPlayer.com | http://internet-apps.blogspot.com
standards. innovation.
Received on Sunday, 6 January 2008 21:30:52 UTC