- From: Mark Birbeck <mark.birbeck@x-port.net>
- Date: Sun, 6 Jan 2008 21:30:45 +0000
- To: "Manu Sporny" <msporny@digitalbazaar.com>
- Cc: "RDFa mailing list" <public-rdf-in-xhtml-tf@w3.org>
Hi Manu,

Thanks for an excellent review.

> I spent a good deal of time looking at the processing model and found a
> couple of minor issues with it.
>
> Issue #1:
> The biggest one was in how the "new subject"/"current subject" is set
> and used for completing incomplete triples (hanging rels). I think there
> are some nasty side-effects in the way "new subject" is set and used to
> initialize "current subject". Most notably, it looks like this:
>
> <div resource="#betty" rel="foaf:knows">
> <span resource="#fred"></span>
> </div>
>
> generates the following triple:
>
> <#betty> <foaf:knows> <#fred> .

No, this still generates:

  <> <foaf:knows> <#betty> .

as usual.

The key to whether an attribute is a subject or an object (or both) is
its relationship to other resources. This is unavoidable if we are to do
proper chaining.

Start with the simplest example:

  <div about="#betty" rel="foaf:knows" resource="#fred"></div>

Now say that we want to do some chaining:

  <div about="#betty" rel="foaf:knows" resource="#fred">
    <div rel="foaf:knows" resource="#manu"></div>
  </div>

Since we want to be able to support all sorts of cut-and-paste
permutations, we support 'incomplete triples', so that the following
syntax has the same meaning:

  <div about="#betty" rel="foaf:knows">
    <div about="#fred" rel="foaf:knows" resource="#manu"></div>
  </div>

That's nice, because if we go back to the beginning:

  <div about="#betty" rel="foaf:knows" resource="#fred"></div>

we could have simply wrapped this with another statement and all would
have been well:

  <div about="#manu" rel="foaf:knows">
    <div about="#betty" rel="foaf:knows" resource="#fred"></div>
  </div>

Powerful cut-and-paste features, in my view.

Now, let's make the hierarchy clearer:

  <div about="#betty" rel="foaf:knows">
    <div resource="#fred"></div>
  </div>

Again, no problem there.
But what if I cut-and-paste a relationship between Fred and you:

  <div about="#betty" rel="foaf:knows">
    <div resource="#fred">
      <div rel="foaf:knows" resource="#manu"></div>
    </div>
  </div>

Since the whole point of chaining is that a resource can be both a
subject and object at certain times, there is no reason that this should
not be parsed, and the middle @resource is both an object and a subject.

But thanks to the power of cut-and-paste, we're left with the possibility
that the author may remove the outer statement concerning Betty, which
would leave:

  <div resource="#fred">
    <div rel="foaf:knows" resource="#manu"></div>
  </div>

In my view that should still be valid. (And in fact if it isn't, the
whole chaining thing falls down.)

Note that the key to the whole thing is the presence of @rel or @rev on
the same element as the attribute in question.

> Issue #2:
> The "recurse" flag is disabled after a triple is generated. This makes
> the parser stop entirely when the first triple is generated - which
> isn't what we want...

It's not quite as bad as you say, but you are right that there is a flaw
in the logic--thanks for spotting it. :)

My thinking was that since it doesn't make any difference if you
unconditionally switch off recursion in all of the following situations:

  <div property="dc:title">
    E = mc<sup>2</sup>: The Most Urgent Problem of Our Time
  </div>

  <div property="dc:title" datatype="rdf:XMLLiteral">King Lear</div>

  <div property="dc:title">Macbeth</div>

then we might as well turn off [recurse] whenever there is a [property]
value. Unfortunately, recursion should NOT be turned off if the object
literal was obtained via @content, so I'll fix that, thanks.

> Issue #3:
> @instanceof generates a new bnode, even if @about is present? That's how
> I interpreted the processing rules (see HTML file).

Do you mean because I haven't stressed that only one of the rules
applies? If so, I think that's the same point that Ivan raised, and
should be resolved now.
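The chaining behaviour argued for under Issue #1 can be sketched in a few lines of Python. This is only an illustration of the rule "the key is the presence of @rel or @rev on the same element", not the processing model from the draft: the names `triples` and `BASE` are made up for this sketch, and it handles only forward @rel together with @about and @resource (no @rev, @href, @src, @property or @instanceof). Markup combinations not shown in the examples above are outside its scope.

```python
import xml.etree.ElementTree as ET

BASE = "<>"  # hypothetical stand-in for the current document, as in the triples above


def triples(xml_text):
    """Return (subject, predicate, object) tuples for a small markup fragment."""
    out = []

    def walk(el, subject, hanging):
        rel, about, res = el.get("rel"), el.get("about"), el.get("resource")
        new_subject = about
        obj = None
        if res is not None:
            if rel is not None:
                obj = res          # @rel on the same element: @resource is an object
            elif about is None:
                new_subject = res  # @resource on its own: it sets the subject
        if new_subject is not None:
            # A new subject completes any incomplete triples left by the parent.
            out.extend((s, p, new_subject) for s, p in hanging)
            subject, hanging = new_subject, []
        if rel is not None and obj is not None:
            out.append((subject, rel, obj))
            subject, hanging = obj, []  # chaining: the object is the subject for children
        elif rel is not None:
            hanging = [(subject, rel)]  # incomplete triple, waiting for a child subject
        for child in el:
            walk(child, subject, hanging)

    walk(ET.fromstring(xml_text), BASE, [])
    return out


# The markup from Issue #1: @rel is present, so @resource is an object.
issue_1 = '<div resource="#betty" rel="foaf:knows"><span resource="#fred"></span></div>'
print(triples(issue_1))

# The three-level chaining example: the middle @resource is object and subject.
chained = (
    '<div about="#betty" rel="foaf:knows">'
    '<div resource="#fred">'
    '<div rel="foaf:knows" resource="#manu"></div>'
    '</div></div>'
)
print(triples(chained))
```

Under these assumptions the first fragment yields a single triple with the document as subject and #betty as object, and the second yields "betty knows fred" followed by "fred knows manu", which matches the readings given above.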
> -----------------------------------------------------------------------
> There were several detail-oriented things that confused me in Section 7,
> the CURIE specification (see HTML file).
> -----------------------------------------------------------------------
> The dbpedia namespace should have a more verbose namespace abbreviation,
> some might confuse p: with a property on the <p> HTML tag in the examples.
>
> p: http://dbpedia.org/property/
>
> Perhaps the following should be used:
>
> dbp: http://dbpedia.org/property/
>
> Or:
>
> ped: http://dbpedia.org/property/
> dbpedia: http://dbpedia.org/property/

Fair point. I've gone for 'dbp'.

> There is a lot of material about RDF on the web, and a growing range of tools
> that will support RDFa...

Done.

> Some open technical issues are also identified with the same markup. These
> include an open issue on the interpretation of @instanceof when @about is
> present, and These include the handling of some unprefixed @rel and @rev
> values, whether @src sets the subject, and whether @instanceof can apply
> to @resource.

The unprefixed @rel/@rev issue has been resolved, and I thought @src had
too. I've added the @resource one that you mention, though.

> [It's also a gigantic pain in the ass to author RDF/XML by hand...] :)

Would you like a comment to that effect added? Although I obviously
think that XHTML+RDFa is easier to code than RDF/XML, I hadn't put that
in since it seems like a value-laden observation. What do others think?

> @src [ This is currently under debate, @src might be used to set the subject -
> should be marked via an editor's note ]

I thought this was resolved, although Ivan's view seemed to be that @src
is no longer an object, which is different to how I perceived it.

> <html
> xmlns="http://www.w3.org/1999/xhtml"
> xmlns:bib="http://example.org/"
> [This should be xmlns:biblio=http://example.org/biblio/0.1/ to match the URL
> provided later in the document]

You and Ivan both have eagle-eyes!
Thanks.

> ...and the RDF Sytax Document [RDF-SYNTAX].

Done.

> URIs are most commonly used to identify web pages, but RDF makes use of them
> as a way to provide unique identifiers for concepts. For example, we could identify
> the subject of all of our statements (the first part of each triple) by using the
> DBPedia [?ref] [Where's this reference?] URI for Albert Einstein, instead of the
> ambiguous string 'Albert':

Added.

> Here 'p:' has been mapped to the URI for DBPedia, and 'foaf:' has been mapped
> to the URI for the 'Friend of a Friend' taxonomy.
> [p: is too short and could be confusing - use dbp: instead]

Done.

> There MUST be a DOCTYPE declaration in the document prior to the root element.
> If present, the public identifier included in the DOCTYPE declaration must reference
> the DTD found in Appendix B - XHTML+RDFa Document Type Definition using its
> Public Identifier. The system identifier may [Should this be MAY] be modified
> appropriately. [Is the DOCTYPE strictly required, I thought we discussed that it
> SHOULD be there, not MUST be there... what if someone wants to cut/paste a
> snipped of RDFa into their HTML document?]

This is a tricky one. To be fully XHTML conformant the DOCTYPE is
needed, but that doesn't mean that some processor couldn't make use of a
document that doesn't contain the DOCTYPE. This may need further
discussion though, if it is confusing.

> For further information on using media types with XHTML family markup languages,
> see the informative note [XHTMLMIME]. [Just curious - why are we using this
> instread of "application/xhtml+rdfa"? Isn't xhtml+xml sort of redundant? I'm guessing
> that it's because browsers wouldn't recognize xhtml+rdfa?]

From the XHTML 2 Working Group side of things, XHTML now includes RDFa,
so there would be no distinction.
> A conforming RDFa Processor MUST make available to a consuming application a
> single RDF [graph] containing all possible triples generated by using the rules in the
> Processing Model section. This is the 'default [graph]'. [Should this be [default
> graph]?]

I don't think so; 'graph' is a defined term in RDF, which is what I'm
trying to highlight here. However, I don't think there is a notion of
'default graph', except perhaps in SPARQL.

> if @instanceof is present and @about is not present, then [new subject] is set to be
> a [bnode];

As per Ivan's comments I've tried to clarify that in this block the
first matching rule applies.

> by using @resource, if present. ... [I thought @resource cannot set the subject on
> the current element, which is what effectively happens in the next step. I thought
> @resource could only set the RDF object, as stated earlier in the document in
> Section 2.1. Should this be called [chained object]?]

I agree that the wording in 2.1 needs tightening up a little, but I
believe this rule to be the only way to be consistent with chaining.

> if [new subject] was set to a non-null value in the previous step, it is now used to:
>
> complete any incomplete triples;
> furnish a new value for [current subject]. [Doesn't this mean that this:
> <div resource="#betty" rel="foaf:knows"><div resource="#jack"></div></div>
> would generate: <#betty> <foaf:knows> <#jack> . -- We don't want that, do we?]

No. The key is the presence of @rel or @rev. (See also the notes at top
of email.)

> ... If [direction] is 'forward' then the following triple is generated: [There should be
> a clear distinction between [new subject] and [chained object] -- otherwise we end
> up with the resource generating triples, as shown above.]

See notes at the top.
> subject the [current subject] predicate the predicate from the iterated incomplete
> triple object [new subject]
> If [direction] is 'reverse' not 'forward' then this is the triple generated:

I've been going backwards and forwards between using:

  [forwards] == true

and:

  [direction] == 'forwards'

Either way, I generally prefer to have one 'true' condition, and then a
negation to express the opposite condition.

> Once all 'incomplete triples' have been resolved, [current subject] is set to [new
> subject]. [This is problematic, see comments about [chained object] above...]

See above.

> that is not present, @src is used [It is currently under debate as to whether @src
> should set the subject or the object] . If none of these are present but @rel or
> @rev is present, then [current object resource] is set to null.

Well...my point is that it could set both, depending on its position.

> ...a string created by concatenating the text nodes and inner content of each of
> the child elements in turn, of the [current element]. The final string includes the
> datatype, as described in [RDFCONCEPTS].

Good point. I need to do a bit more on this anyway, though, since we
agreed that we're going to use the XPath wording.

> Once object resolution is complete, the processor will have two objects, one
> a resource and the other a literal

Done.

> ...Once the triple has been created, the [recurse] flag is set to false. [If the recurse
> flag is set to false at this point, no other triples will be generated from child elements,
> correct? This isn't what we want, is it? I thought we wanted to disable the "recurse"
> flag only when [XML Literal] was the datatype of the current object?]

See notes above.

> If the [recurse] flag is true, all elements that are children of the [current element] are
> processed using the rules described here [But the recurse flag is never true at this
> point, it is always false after a triple is generated!].
Only if there was an object literal, so it's not _quite_ as bad as you
think. But you are right that some 'correct' use-cases won't get
processed properly with my current rules, such as the nested <div> in
this example:

  <div about="#manu" property="foaf:name" content="Manu" rel="foaf:knows" >
    <div about="#mark" />
  </div>

I.e., the presence of @property inhibits further processing, even though
the object literal is provided by @content, rather than the element
content. (The same goes for the use of @datatype="".)

> dbp: http://dbpedia.org/property/

Done.

> @instanceof is unique in that it sets both a predicate and an object at the same
> time, and inline content might set an object if @content is not present, but
> @property is [I thought @instanceof only applied to @about?].

This seems to be a big source of confusion. In my motivation for
@instanceof to behave in the way I proposed, I was trying to argue that
it should apply to the subject of a triple, and then separately we'd
have rules that set the subject. So yes, @about can set the subject, but
so could @src, @resource or @href when they occur on their own. (This is
why I tried to establish the chaining rules before clarifying the
behaviour of @instanceof in the previous long debates.)

> <p about="#bbq" instanceof="cal:Vevent">[Should @instanceof be highlighted
> in red here too?]

Done.

> As described in the previous two sections, @about will always take precedence and
> mark a new subject, but if no @about value is available then @instanceof will do the
> same job, although using an implied identifier, or bnode. [This is a bit confusing... do
> you mean "using an implied identifier, which is a bnode", here? If so, couldn't you
> just say "using a bnode"?]

How about "i.e., a bnode"? I'd like to keep the word 'implied' because
I'm trying to draw attention to the commonality between setting an
identifier _explicitly_ and setting one _implicitly_.
> In this situation, all statements that are 'contained' by the object resource
> representing Germany (the value in @resource) will have the same subject, making
> it easy for authors to add additional statements: [While I agree with allowing the
> author to do this, I don't see how we prevent the following from happening: <div
> resource="#betty" rel="foaf:knows"><span resource="#phil"></span></div> -- and
> I'm pretty sure we never talked about allowing something like this to happen...
> maybe I wasn't there for that discussion?]

There is no problem here; @resource will be an object when @rel is
present, so all you'll get is this:

  <> foaf:knows <#betty> .

Also, the inner @resource will have no effect.

> Note also that the same principle described here applies to @src and @href.
> [So doesn't this mean we can also do: <div href="#betty" rel="foaf:knows"><span
> href="#phil"></span></div> and it would generate <#betty> <foaf:knows> <#phil>
> . ?]

No. It would generate:

  <> foaf:knows <#betty> .

as usual. The key is the presence of @rel or @rev.

> In this example there is This example starts with one incomplete triple:

I'm not quite sure what you're getting at here. I've left it unchanged
for now, but feel free to explain further.

> For example, when @instanceof creates a new bnode (as described above), that will
> be used to complete any incomplete triples' [Is that trailing ' gramatically correct - I
> don't know...].

Eagle-eyes... :) Actually the problem is a missing apostrophe in front
of the word 'incomplete', but well spotted.

Thanks again for a very thorough review, Manu, it's much appreciated.

Regards,

Mark

--
Mark Birbeck, formsPlayer

mark.birbeck@formsPlayer.com | +44 (0) 20 7689 9232
http://www.formsPlayer.com | http://internet-apps.blogspot.com

standards. innovation.
Received on Sunday, 6 January 2008 21:30:52 UTC