- From: Mark Birbeck <mark.birbeck@x-port.net>
- Date: Wed, 13 Apr 2005 15:01:38 +0100
- To: <public-rdf-in-xhtml-tf@w3.org>
- Cc: "'HTML WG'" <w3c-html-wg@w3.org>

Hello everyone,

One of the remaining issues with XHTML 2 and its metadata story is how to handle bnodes. The current proposal in RDF/A is to use an XPointer mechanism, but it has been described as not being aesthetically pleasing. Obviously, that's a matter of taste, but perhaps if we have a couple more proposals to mull over, it may help us to reach an acceptable decision.

First, I'll recap what we want to achieve.


ANONYMOUS NODES

As you all know, in RDF you can make statements about anonymous nodes by using a bnode. In terms of RDFCONCEPTS, no limit is placed on what form this bnode can take, other than it must not be from the set of string literals, and also not from the set of URIs.

To remind ourselves why we would want to do this: we have an anonymous node that has a property of an email address "mailto:mark.birbeck@x-port.net" and another anonymous node that has a property of an email address of "mailto:Steven.Pemberton@cwi.nl", and we want to say that the two people identified by the email addresses know each other. This:

  _:a foaf:knows _:b

would do the trick.


BNODES IN XHTML 2

When tackling problems like this, I've tended to look at some syntax and then ask what we and HTML authors would 'expect' it to mean. So, let's begin with the CC example given at the end of the current RDF/A draft [1]:

  <p>
    This document is licensed under a
    <a rel="cc:license" href="http://creativecommons.org/licenses/by-sa/2.0/">
      Creative Commons License
    </a>
    which, among other things, requires that you provide attribution
    to the author,
    <a rel="dc:creator" href="http://ben.adida.net">Ben Adida</a>.
  </p>

NOTE: RDF/A has changed the inheritance rules, so the current draft is out of date. In the new syntax, this fragment is making statements about *the document* and not the <p> -- so ignore the prose just after the example.

Now, we can easily make further statements about Ben in our document.
There are a number of ways to do it, but one is this:

  <a rel="dc:creator" href="http://ben.adida.net">Ben Adida</a>.
  <meta about="http://ben.adida.net" property="foaf:name" content="Ben Adida" />

However, the big question is, what would we expect to be the meaning of a similar structure that used fragment identifiers:

  <a rel="dc:creator" href="#ben">Ben Adida</a>.
  <meta id="ben" property="foaf:name" content="Ben Adida" />

I think it's clear that '#ben' is an anonymous node, and so intuitively @id is actually the equivalent of @rdf:nodeID. If this were not the case, then we would be saying that the creator of the document were an HTML node, and that HTML node had a foaf:name of "Ben Adida".

An interesting thing here is that this is conveniently what a non-RDF person would most likely understand this to mean anyway -- that the document was created by a thing that has the name "Ben Adida". They wouldn't necessarily insert 'thing' in there, but most authors would almost certainly think that whatever it was that created this document had the name "Ben".

So we've effectively 'slipped in' anonymous nodes without any real trouble. And note that this is *not* what would happen if we introduced some special way of naming anonymous nodes -- such as @bnodeID, or my XPointer proposal -- since then you have to explain why an anonymous node is different to another node. (I'll discuss this a little more below, but I'm effectively saying that the exception is to 'name' a node, not to make it anonymous.)


RDF/XML

As it happens, we have effectively mirrored the RDF/XML syntax. In RDF/XML you have two ways to name a node:

  * use @rdf:about or its abbreviated form, @rdf:ID;
  * use @rdf:nodeID.

However, since RDF/XML uses striping, the attribute rdf:nodeID is used for both the subject and object -- this is not possible in RDF/A.
We would therefore need to say something like:

  * @id is equivalent to @rdf:nodeID as the subject;
  * @href with a fragment identifier is equivalent to @rdf:nodeID as
    the object.

(The second bullet is qualified, below.)


ANONYMOUS OR PARTLY ANONYMOUS?

If we accept that this is a better *syntax* than we currently have, the only issue that remains is what exactly should be serialised. There are two straightforward choices:

  * serialise the URIs 'as is';
  * do some conversion with a 'bnode formula'.

I favour the second solution, for reasons I'll explain.


SERIALISE 'AS IS'

One possibility is that the triples generated by a document with @id used in the statements are simply serialised 'as is'. This means in effect there are no bnodes. Every node with an @id becomes a fully referenceable item from other locations.

Putting aside the philosophical points from the RDF standpoint, there are actually quite fundamental problems with doing this, which I'll explain in a moment.


CONVERT TO BNODES

The second solution is, on serialisation, to convert the @id values (and any @href with a fragment identifier) to bnodes. The consequence is that once in a triple store, no other triples outside of the original document could refer to this data -- it would really be anonymous.

Note that there is nothing to stop someone referring to this @id from another set of triples. However, by taking the second approach (converting @id to bnodes) we ensure that this external reference is actually about the HTML node and not Ben. For example, it would be legitimate for some editing software to say that this node was created on Friday, but by making the node anonymous we don't end up with the problem that we now have a set of triples that say that Ben was created by some software on Friday.


NAMED NODES

So, the only bit missing is if I really did want to allow people to make statements about my statements. What if I wanted to actually say that this really is the definitive location for Ben's data?
I would say that we already have a mechanism, and that is to use @about:

  <a rel="dc:creator" href="#ben">Ben Adida</a>.
  <meta about="#ben" property="foaf:name" content="Ben Adida" />

We are therefore asking authors to *explicitly* say that they want their data to be referenceable from external statements, which I believe is actually more 'correct' for an HTML author.

To put it a different way, whilst the RDF author makes all their data accessible and uses rdf:nodeID when they want to 'hide' it, the HTML author who knows nothing of RDF is by default creating 'local' data. It's only if they want to 'publish' their data that they need to understand a little more about RDF, and move from @id to @about.

NOTE: For this to work, we need to be able to distinguish between @href pointing to an anonymous node, and @href pointing to a named node when serialising. However, I think this is possible, and in fact, the only check you need to do is whether an @id exists with that name. If there is no @id, then we serialise as normal, even if there is no @about with that name.

NOTE: We'd need to decide whether @id and @about can exist on the same element. At first sight it looks like it would be best if they didn't.


TAG AND URIs

Finally, as I said on the call, I don't believe that munging the URIs into bnodes breaks anything at the TAG level. The URI for the *HTML node* is still intact, and @id is still being used in the usual way. What we are saying is that at the level of addressing HTML nodes, @id retains its current and accepted use, but at the level of naming concepts (or metadata for serialisation) it does not have the same use, and in fact, I would say that it *cannot* have its current use without causing the "Ben was created by GoLive on Friday" type of problems.

Regards,

Mark

[1] <http://www.w3.org/MarkUp/2004/rdf-a.html#div248219168>

Mark Birbeck
CEO
x-port.net Ltd.
e: Mark.Birbeck@x-port.net
t: +44 (0) 20 7689 9232
w: http://www.formsPlayer.com/
b: http://internet-apps.blogspot.com/

Download our XForms processor from
http://www.formsPlayer.com/
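
The serialisation rule proposed under CONVERT TO BNODES and NAMED NODES could be sketched roughly as below. This is only an illustration, not anything from the RDF/A draft: the function name serialise, the "<>" placeholder for the document's own URI, and the tuple output are all assumptions made for the example. It also assumes @id values are written without a leading '#', ignores the inheritance rules, and applies @id as the subject on any element that carries a statement.

```python
import xml.etree.ElementTree as ET

DOC = "<>"  # hypothetical placeholder for the document's own URI


def serialise(xhtml):
    """Return (subject, predicate, object) triples for an XHTML fragment."""
    root = ET.fromstring(xhtml)
    # Every @id in the fragment is a candidate anonymous node.
    ids = {el.get("id") for el in root.iter() if el.get("id")}

    def resource(ref):
        # A fragment reference becomes a bnode only if a matching @id
        # exists; otherwise it is serialised as a normal URI reference.
        if ref.startswith("#") and ref[1:] in ids:
            return "_:" + ref[1:]
        return "<" + ref + ">"

    triples = []
    for el in root.iter():
        if el.get("about") is not None:
            subject = "<" + el.get("about") + ">"  # explicitly named node
        elif el.get("id") is not None:
            subject = "_:" + el.get("id")          # local, anonymous node
        else:
            subject = DOC                          # statement about the document
        if el.get("rel") and el.get("href"):
            triples.append((subject, el.get("rel"), resource(el.get("href"))))
        if el.get("property") and el.get("content") is not None:
            literal = '"' + el.get("content") + '"'
            triples.append((subject, el.get("property"), literal))
    return triples


anonymous = """<p>
  <a rel="dc:creator" href="#ben">Ben Adida</a>.
  <meta id="ben" property="foaf:name" content="Ben Adida" />
</p>"""

# Same markup, but the author 'publishes' the node by using @about.
named = anonymous.replace('id="ben"', 'about="#ben"')

for s, p, o in serialise(anonymous) + serialise(named):
    print(s, p, o, ".")
```

With the @id version, both triples use the bnode _:ben, so nothing outside the document can refer to Ben through it. With the @about version no matching @id exists, so the same @href is serialised as the ordinary URI reference <#ben>.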
Received on Wednesday, 13 April 2005 14:01:57 UTC