- From: Mark Birbeck <mark.birbeck@x-port.net>
- Date: Tue, 31 May 2005 15:14:29 +0100
- To: <public-rdf-in-xhtml-tf@w3.org>
Hi all,

As Steven promised, the following are my attempts to make some progress on what is hopefully the only remaining issue in our RDF-in-XHTML2 story.

As it stands the problem is that we need bnodes, but we don't quite know how to mark them up. Why do we need them? I would say that it's because we don't want people accidentally making statements about things that they shouldn't be making statements about. There may be a more 'correct' set of RDF terminology to explain this, but I'll give you my understanding, which is that if I say this:

  _:a dc:creator "Mark Birbeck"

there is no way that someone else can say anything about _:a, since they cannot generate a reference to my anonymous nodes.

If we move to the world of XML, how is this requirement addressed?

In RDF/XML it's addressed by using the attribute nodeID, either as a subject or an object. Since RDF/XML is 'striped', there is never any ambiguity about whether we are dealing with a subject or an object, so we only need one attribute.

In XHTML 2 it's not yet addressed, but it's clear that if we were to use attributes to solve it, we would need two of them, since the syntax for carrying RDF in XHTML 2 allows one element to be used to represent an entire triple (we therefore need to differentiate between subjects and objects).

I have gone very deeply into a number of possible solutions, and will re-present them now. The first one I proposed publicly was the XPointer solution, although obviously it wasn't the first one I investigated--that was simply having two attributes. The next solution I proposed was making all @ids anonymous. I'll look at all three now:

  * two attributes method;
  * 'reverse-@id' method;
  * XPointer method.

TWO ATTRIBUTES

This appears to be the easiest, and short of acceptance of anything else, is probably acceptable to Steven and me.
The solution would be to mirror RDF/XML in having an attribute that indicates bnodes, but as pointed out earlier, we would need two of them (ignore the names in the examples below). For example, a statement about an anonymous node:

  <meta subjectbnode="a" property="dc:creator">Mark Birbeck</meta>

  _:a dc:creator "Mark Birbeck" .

and a statement using an anonymous node as an object:

  <link rel="dc:creator" objectbnode="b" />

  <> dc:creator _:b .

However, given it's so easy, why have we spent so much time and effort trying to find a better solution?

The main reason is that it doesn't really feel right to augment XHTML 2 with something that is actually relevant to the domain of RDF. That probably sounds wrong, since we have provided a means to carry RDF within XHTML 2! But the difference is that everything we have done along this road in the last year or more has been to try and find ways for the *ordinary* usage of XHTML to produce RDF. That's why we have leveraged <meta>, <link> and @rel--we haven't invented new tags and attributes.

The fact that some nodes should not be addressable outside of the document feels to me like a problem to do with triple stores and the use of URIs as the identifiers for the subjects of statements, and so if there is a solution to the problem that is outside of XHTML 2 then that is to be preferred.

If we have to add two new attributes, we will, but I'll also comment a little more on whether this solution is really as good as people think.

For a start, we are asking for XML to have two ways of naming things, dependent on how they will be referred to. This isn't the case in RDF/XML because it makes no claims other than being a serialisation for RDF. There is no XML @id for example, so there is no confusion or mixing up of levels. However, XHTML is not primarily a serialisation language for RDF, but a language that carries semantics which we hope to be able to easily map to RDF.
It *does* have a way of naming nodes already, and it seems very odd to introduce two ways of naming those nodes.

REVERSE-@ID

Another proposal I posted to this list was the idea that we flip everything on its head, so that any reference to a node in the document is *always* anonymous. This would give us this:

  <link rel="dc:creator" href="#b" />
  <span id="b">...</span>

  <> dc:creator _:b .

A serialiser would simply 'know' that fragments referring to named nodes are actually references to anonymous nodes. This would mean that the serialiser would have to look at the target of the statement to see what was being referred to (although actually that's just an ID look-up).

Note that this does not stop other people making statements about my elements--it simply stops your statements and mine running together. So if the Creative Commons document has a key paragraph:

  <div id="keyp">
    <meta property="dc:creator">John Doe</meta>
  </div>

we can all make references to it:

  <link rel="cc:xx" href="http://...#keyp" />

but they will not be merged with any statements made inside the target document:

  _:keyp dc:creator "John Doe" .
  <http://example.com/mydoc> cc:xx <http://...#keyp> .

The idea was that if an author really wanted their statements to 'merge' with the rest of the world's statements, then they would do this:

  <div about="#keyp">
    <meta property="dc:creator">John Doe</meta>
  </div>

  <http://...#keyp> dc:creator "John Doe" .
  <http://example.com/mydoc> cc:xx <http://...#keyp> .

One of the key advantages of this is that the author is generally creating 'local' metadata by default. I say 'generally' because the metadata about the document is still 'global', and will 'merge' with other data:

  <head>
    <meta property="dc:creator">Mark Birbeck</meta>
  </head>

It's just that now, any use of 'second-level' metadata will be 'local' unless made explicit.
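As a rough illustration of the reverse-@id rule described above, here is a minimal Python sketch of what a serialiser might do. It is only a sketch under assumptions: the function names (`subject_for`, `object_for`, `triples`) are hypothetical, the document URI is taken from the example above, and only <meta>, <link>, @id and @about are handled.

```python
import xml.etree.ElementTree as ET

BASE = "http://example.com/mydoc"  # assumed URI of the document being parsed


def subject_for(elem):
    """Subject implied by a container element, or None if it sets none."""
    about = elem.get("about")
    if about is not None:
        # @about is the explicit opt-in to a globally addressable subject.
        return "<" + (BASE + about if about.startswith("#") else about) + ">"
    if elem.get("id") is not None:
        # Reverse-@id: a plain @id names an anonymous (local) node.
        return "_:" + elem.get("id")
    return None


def object_for(href, ids):
    """Object implied by an @href; local fragments with a matching @id become bnodes."""
    if href.startswith("#") and href[1:] in ids:
        return "_:" + href[1:]  # the "ID look-up" mentioned above
    if href.startswith("#"):
        return "<" + BASE + href + ">"
    return "<" + href + ">"


def triples(xml_text):
    root = ET.fromstring(xml_text)
    ids = {e.get("id") for e in root.iter() if e.get("id")}
    out = []

    def walk(elem, subject):
        subject = subject_for(elem) or subject
        for child in elem:
            if child.tag == "meta" and child.get("property"):
                out.append((subject, child.get("property"), '"%s"' % (child.text or "")))
            elif child.tag == "link" and child.get("rel"):
                out.append((subject, child.get("rel"), object_for(child.get("href"), ids)))
            else:
                walk(child, subject)

    walk(root, "<" + BASE + ">")  # statements default to the document itself
    return out
```

Feeding it the <link>/<span id="b"> pair from above yields a triple whose object is _:b, while a <div about="#keyp"> wrapper yields a subject that is a full URI, which is the local-by-default behaviour the proposal is after.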
Another advantage of this technique is that it distinguishes between making further statements about the same subject, and making further statements about an HTML node. For example, these are all statements about the same thing:

  <div about="#keyp">
    <meta property="dc:creator">John Doe</meta>
  </div>
  ...
  <div about="#keyp">
    <meta property="dc:date">2005-05-31</meta>
  </div>

And this is a statement about the mark-up:

  <div id="a" about="#keyp">
    <meta property="dc:creator">John Doe</meta>
  </div>
  ...
  <div about="#a">
    <meta property="dc:creator">Jane Doe</meta>
  </div>

This says that the actual license (#keyp) was updated by John Doe, and the node in the HTML document (#a) that carries this meta-information, was updated by Jane Doe.

However, the big drawback is that we want to use @id to identify the target of a link traversal in XHTML, at the same time as allowing authors to make statements about the target:

  <a rel="cc:xx" href="http://...#keyp">license</a>

This still works, but over in the document itself, the author is unable to make any statements about the thing being linked to, since the presence of @id makes all of their further statements about the anonymous node. To put it a different way, you can't use the most basic of the XHTML document cross-referencing mechanisms, at the same time as making publicly available statements about your nodes.

XPOINTER

Which led me back to my first proposal, in the draft of RDF/A--the XPointer one! The syntax is like this:

  <link rel="dc:creator" href="#bnode('b')" />

  <> dc:creator _:b .

Although the XPointer expression has the form of a URI, the XPointer should first be de-referenced, so what is actually stored in the triple store is the *result* of executing the XPointer function, not this unaltered URI. It's essentially a 'request' to the serialiser to come up with a different URI than the one it would come up with ordinarily.

People have objected to it on "aesthetic" grounds, and on the basis that it creates confusion.
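To make the de-referencing step above concrete, here is a small Python sketch of how a serialiser might treat an @href. It is an assumption-laden illustration: the message does not define a grammar for the bnode() scheme, so the regular expression, the function name `resolve_object` and the default base URI are all hypothetical.

```python
import re
from urllib.parse import urljoin

# Assumed shape of the scheme: #bnode('name') or #bnode("name").
BNODE_RE = re.compile(r"""^#bnode\((['"])([^'"]+)\1\)$""")


def resolve_object(href, base="http://example.com/mydoc"):
    """Return the node to store in the triple for a given @href."""
    match = BNODE_RE.match(href)
    if match:
        # Store the *result* of the XPointer function, not the URI itself.
        return "_:" + match.group(2)
    # Anything else is an ordinary URI reference, resolved against the base.
    return "<" + urljoin(base, href) + ">"
```

With this, resolve_object("#bnode('b')") gives the anonymous node _:b, whereas an ordinary fragment such as "#keyp" still resolves to a full URI that others can refer to.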
Of course the aesthetic side is obviously going to be a matter of taste, although I will say that using an XPointer scheme does have the merit of taking the problem out of the realm of mark-up, and recognises the whole process for what it is--a request to the serialiser to 'cloak' the URI. It also saves inventing more attributes, which I think we should try to avoid if we can; at the moment incorporating RDF/A into XHTML 2 involves small, incremental changes, but introducing two new attributes creates the possibility of confusion even amongst those who don't need to use the new attributes.

As to the comments about confusion, I have to say that I don't agree. Firstly, I think the people who will usually be using bnodes will easily understand this. And secondly, the confusion would be with RDF/XML, and not RDF 'abstract', and I don't feel that there is an obligation to provide continuity with RDF/XML.

CONCLUSION

My original arguments for the XPointer approach were based on a lot of work to try and find a solution, and having tried a number of alternatives to accommodate various objections to it, I'm afraid I've been repeatedly drawn back to it.

I'd recommend that people reflect on the issues a little, and then we just adopt (flip a coin? ;)) one or the other solution.

Regards,

Mark

Mark Birbeck
CEO
x-port.net Ltd.

e: Mark.Birbeck@x-port.net
t: +44 (0) 20 7689 9232
w: http://www.formsPlayer.com/
b: http://internet-apps.blogspot.com/

Download our XForms processor from
http://www.formsPlayer.com/
Received on Monday, 6 June 2005 11:23:38 UTC