- From: Mark Birbeck <mark.birbeck@x-port.net>
- Date: Mon, 19 Mar 2007 05:31:56 -0700
- To: public-rdf-in-xhtml-tf@w3.org
Hi Ivan,

> Wow. The discussion has really become suddenly intense. Cheer up guys,
> this is only technology:-)

Indeed. Kind of aggressive I feel, given that all I have done is try to find a solution to this problem.

Although I can't tell people what to think or how to argue, perhaps I can suggest that they at least take as their starting-point that the plain literal solution is the most obvious, and I would of course favour it if it could be made to work. I spent a lot of time on this a number of years ago, but at the end of the day I felt that ignoring mark-up that is placed in text by authors was just not acceptable.

As I have said elsewhere though, there is a modification of the current solution that I can see working, which is that we say that an element with no child element nodes becomes a plain literal, whilst elements with child element nodes become XML literals. This was discussed quite a long time ago by myself and Steven, but we never really pursued it, mainly because we thought people might find it unacceptable. Having said that, the issue of XML literals has only just recently started to be discussed again, so it's only now become pertinent. Encouragingly, a few months ago I implemented exactly this algorithm in my RDFa parser, and I believe it to be pretty straightforward.

But I still want to emphasise that the solution that simply does not work is to 'flatten' all mark-up in situations where the author has not specifically asked for XML literals.

> Anyway...
>
> Mark, I read all the arguments but, I am sorry to say, you still have
> not convinced me, and I still believe that the default should be plain
> literal....

Just so that we're clear, the current position in the RDFa spec is of XML literals, so the shoe is actually on the other foot...those opposed to it need to provide convincing arguments for *removing* this behaviour.
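
As an illustration of the 'mixed' rule described above (no child element nodes gives a plain literal, child element nodes give an XML literal), here is a minimal sketch in Python. It is not the code from the RDFa parser mentioned above; the function name `literal_for` and the use of Python's `xml.etree.ElementTree` are assumptions made only for this example, and a real parser would also have to deal with namespaces, language tags and the content attribute.

```python
import xml.etree.ElementTree as ET


def literal_for(element):
    """Return a (kind, value) pair for an element's content.

    Hypothetical helper, for illustration only:
    - no child element nodes -> ("plain", text content)
    - child element nodes    -> ("xml", inner mark-up as a string)
    """
    # len() on an Element counts its child elements only, not text nodes.
    if len(element) == 0:
        return ("plain", element.text or "")

    # Rebuild the inner mark-up: the leading text, then each child.
    # ET.tostring() serialises a child together with its tail text.
    inner = (element.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in element
    )
    return ("xml", inner)


einstein = ET.fromstring(
    "<span>E = mc<sup>2</sup>: The Most Urgent Problem of Our Time</span>"
)
simple = ET.fromstring("<span>This guy is truly intelligent</span>")

print(literal_for(einstein))
print(literal_for(simple))
```

With this rule the Einstein title keeps its sup mark-up as an XML literal, while the string with no child elements comes out as a plain literal.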
I have still to hear a good argument for using _only_ plain literals, but I'd also like to hear views on the 'mixed' approach of generating _either_ a plain literal _or_ an XML literal, as appropriate.

> There is the 'social' aspect Ian was referring to several times. Whether
> we like it or not, most of the RDF-s used out there use plain literals.
> I have seen very very rarely graphs with XMLLiteral, in fact, and I
> think I have used it only once myself. I do not think it is o.k. if the
> RDF graphs resulting from RDFa authoring get such a different flavour;
> they should 'blend in' the RDF world.

I'm having trouble squaring this with my understanding of RDF. Do you curl up in bed and 'read' RDF graphs? ;) Or are they processed by machines? And how many times have you used xsd:base64Binary? In other words, existing data storage patterns tell us nothing about future ones, features are not less important just because you haven't used them. Etc., etc.

There does not seem to be any technical reason why triples that originate from RDFa would have any trouble 'blending' with triples from any other source.

> And, although your argumentation
> around my old SPARQL issue was logically and technically correct, the
> fact still remains that lots of SPARQL queries, though scruffy, work
> with some of the assumptions that I made back then (ie, I did not check
> the language tag, for example). We cannot ignore that, it *is* part of
> the 80/20 cut. And remember: my SPARQL query *did* fail because of that!

I'm really not sure how to reply...you seem to be saying that whilst my argument was correct both logically and technically, it is unacceptable because it doesn't factor in your mistaken assumptions about RDF Concepts' notion of equality. If so, then you have me...
:) Don't forget, though, that I showed that your query failed _anyway_, regardless of RDFa; the mistaken assumptions you were making about RDF equality were already tripping you up with data from your RDF/XML documents.

> [A small remark to Dan Brickley: even if any XMLLiteral is, in fact, a
> general Literal according to RDFS, SPARQL endpoints do not necessarily
> have an RDFS reasoner. Ie, in practice this relationship will not be
> recognized...]

It has nothing to do with reasoning; it's the RDF Concepts document that says that *both* plain literals and typed literals are of type 'literal'. And also, SPARQL knows about both types, independent of RDF Schema.

> I also have a more technical issue. You convincingly argue with the
> Einstein example:
>
> E = mc<sup>2</sup>: The Most Urgent Problem of Our Time
>
> where the <sup> tag plays an essential, shall we say, semantic role.
> True. But I could just as well use another example, like
>
> This guy is <em>truly</em> intelligent
>
> where the author puts in the <em> tag for a visual emphasis only, but
> the real "semantics" he/she wants to convey is "This guy is truly
> intelligent", in which case the <em> tag really gets in the way (again,
> whether this usage of <em> is technically correct or not is besides the
> point; this *is* the way it is used many times!). The same holds for a
> number of cases: if the text in question is inside a <h1> tag, some sort
> of <span>, etc. In all those cases, keeping the extra XML tag in the
> graph is counter-intuitive to me. Although I have no statistics, my gut
> feeling tells me that these examples are in majority compared to the
> Einstein example.

What you describe is not accepted or conventional usage of HTML and XHTML, in that nowadays you wouldn't find many people who put 'em' into their mark-up just for some visual effect, whilst not wanting the 'em' to be part of the text's meaning.
The trend today is almost exactly the opposite, and is why I keep insisting that if an author has put mark-up into their document, we should preserve as much of it as possible.

> Finally, Ian's lingering question is still around: if I *want* a plain
> literal, ie, I *want* the system to get rid of the extra xml tags, what
> do I do? Does RDFa wants to introduce yet another keyword for this? Why
> not follow the default mechanism that is used both by RDF/XML and Turtle
> and nobody seems to have a problem with?

First, on producing plain literals, there is a way to do it, and that is to use the content attribute. But I would stress that it is the *need* for plain literals that is the edge case, since as I've tried to show, I can't find a situation yet where it makes a difference whether a simple string is represented by a plain literal or an XML literal.

But second, you are moving the goalposts when you say that you "*want* the system to get rid of the extra xml tags". Where did that requirement come from? I feel the need to stress again that we are dealing here with XHTML authors, and not RDF ones. Which XHTML authors will want plain literals, or even know what they are? And who would want their mark-up 'flattened' rather than remaining as mark-up?

And finally, we've agreed that we are not trying to support all of RDF, so if it's the case that it is not possible to create plain literals (which is not the case, but just say it was) then that is not in itself an argument against making XML literal the default, unless it can be shown that plain literals are needed for some significant use case.

> I am not 'officially' part of the Working Group, for obvious reasons,
> but I would think that this is an issue that, eventually, should be
> voted upon the group. This discussion has dragged on for a long time
> and, somehow, should be closed...

I don't know what to say to this either..."dragged on" is quite a loaded term.
Anyway, until someone can convincingly justify the removal of authors' mark-up then this issue still needs to be discussed, so I don't see the point in attempting to wind it up prematurely.

Regards,

Mark

--
Mark Birbeck, formsPlayer
mark.birbeck@x-port.net | +44 (0) 20 7689 9232
http://www.formsPlayer.com | http://internet-apps.blogspot.com
standards. innovation.
Received on Monday, 19 March 2007 12:32:02 UTC