- From: Ivan Herman <ivan@w3.org>
- Date: Thu, 17 May 2012 17:27:21 +0200
- To: Steve Harris <steve.harris@garlik.com>
- Cc: Richard Cyganiak <richard@cyganiak.de>, Guus Schreiber <guus.schreiber@vu.nl>, RDF WG <public-rdf-wg@w3.org>
Steve,

I have hit a problem today, when implementing the HTML literal in my RDFa distiller software. The problem is that the underlying library does not give me a method whereby I can take the original file's element content from its corresponding DOM node and dump it into a string (that can be used for the RDF Literal). (Technically, the library does not implement the innerHTML attribute on the DOM Element Node.) I can only take the content and dump it into a string in full XML syntax. So, for example, if I have, in HTML5+RDFa:

<div property="ex:something" datatype="rdf:HTML"><p>Inner</div>

then the only way I can generate an HTML Literal is

<> ex:something "<p>Inner</p>"^^rdf:HTML .

Another library used by another tool may have the necessary method. For the same RDFa fragment it can generate:

<> ex:something "<p>Inner"^^rdf:HTML .

With the current definition of the HTML Literal both RDFa tools are correct and compliant: the HTML Literals, though lexically different, are identical in terms of the generated DOM, based on the HTML5 specification. Ie, if a triple store implements the identity in the value space, there would be no duplication of triples if both tools dump their content into the same space. If only lexical identity is available then... it becomes messy.

I think comparing this to the problems around reification or bnodes is a bit of an exaggeration...:-)

Cheers

Ivan

On May 17, 2012, at 16:45 , Steve Harris wrote:

> On 2012-05-16, at 16:18, Ivan Herman wrote:
>>
>> On May 16, 2012, at 14:17 , Steve Harris wrote:
>>>>
>>>> Regarding ISSUE-63, HTML datatype: I think this is a Good Thing and will be popular. The most contentious point is the definition of the value space. It appears complex (DOM DocumentFragment nodes), but in reality it makes implementation of conforming parser *simpler* because it allows them to produce any of a number of equivalent results. The complexity only affects systems that decide to implement value-based comparison for HTML literals, something that is entirely optional. I expect few or no systems to do it.
>>>
>>> I see limited utility in being able to do = comparisons on non-lexically identical HTML5 fragments. But I may well be missing some important usecases.
>>
>> If I look at
>>
>> http://lists.w3.org/Archives/Public/public-rdf-comments/2012Apr/0000.html
>>
>> the attributes added to an HTML Literal have a 'semantic' role (controlling translations, for example) and these may result in an RDF graph (eg, via RDFa). Interaction with the user, running beautifying tools, etc, may change the Literal fragments without changing the intended semantics (ie, by retabulating the fragment to make it pretty...). Having a clear way on comparing the literals without just lexically comparing them looks really important.
>>
>> I realize this is a bit vague, but the point is that the definition makes HTML Literals more robust against non-essential changes when combined with RDFa.
>
> Yeah, so that bit I get, but I just can't ground it in a concrete use-case.
>
> I can imagine for e.g. a CMS wanting to use RDF for storage, but then non-lexical comparison isn't important.
>
> There is an interoperability cost to specifying a complex value-space comparison, so I don't think we should take the decision lightly. OTOH if there's a really good use-case for non-lexical comparison, it would be annoying to have specified a lexical comparison.
>
> In my opinion, we as a community have a very bad track record* in making these decisions the right way first time - and we tend to go for the more complex wrong options, rather than the simpler wrong options, in my opinion.
>
> * bNodes, XMLLiteral, rdf:List, containers, reification, plain literals, … I'll stop before I get more controversial :)
>
> - Steve
>
> --
> Steve Harris, CTO
> Garlik, a part of Experian
> 1-3 Halford Road, Richmond, TW10 6AW, UK
> +44 20 8439 8203 http://www.garlik.com/
> Registered in England and Wales 653331 VAT # 887 1335 93
> Registered office: Landmark House, Experian Way, Nottingham, Notts, NG80 1ZZ
>

----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
FOAF: http://www.ivan-herman.net/foaf.rdf
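
Editorial illustration of the "<p>Inner" versus "<p>Inner</p>" case discussed in the message: a minimal sketch, in Python, of how lexical comparison and value-space comparison can disagree for the two literals. The names FragmentTreeBuilder and html_value are hypothetical and do not come from any RDF or RDFa library. The sketch uses the standard library's html.parser with a toy rule for implied end tags; it is not the HTML5 tree-construction algorithm, so a real implementation would likely need an HTML5-conformant parser instead.

```python
from html.parser import HTMLParser

# Elements that never have content; a partial list, enough for a sketch.
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


class FragmentTreeBuilder(HTMLParser):
    """Toy tree builder (assumption: NOT the HTML5 parsing algorithm).

    Each node is a tuple (name, sorted attributes, list of children).
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ("#fragment", (), [])
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = (tag, tuple(sorted(attrs)), [])
        self.stack[-1][2].append(node)
        if tag not in VOID_ELEMENTS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name, implicitly
        # closing anything still open inside it. Unmatched end tags
        # are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        children = self.stack[-1][2]
        # Merge adjacent text so chunking by the parser does not matter.
        if children and isinstance(children[-1], str):
            children[-1] += data
        else:
            children.append(data)


def html_value(lexical_form):
    """Map a lexical form to a comparable tree (stand-in for the DOM value)."""
    builder = FragmentTreeBuilder()
    builder.feed(lexical_form)
    builder.close()  # flush buffered text; open elements are implicitly closed
    return builder.root


tool_a = "<p>Inner</p>"  # serialized from a DOM in full XML syntax
tool_b = "<p>Inner"      # original HTML source, end tag omitted

lexically_equal = tool_a == tool_b
value_equal = html_value(tool_a) == html_value(tool_b)
print(lexically_equal, value_equal)
```

Under these assumptions the two lexical forms differ as strings but map to the same tree, which is the situation where a store comparing only lexical forms would keep two triples and a store comparing values would keep one.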
Received on Thursday, 17 May 2012 15:24:08 UTC