Re: Publish RDF Concepts as revised WD? (was: Re: Agenda 16 May telecon) from Steve Harris on 2012-05-17 (public-rdf-wg@w3.org from May 2012)

From: Steve Harris <steve.harris@garlik.com>
Date: Thu, 17 May 2012 16:48:59 +0100
To: Ivan Herman <ivan@w3.org>
Cc: Richard Cyganiak <richard@cyganiak.de>, Guus Schreiber <guus.schreiber@vu.nl>, RDF WG <public-rdf-wg@w3.org>
Message-Id: <9DB8AEFA-A4B7-4C26-A9CC-27003C79A724@garlik.com>

On 2012-05-17, at 16:27, Ivan Herman wrote:

> Steve,
> 
> I have hit a problem today, when implementing the HTML literal in my RDFa distiller software. The problem is that the underlying library does not give me a method whereby I can take the original file's element content from its corresponding DOM node and dump it into a string (that can be used for the RDF Literal). (Technically, the library does not implement the innerHTML attribute on the DOM Element Node.) I can only take the content and dump it into a string in full XML syntax.
> 
> So, for example, if I have, in HTML5+RDFa:
> 
> <div property="ex:something" datatype="rdf:HTML"><p>Inner</div>
> 
> then the only way I can generate an HTML Literal is 
> 
> <> ex:something "<p>Inner</p>"^^rdf:HTML .
> 
> Another library used by another tool may have the necessary method. For the same RDFa fragment it can generate:
> 
> <> ex:something "<p>Inner"^^rdf:HTML .
> 
> With the current definition of the HTML Literal both RDFa tools are correct and compliant: the HTML Literals, though lexically different, are identical in terms of the generated DOM, based on the HTML5 specification. I.e., if a triple store implements the identity in the value space, there would be no duplication of triples if both tools dump their content into the same space. If only lexical identity is available then... it becomes messy.

Sure, but what use case do you have for wanting to compare these semantically, even though they are lexically different? I know you can always do STR(?x) = STR(?y) && datatype(?x) = datatype(?y) && datatype(?x) = xsd:HTMLLiteral or whatever if you need to force the issue, but I still don't see a case where you'd need those two things to compare equal at the semantic level.

> I think comparing this to the problems around reification or bnodes is a bit of an exaggeration...:-)

Certainly. I was just trying to illustrate the problem. Many of us (I include myself) have been using these tools for way too long, and I don't think we appreciate how complex we've made it for people coming into this area.

I (occasionally) have to explain our tech stack to new developers coming in, and it's not getting any easier.

I can imagine struggling to explain why:

<x> <p> '''<p class="foo" id="bar">baz'''^^xsd:HTMLLiteral .
<y> <p> "<p id='bar' class='foo'>baz</p>"^^xsd:HTMLLiteral .

SELECT *
WHERE {
  ?s <p> ?o .
  FILTER(?o = "<p id='bar' class='foo'>baz</p>"^^xsd:HTMLLiteral)
}

returned two results.

On the other hand, an HTML5PARSE() function that takes an HTMLLiteral and normalises it following the HTML5 rules makes perfect sense* (though I'm still not sure how useful it is), and is less odd from an interop point of view, as you'd get an error if the store didn't implement HTML5PARSE() and you tried to use it.

You don't (typically; there may be exceptions) get an error if you import data containing ^^xsd:HTMLLiteral into a store that doesn't implement that datatype.

* … FILTER(HTML5PARSE(?x) = HTML5PARSE(?y)) is more obvious than the STR() trick above, IMHO.
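
To make the idea of an explicit normalising function concrete, here is a minimal Python sketch of what an HTML5PARSE()-style step might do to the two example literals. It is an illustration under stated assumptions, not the HTML5 parsing algorithm: the names html5parse and _Normaliser are hypothetical, and the only rules modelled are sorted attributes, one quoting style, "a new <p> closes an open <p>", and closing whatever is still open at the end of the fragment. A real implementation would need a conforming HTML5 parser.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take an end tag in HTML.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class _Normaliser(HTMLParser):
    """Hypothetical helper: re-serialise a fragment in one canonical form."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # serialised pieces, in document order
        self.open_tags = []  # stack of elements not yet closed

    def handle_starttag(self, tag, attrs):
        # Simplification: only the "<p> closes an open <p>" rule is modelled.
        if tag == "p" and self.open_tags and self.open_tags[-1] == "p":
            self.handle_endtag("p")
        # Sort attributes by name and always use double quotes.
        attr_text = "".join(
            ' {}="{}"'.format(name, escape(value or "", quote=True))
            for name, value in sorted(attrs, key=lambda a: a[0])
        )
        self.out.append("<{}{}>".format(tag, attr_text))
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag: ignored in this sketch
        # Close everything up to and including the matching open element.
        while self.open_tags:
            current = self.open_tags.pop()
            self.out.append("</{}>".format(current))
            if current == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def html5parse(lexical):
    """Return a normalised serialisation of an HTML fragment (sketch only)."""
    parser = _Normaliser()
    parser.feed(lexical)
    parser.close()
    # Emit the end tags that the fragment left implicit.
    while parser.open_tags:
        parser.out.append("</{}>".format(parser.open_tags.pop()))
    return "".join(parser.out)


x = '<p class="foo" id="bar">baz'
y = "<p id='bar' class='foo'>baz</p>"

print(x == y)                          # False: the lexical forms differ
print(html5parse(x) == html5parse(y))  # True: same normalised form
print(html5parse(x))                   # <p class="foo" id="bar">baz</p>
```

The point of the sketch is that the comparison becomes something the query author asks for explicitly, so a store that lacks the function fails loudly rather than silently returning a different number of results.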

- Steve

-- 
Steve Harris, CTO
Garlik, a part of Experian
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 653331 VAT # 887 1335 93
Registered office: Landmark House, Experian Way, Nottingham, Notts, NG80 1ZZ

Received on Thursday, 17 May 2012 15:49:31 UTC