Re: Adding a datatype for HTML literals to RDF (ISSUE-63)

On 2 May 2012, at 06:22, Ivan Herman wrote:

> A technical question, though.
> 
> For XML Literals, we have a nice definition (proposal) that the value space consists of XML Infosets, and that means we can say whether two literals should be considered identical.
> 
> What is the equivalent notion in HTML5? I had a very short chat with Mike Smith (staff contact at the HTML5 WG), and he has not seen any formal definition of when two HTML5 fragments would be considered identical. Any bright ideas here?

Probably only when they have identical lexical representations?

> The HTML5 spec goes into great detail on how an HTML5 document/fragment should be parsed into a DOM. That even handles cases where the HTML5 source is invalid. So... is there a formal definition of when two DOM trees are identical? Maybe it is obvious (at first glance it looks like it...) and we could say that the value space consists of (HTML5) DOM trees.

I would very much like to keep any process of that complexity out of RDF.
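
As a rough illustration of the two notions of identity discussed here, below is a minimal Python sketch. It assumes the standard-library html.parser as a stand-in: it is not a conforming HTML5 parser, does no HTML5 error recovery, and builds no DOM, so it only hints at the real complexity. The names EventCollector, parsed_events, lexically_equal and structurally_equal are hypothetical and appear in no spec.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Records a normalised stream of parse events for a fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; sort attributes by
        # name so that source order does not affect the comparison.
        ordered = tuple(sorted(attrs, key=lambda pair: pair[0]))
        self.events.append(("start", tag, ordered))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Merge adjacent text chunks so chunking does not matter.
        if self.events and self.events[-1][0] == "text":
            self.events[-1] = ("text", self.events[-1][1] + data)
        else:
            self.events.append(("text", data))


def parsed_events(fragment):
    collector = EventCollector()
    collector.feed(fragment)
    collector.close()
    return collector.events


def lexically_equal(a, b):
    # Identity by lexical representation: plain string comparison.
    return a == b


def structurally_equal(a, b):
    # Identity by parsed structure (an assumption standing in for
    # "same DOM tree"; a real definition would need the HTML5 parser).
    return parsed_events(a) == parsed_events(b)


a = '<p class="a" id="b">hi</p>'
b = "<P id='b' class='a'>hi</P>"
print(lexically_equal(a, b), structurally_equal(a, b))
```

The two fragments differ as strings but parse to the same event stream, which is the gap between the two candidate value spaces. The lexical test needs nothing beyond string equality; the structural one pulls a parser into the datatype definition.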

- Steve

> Which leads to another issue: *if* we define HTML5 that way, i.e., relying on the identity of DOM trees, maybe it is worth re-thinking the XML Literal case and using the same mechanism. Just for the sake of consistency...
> 
> Just some food for thought...
> 
> Ivan
> 
> 
> On May 1, 2012, at 18:41 , Gavin Carothers wrote:
> 
>> On Tue, May 1, 2012 at 6:46 AM, Richard Cyganiak <richard@cyganiak.de> wrote:
>>> All,
>>> 
>>> The 2004 WG worked under the assumption that the future of HTML was XHTML, and that the use case of shipping HTML markup fragments as RDF payloads would be addressed by rdf:XMLLiteral. But in 2012, shipping HTML fragments really means HTML5. Is rdf:XMLLiteral still adequate for this task? Is a new datatype with a lexical space consisting of HTML5 fragments needed? This question is ISSUE-63.
>>> 
>>> I think it would be useful to have a straw poll sometime soon on this question:
>>> 
>>> PROPOSAL: RDF-WG will work on an HTML datatype that would be defined in RDF Concepts.
>> 
>> +1, and for internationalization it should be a required datatype. It might
>> also have a simple syntax in Turtle (though that would likely require a new
>> last call, but a Web format that doesn't understand HTML doesn't
>> seem like much of a Web format)
>> 
>>> 
>>> If there is general support for this, then we could start work on the details of the datatype definition (lexical space, value space, L2V mapping and so on).
>>> 
>>> All the best,
>>> Richard
>> 
> 
> 
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> FOAF: http://www.ivan-herman.net/foaf.rdf
> 

-- 
Steve Harris, CTO
Garlik, a part of Experian 
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 653331 VAT # 887 1335 93
Registered office: Landmark House, Experian Way, NG2 Business Park, Nottingham, Nottinghamshire, England NG80 1ZZ

Received on Tuesday, 8 May 2012 17:14:19 UTC