
Re: Adding a datatype for HTML literals to RDF (ISSUE-63)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Wed, 02 May 2012 19:15:43 +0100
Message-ID: <4FA179CF.1020004@epimorphics.com>
To: public-rdf-wg@w3.org


On 02/05/12 16:18, Ivan Herman wrote:
>
> On May 2, 2012, at 16:40 , Richard Cyganiak wrote:
>
>> On 2 May 2012, at 15:35, Ivan Herman wrote:
>>>> HTML5 also defines the XHTML5 syntax,
>>>
>>> yes
>>>
>>>> and the spec for this includes an algorithm for serializing HTML DOMs to XHTML fragments or XHTML documents.
>>>
>>> Is this formally defined in the HTML5 document? I was looking for it, but I may have missed it.
>>
>> Here it is:
>> http://www.w3.org/TR/html5/the-xhtml-syntax.html#serializing-xhtml-fragments
>>
>> It doesn't spell everything out in detail; it essentially just says, “the result must be XML and isomorphic to the DOM tree”, and then spells out only the non-obvious corner cases. To me this seems clear enough.
>>
>
> Hm. Indeed. So we have
>
> HTML5  -- covered by the HTML5 spec -->  DOM
> DOM    -- covered by the HTML5 spec -->  XHTML5
> XHTML5 -- covered by the XML spec   -->  XML Infoset
>
> ie, there seems to be a chain of specs HTML5 --->  XML Infoset. A bit convoluted, but it would work (and I prefer referring to external documents to spending our time defining these ourselves).
>
> The nice point is that we do not get yet another value space, just the same one as for XML Literal

So ... :-)

If in one place I write a literal ^^rdf:XMLLiteral whose value is infoset X,
and elsewhere write a literal ^^rdf:HTML5 whose value is also infoset X,

then they are the same value.

Like "1"^^xsd:integer and "1"^^xsd:double?

Or is one a derived type of the other?
Like "1"^^xsd:integer and "1"^^xsd:decimal?
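
To make the "same value" case concrete, here is a minimal sketch. It is
not taken from the HTML5 or RDF specs. It assumes Python's html.parser
as a very rough stand-in for the HTML5 parsing algorithm and ElementTree
as a stand-in for an infoset, and it ignores namespaces. FragmentBuilder,
html_value, xml_value, canon and the <fragment> wrapper element are all
hypothetical names.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Partial list of HTML void elements (no end tag, no children).
VOID = {"br", "hr", "img", "input", "link", "meta"}


class FragmentBuilder(HTMLParser):
    """Rough stand-in for HTML5 parsing: builds an ElementTree tree."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("fragment")  # hypothetical wrapper element
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        attrib = {name: (value or "") for name, value in attrs}
        el = ET.SubElement(self.stack[-1], tag, attrib)
        if tag not in VOID:
            self.stack.append(el)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name, if there is one.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):
            # Text after a child element is stored as that child's tail.
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def html_value(lexical):
    """Map an HTML lexical form to a tree (assumed L2V mapping)."""
    builder = FragmentBuilder()
    builder.feed(lexical)
    builder.close()
    return builder.root


def xml_value(lexical):
    """Map an XML lexical form to a tree."""
    return ET.fromstring("<fragment>" + lexical + "</fragment>")


def canon(el):
    """Comparable form of a tree: tag, sorted attributes, text, tail, children."""
    return (
        el.tag,
        tuple(sorted(el.attrib.items())),
        el.text or "",
        el.tail or "",
        tuple(canon(child) for child in el),
    )


# Different lexical forms, different datatypes, same tree under this sketch.
same = canon(html_value("<p>a<br>b</p>")) == canon(xml_value("<p>a<br/>b</p>"))
print(same)
```

Under these assumptions the HTML form "<p>a<br>b</p>" and the XML form
"<p>a<br/>b</p>" map to one and the same tree, which is the situation the
question above is about.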

I think this is possibly neither obvious nor useful: being HTML conveys
the sense that the purpose is to be rendered; XML literals do not.

But two things that render the same can be different infosets so user 
expectations of "equality" may not be met.  e.g.
  <a href="..."><i>x</i></a>
and
  <i><a href="...">x</a></i>
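
As a rough check of that pair, a minimal sketch: it assumes ElementTree
as a stand-in for an infoset and treats "renders the same" as nothing
more than "same text content", which is a simplification. shape and
text_of are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

first = ET.fromstring('<a href="..."><i>x</i></a>')
second = ET.fromstring('<i><a href="...">x</a></i>')


def shape(el):
    """Structure of the tree: tag, sorted attributes, text, children."""
    return (
        el.tag,
        tuple(sorted(el.attrib.items())),
        el.text or "",
        tuple(shape(child) for child in el),
    )


def text_of(el):
    """All character data in document order (crude proxy for rendering)."""
    return "".join(el.itertext())


same_text = text_of(first) == text_of(second)
same_tree = shape(first) == shape(second)
print(same_text, same_tree)
```

The text content matches but the trees do not, so equality defined on
infosets would call these two literals different.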


I think I'm saying, start simple, prove a need for more complicated.

We can define a value space that is all character sequences (and is 
disjoint from xsd:string).  Do we need to be more complicated?  What's 
the use case?

(Not all RDF systems have access to infoset support code now that we 
are standardising Turtle and N-Triples.)
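
A minimal sketch of that simple option, assuming the lexical-to-value
mapping is the identity on character sequences and that values are
tagged with their datatype to keep them disjoint from xsd:string. The
Literal class, the value function and the "rdf:HTML5" name are
placeholders for illustration, not an agreed design.

```python
from dataclasses import dataclass

# Placeholder datatype names, written as prefixed names for brevity.
RDF_HTML5 = "rdf:HTML5"
XSD_STRING = "xsd:string"


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: str


def value(lit):
    """Identity L2V mapping; the datatype tag keeps value spaces disjoint."""
    return (lit.datatype, lit.lexical)


html_lit = Literal("<b>x</b>", RDF_HTML5)
plain_lit = Literal("<b>x</b>", XSD_STRING)

# Same characters, different datatypes: different values.
print(value(html_lit) == value(plain_lit))

# Same datatype, same characters: same value. Plain string comparison only.
print(value(html_lit) == value(Literal("<b>x</b>", RDF_HTML5)))
```

Equality here needs nothing beyond string comparison, so a system with
only a Turtle or N-Triples parser can implement it.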

	Andy

>
> Ivan
>
>> Best,
>> Richard
>>
>>
>>
>>>> And I guess in theory, DOMs and XML Infosets should be isomorphic, no?
>>>
>>> In theory:-) To be checked. There may be corner cases.
>>>
>>>>
>>>> Between all these transformations, there should be something that works for us. The devil is in the details of course.
>>>
>>> Exactly...
>>>
>>>>
>>>> Or we could just avoid all of that trouble and simply define the value space of the HTML datatype as identical to the lexical space.
>>>
>>> And then we are back to the same issue as we had with XML Literals. Except that... there is no such thing as a formal canonical HTML5
>>>
>>> Ivan
>>>
>>>>
>>>> Best,
>>>> Richard
>>>>
>>>>
>>>>>
>>>>> Just some food for thought...
>>>>>
>>>>> Ivan
>>>>>
>>>>>
>>>>> On May 1, 2012, at 18:41 , Gavin Carothers wrote:
>>>>>
>>>>>> On Tue, May 1, 2012 at 6:46 AM, Richard Cyganiak<richard@cyganiak.de>  wrote:
>>>>>>> All,
>>>>>>>
>>>>>>> The 2004 WG worked under the assumption that the future of HTML was XHTML, and that the use case of shipping HTML markup fragments as RDF payloads would be addressed by rdf:XMLLiteral. But in 2012, shipping HTML fragments really means HTML5. Is rdf:XMLLiteral still adequate for this task? Is a new datatype with a lexical space consisting of HTML5 fragments needed? This question is ISSUE-63.
>>>>>>>
>>>>>>> I think it would be useful to have a straw poll sometime soon on this question:
>>>>>>>
>>>>>>> PROPOSAL: RDF-WG will work on an HTML datatype that would be defined in RDF Concepts.
>>>>>>
>>>>>> +1, and for internationalization it should be a required datatype. It
>>>>>> might also have a simple syntax in Turtle (though that would likely
>>>>>> require a new last call, but a Web format that doesn't understand HTML
>>>>>> doesn't seem like much of a web format)
>>>>>>
>>>>>>>
>>>>>>> If there is general support for this, then we could start work on the details of the datatype definition (lexical space, value space, L2V mapping and so on).
>>>>>>>
>>>>>>> All the best,
>>>>>>> Richard
>>>>>>
>>>>>
>>>>>
>>>>> ----
>>>>> Ivan Herman, W3C Semantic Web Activity Lead
>>>>> Home: http://www.w3.org/People/Ivan/
>>>>> mobile: +31-641044153
>>>>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>
>
Received on Wednesday, 2 May 2012 18:16:18 GMT
