- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Wed, 02 May 2012 21:47:47 +0100
- To: Richard Cyganiak <richard@cyganiak.de>
- CC: public-rdf-wg@w3.org
On 02/05/12 20:29, Richard Cyganiak wrote:
> On 2 May 2012, at 19:15, Andy Seaborne wrote:
>> I think I'm saying, start simple, prove a need for more
>> complicated.
>>
>> We can define a value space that is all character sequences (and is
>> disjoint from xsd:string). Do we need to be more complicated?
>> What's the use case?
>
> One use case might be RDFa parsers with HTML literal support.
>
> Let's say you have @datatype="rdf:HTMLLiteral" on some element, and
> the element contains text with markup, and the desire is that the
> resulting HTML literal contains the text with markup intact.
>
> Now the RDFa parser may not have access to the actual HTML string,
> but only to a representation that has already been parsed into a DOM
> tree.
>
> So the parser may have to serialize the DOM into a string, which
> would probably be different from the original string.

Certainly something to consider.

Thought: if the original string isn't available, does it matter? Will it
be available to anyone else?

>
> (Or is this nonsense and the parser could always just do
> myDOMElement.innerHTML to get the original HTML?)

I'm insufficiently up with the tool space to know. (Gavin?)

>
> Anyways, the advantage of having a value space that is isomorphic to
> the DOM is that you can parse and re-serialize the HTML and still get
> the same value.
>
>> (Not all RDF systems have access to info set support code now that
>> we are standardising Turtle and N-triples.)
>
> Yeah and that's why we're trying to change rdf:XMLLiteral to make it
> optional and to relax its lexical space.
>
> I imagine that rdf:HTMLLiteral would be optional too, and the lexical
> space should certainly be as unrestrictive as possible.
>
> Only those who want to compare HTML literals, or those who *need* to
> parse and re-serialize HTML literals, need to care what the value
> space is. (And yeah, if we can't come up with evidence that some
> systems need to do one of those, then there's little point in
> defining anything more complicated than a 1:1 L2V mapping.)

Comparison may be done in another system: these literals are published
and then ingested by another system that might be asked whether two
literals are the same, e.g. a reasoner or a SPARQL engine.

Whether the ability to value-equals two literals with different lexical
forms is sufficiently important, I can't say.

I feel that this isn't that likely. HTML5 literals are display material
to be passed about. For that, equality processing is unlikely, and the
fragments go in and come out on some generated HTML.

	Andy

>
> Best, Richard
>
>>
>> Andy
>>
>>>
>>> Ivan
>>>
>>>> Best, Richard
>>>>
>>>>>> And I guess in theory, DOMs and XML Infosets should be
>>>>>> isomorphic, no?
>>>>>
>>>>> In theory:-) To be checked. There may be corner cases.
>>>>>
>>>>>>
>>>>>> Between all these transformations, there should be
>>>>>> something that works for us. The devil is in the details of
>>>>>> course.
>>>>>
>>>>> Exactly...
>>>>>
>>>>>>
>>>>>> Or we could just avoid all of that trouble and simply
>>>>>> define the value space of the HTML datatype as identical to
>>>>>> the lexical space.
>>>>>
>>>>> And then we are back to the same issue as we had with XML
>>>>> Literals. Except that... there is no such thing as a formal
>>>>> canonical HTML5
>>>>>
>>>>> Ivan
>>>>>
>>>>>>
>>>>>> Best, Richard
>>>>>>
>>>>>>>
>>>>>>> Just some food for thoughts...
>>>>>>>
>>>>>>> Ivan
>>>>>>>
>>>>>>> On May 1, 2012, at 18:41 , Gavin Carothers wrote:
>>>>>>>
>>>>>>>> On Tue, May 1, 2012 at 6:46 AM, Richard
>>>>>>>> Cyganiak<richard@cyganiak.de> wrote:
>>>>>>>>> All,
>>>>>>>>>
>>>>>>>>> The 2004 WG worked under the assumption that the
>>>>>>>>> future of HTML was XHTML, and that the use case of
>>>>>>>>> shipping HTML markup fragments as RDF payloads would
>>>>>>>>> be addressed by rdf:XMLLiteral. But in 2012, shipping
>>>>>>>>> HTML fragments really means HTML5. Is rdf:XMLLiteral
>>>>>>>>> still adequate for this task? Is a new datatype with
>>>>>>>>> a lexical space consisting of HTML5 fragments needed?
>>>>>>>>> This question is ISSUE-63.
>>>>>>>>>
>>>>>>>>> I think it would be useful to have a straw poll
>>>>>>>>> sometime soon on this question:
>>>>>>>>>
>>>>>>>>> PROPOSAL: RDF-WG will work on an HTML datatype that
>>>>>>>>> would be defined in RDF Concepts.
>>>>>>>>
>>>>>>>> +1, and for internationalization should be a required
>>>>>>>> datatype, might also have a simple syntax in Turtle
>>>>>>>> (though would likely require a new last call but a Web
>>>>>>>> formating that doesn't understand HTML doesn't seem
>>>>>>>> like much of a web format)
>>>>>>>>
>>>>>>>>>
>>>>>>>>> If there is general support for this, then we could
>>>>>>>>> start work on the details of the datatype definition
>>>>>>>>> (lexical space, value space, L2V mapping and so on).
>>>>>>>>>
>>>>>>>>> All the best, Richard
>>>>>>>>
>>>>>>>
>>>>>>> ----
>>>>>>> Ivan Herman, W3C Semantic Web Activity Lead
>>>>>>> Home: http://www.w3.org/People/Ivan/
>>>>>>> mobile: +31-641044153
>>>>>>> FOAF: http://www.ivan-herman.net/foaf.rdf
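
To make the trade-off discussed in this message concrete, here is a minimal sketch of the two candidate L2V mappings: the simple 1:1 mapping, where the value is the lexical form itself, and a parse-based mapping, where the value is a DOM-like structure. This is only an illustration under stated assumptions, not a proposed definition of rdf:HTMLLiteral. The function names `l2v_identity` and `l2v_parsed` are hypothetical, and Python's standard `html.parser` is used as a crude stand-in for a real HTML5 parser and DOM.

```python
from html.parser import HTMLParser


class _EventCollector(HTMLParser):
    """Records parse events as a rough, order-preserving stand-in for a DOM."""

    def __init__(self):
        # convert_charrefs=True decodes references such as &amp; inside text.
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lower-cased; sorting the attributes
        # makes the value independent of attribute order.
        self.events.append(("start", tag, tuple(sorted(attrs))))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Merge adjacent text chunks so chunking does not affect the value.
        if self.events and self.events[-1][0] == "text":
            self.events[-1] = ("text", self.events[-1][1] + data)
        else:
            self.events.append(("text", data))


def l2v_identity(lexical):
    """Hypothetical 1:1 mapping: the value is the character sequence itself."""
    return lexical


def l2v_parsed(lexical):
    """Hypothetical parse-based mapping: the value is the parsed structure."""
    collector = _EventCollector()
    collector.feed(lexical)
    collector.close()
    return tuple(collector.events)


# Two lexical forms that differ in tag case, quoting, attribute order,
# and the character reference used for the ampersand.
a = '<B class="x" id="y">Hi &amp; bye</B>'
b = "<b id='y' class='x'>Hi &#38; bye</b>"

print(l2v_identity(a) == l2v_identity(b))
print(l2v_parsed(a) == l2v_parsed(b))
```

Under the 1:1 mapping the two literals are different values, while under the parse-based mapping they are equal. That is the only place the choice becomes visible: when some system compares literals, or parses and re-serializes them.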
Received on Wednesday, 2 May 2012 20:48:18 UTC