- From: Steve Harris <steve.harris@garlik.com>
- Date: Tue, 8 May 2012 10:17:32 -0700
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- Cc: Richard Cyganiak <richard@cyganiak.de>, public-rdf-wg@w3.org
+1, my guess is that it would mean there are not very many conforming implementations, and an HTML datatype is useful without equality, for store and display, e.g. in a CMS.

- Steve

On 3 May 2012, at 02:27, Andy Seaborne wrote:

> On 03/05/12 09:19, Richard Cyganiak wrote:
>> Hi Andy,
>>
>> It sounds like you'd rather prefer an HTML datatype with a simple 1:1
>> correspondence between lexical space and value space.
>
> I think that's a viable approach, yes.
>
>> Your objection seems to be that something more complex isn't really
>> needed. Which might be true, but do you think that something more
>> complex would actually do any harm, and would be worse?
>
> I'm not objecting.
>
> I'm simply putting forward a case because I felt that the conversation was heading to infoset-value without much consideration of usage.
>
> The primary UC is passing around display fragments. Better dc:title.
>
> One (implementation) argument is that some systems only have DOM access.
> Another is that other systems don't have an HTML5 parser at all.
>
> Given experiences of rdf:XMLLiterals, not just the fact they are hard-wired into RDF, it is not obvious, to me at least, that a complex scheme is a good idea.
>
>> And is this preference for a simpler scheme from an implementer's
>> point of view, or is it from a WG resources/spec complexity point of
>> view, or something else?
>
> Yes (implementation generally).
>
> If people in the WG want to spend time on infoset-value, that's fine.
>
> Andy
>
>>
>> Thanks, Richard
>>
>>
>> On 2 May 2012, at 21:47, Andy Seaborne wrote:
>>> On 02/05/12 20:29, Richard Cyganiak wrote:
>>>> On 2 May 2012, at 19:15, Andy Seaborne wrote:
>>>>> I think I'm saying, start simple, prove a need for more
>>>>> complicated.
>>>>>
>>>>> We can define a value space that is all character sequences
>>>>> (and is disjoint from xsd:string). Do we need to be more
>>>>> complicated? What's the use case?
>>>>
>>>> One use case might be RDFa parsers with HTML literal support.
>>>>
>>>> Let's say you have @datatype="rdf:HTMLLiteral" on some element,
>>>> and the element contains text with markup, and the desire is that
>>>> the resulting HTML literal contains the text with markup intact.
>>>>
>>>> Now the RDFa parser may not have access to the actual HTML
>>>> string, but only to a representation that has already been parsed
>>>> into a DOM tree.
>>>>
>>>> So the parser may have to serialize the DOM into a string, which
>>>> would probably be different from the original string.
>>>
>>> Certainly something to consider.
>>>
>>> Thought: if the original string isn't available, does it matter?
>>> Will it be available to anyone else?
>>>
>>>>
>>>> (Or is this nonsense and the parser could always just do
>>>> myDOMElement.innerHTML to get the original HTML?)
>>>
>>> I'm insufficiently up with the tool space to know. (gavin?)
>>>
>>>>
>>>> Anyways, the advantage of having a value space that is isomorphic
>>>> to the DOM is that you can parse and re-serialize the HTML and
>>>> still get the same value.
>>>>
>>>>> (Not all RDF systems have access to info set support code now
>>>>> that we are standardising Turtle and N-triples.)
>>>>
>>>> Yeah and that's why we're trying to change rdf:XMLLiteral to make
>>>> it optional and to relax its lexical space.
>>>>
>>>> I imagine that rdf:HTMLLiteral would be optional too, and the
>>>> lexical space should certainly be as unrestrictive as possible.
>>>>
>>>> Only those who want to compare HTML literals, or those who *need*
>>>> to parse and re-serialize HTML literals, need to care what the
>>>> value space is. (And yeah, if we can't come up with evidence that
>>>> some systems need to do one of those, then there's little point
>>>> in defining anything more complicated than a 1:1 L2V mapping.)
>>>
>>> Comparison may be done in another system - these literals are
>>> published and ingested by another system that might be asked if two
>>> literals are the same. e.g. a reasoner or a SPARQL engine.
>>> Whether the ability to value-equals two literals with different
>>> lexical forms is sufficiently important, I can't say.
>>>
>>> I feel that this isn't that likely - HTML5 literals are display
>>> material to be passed about. For that, equality processing is
>>> unlikely, and the fragments go in and come out on some generated
>>> HTML.
>>>
>>> Andy
>>>
>>>>
>>>> Best, Richard
>>>>
>>>>>
>>>>> Andy
>>>>>
>>>>>>
>>>>>> Ivan
>>>>>>
>>>>>>> Best, Richard
>>>>>>>
>>>>>>>>> And I guess in theory, DOMs and XML Infosets should be
>>>>>>>>> isomorphic, no?
>>>>>>>>
>>>>>>>> In theory:-) To be checked. There may be corner cases.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Between all these transformations, there should be
>>>>>>>>> something that works for us. The devil is in the
>>>>>>>>> details of course.
>>>>>>>>
>>>>>>>> Exactly...
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Or we could just avoid all of that trouble and simply
>>>>>>>>> define the value space of the HTML datatype as
>>>>>>>>> identical to the lexical space.
>>>>>>>>
>>>>>>>> And then we are back to the same issue as we had with
>>>>>>>> XML Literals. Except that... there is no such thing as a
>>>>>>>> formal canonical HTML5
>>>>>>>>
>>>>>>>> Ivan
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best, Richard
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Just some food for thoughts...
>>>>>>>>>>
>>>>>>>>>> Ivan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On May 1, 2012, at 18:41 , Gavin Carothers wrote:
>>>>>>>>>>
>>>>>>>>>>> On Tue, May 1, 2012 at 6:46 AM, Richard
>>>>>>>>>>> Cyganiak<richard@cyganiak.de> wrote:
>>>>>>>>>>>> All,
>>>>>>>>>>>>
>>>>>>>>>>>> The 2004 WG worked under the assumption that the
>>>>>>>>>>>> future of HTML was XHTML, and that the use case
>>>>>>>>>>>> of shipping HTML markup fragments as RDF payloads
>>>>>>>>>>>> would be addressed by rdf:XMLLiteral. But in
>>>>>>>>>>>> 2012, shipping HTML fragments really means HTML5.
>>>>>>>>>>>> Is rdf:XMLLiteral still adequate for this task?
>>>>>>>>>>>> Is a new datatype with a lexical space consisting
>>>>>>>>>>>> of HTML5 fragments needed? This question is
>>>>>>>>>>>> ISSUE-63.
>>>>>>>>>>>>
>>>>>>>>>>>> I think it would be useful to have a straw poll
>>>>>>>>>>>> sometime soon on this question:
>>>>>>>>>>>>
>>>>>>>>>>>> PROPOSAL: RDF-WG will work on an HTML datatype
>>>>>>>>>>>> that would be defined in RDF Concepts.
>>>>>>>>>>>
>>>>>>>>>>> +1, and for internationalization should be a
>>>>>>>>>>> required datatype, might also have a simple syntax
>>>>>>>>>>> in Turtle (though would likely require a new last
>>>>>>>>>>> call but a Web formatting that doesn't understand
>>>>>>>>>>> HTML doesn't seem like much of a web format)
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> If there is general support for this, then we
>>>>>>>>>>>> could start work on the details of the datatype
>>>>>>>>>>>> definition (lexical space, value space, L2V
>>>>>>>>>>>> mapping and so on).
>>>>>>>>>>>>
>>>>>>>>>>>> All the best, Richard
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ---- Ivan Herman, W3C Semantic Web Activity Lead
>>>>>>>>>> Home: http://www.w3.org/People/Ivan/ mobile:
>>>>>>>>>> +31-641044153 FOAF:
>>>>>>>>>> http://www.ivan-herman.net/foaf.rdf

--
Steve Harris, CTO
Garlik, a part of Experian
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203 http://www.garlik.com/
Registered in England and Wales 653331 VAT # 887 1335 93
Registered office: Landmark House, Experian Way, NG2 Business Park, Nottingham, Nottinghamshire, England NG80 1ZZ
Received on Tuesday, 8 May 2012 17:18:01 UTC
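
Editorial illustration, not part of the original message: the trade-off debated in this thread (a 1:1 lexical-to-value mapping versus a value space derived from the parsed markup) can be sketched roughly as below. This is only a minimal sketch under stated assumptions. The function names `equal_simple`, `equal_parsed`, and `parsed_value` are hypothetical, and Python's standard `html.parser` is a stand-in that is not a conforming HTML5 parser, so it only approximates the DOM or infoset value space the thread has in mind.

```python
from html.parser import HTMLParser


class _EventCollector(HTMLParser):
    """Records a normalized event stream for an HTML fragment.

    Assumption: this event stream stands in for a DOM-like value;
    html.parser is NOT an HTML5-conformant parser.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names; sort attributes
        # by name so attribute order in the source does not matter.
        ordered = tuple(sorted(attrs, key=lambda kv: kv[0]))
        self.events.append(("start", tag, ordered))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def parsed_value(lexical):
    """Hypothetical L2V mapping: lexical form -> parsed event stream."""
    collector = _EventCollector()
    collector.feed(lexical)
    collector.close()
    return tuple(collector.events)


def equal_simple(a, b):
    """1:1 mapping: the value is the character sequence itself."""
    return a == b


def equal_parsed(a, b):
    """Parsed mapping: equal if the fragments parse to the same events."""
    return parsed_value(a) == parsed_value(b)


original = '<p class="note">Hi</p>'
reserialized = "<P CLASS='note'>Hi</P>"  # same markup, different spelling

print(equal_simple(original, reserialized))
print(equal_parsed(original, reserialized))
```

Under the simple scheme the two literals above are distinct values, which is harmless for store-and-display use; only a system that must compare literals, or that re-serializes from a DOM, would notice the difference.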