Re: XMLLiteral and HTML from Richard Light on 2013-12-12 (public-rdf-comments@w3.org from December 2013)

From: Richard Light <richard@light.demon.co.uk>
Date: Thu, 12 Dec 2013 13:59:33 +0000
To: Richard Cyganiak <richard@cyganiak.de>
CC: "public-rdf-comments@w3.org" <public-rdf-comments@w3.org>
Message-ID: <52A9C145.4000405@light.demon.co.uk>
On 10/12/2013 07:59, Richard Cyganiak wrote:
> Dear Richard,
>
> Thank you for your comment on the RDF 1.1 Concepts document.
>
> You question the purpose of the XMLLiteral and HTML datatypes, and 
> raise concerns about their implementation cost. Let me try to address 
> both questions.
>
> The purpose of both datatypes is to enable text with markup in RDF 
> graphs. The XMLLiteral datatype was added to the original 2004 spec 
> due to i18n requirements (e.g., bidirectional text, mixed-language 
> text, and Ruby markup). This datatype is now widely deployed for a 
> number of use cases, and removing it is realistically no longer possible.
>
> Since XHTML has not seen the adoption that was expected back in the 
> days of the previous WG, the HTML datatype has now been added as a 
> more author-friendly alternative that addresses the same requirements.
>
> The only RDF-WG specification that requires an XML parser for a 
> conforming implementation is RDF/XML. There are no conformance 
> criteria on any of the other documents that require an XML parser or 
> HTML parser.
>
> Implementing, for example, graph equivalence over these datatypes 
> would require such a parser, but no entailment regime requires that 
> these datatypes be recognised. Simply put, the datatypes are 
> optional. Implementations may elect to not support them, which means 
> they simply treat these datatypes like any other unrecognised 
> datatype: as strings that carry a marker for a certain syntax.
>
> Implementing XMLLiteral in RDF 1.1 is considerably easier than before 
> because the requirement for XML canonicalisation has been removed.
>
> The most natural way to associate HTML or XML resources with an RDF 
> graph is perhaps not what you propose, but something more like this:
>
>  <http://example.com/mydocument.xml> dc:format "text/xml".
>
> This has been possible since RDF 2004.
>
> Please respond to this message and let us know whether this addresses 
> your concerns.
Richard,

Thank you for your detailed and helpful reply.  I can confirm that it 
sets my mind at rest as regards the issues I raised.  In particular, the 
fact that supporting parsing of the datatypes is optional removes my 
main concern.

Best wishes,

Richard

>
> All the best,
> Richard
>
>
>
> On 4 Dec 2013, at 15:39, Richard Light <richard@light.demon.co.uk> wrote:
>
>> Hi,
>>
>> Following an interesting exchange about the fate of the RDF API [1] 
>> over on public-lod, I have just had a look through the RDF 1.1 
>> Concepts CR document [2] to bring myself up to date on the core RDF 
>> standard.
>>
>> There it is noted that the rdf:HTML and rdf:XMLLiteral datatypes may 
>> be made non-normative.
>>
>> Although (being an XML-head) I was pleased when I discovered some 
>> time ago that you can validly dump chunks of XML into an RDF 
>> resource, on reading the spec afresh I do wonder what business chunks 
>> of XML and HTML have, to be floating around in the innards of an RDF 
>> graph.  Wouldn't the RDF model be significantly easier to implement 
>> if they were removed?  To support them, you presumably have to bring 
>> XML and HTML parsing capabilities into your core RDF engine, together 
>> with suitable DOMs to hold the result of parsing.  I'm assuming that 
>> no-one is suggesting that the semantic payload of these embedded 
>> resources is in any way relevant to the RDF graph.
>>
>> I can see that it can be argued that these are just "special string 
>> types", but surely there is an order of magnitude of difference 
>> between interpreting a date datatype from its lexical form to its 
>> value, and parsing an XML document fragment?
>>
>> Surely the natural way to associate HTML and XML resources with an 
>> RDF graph is to point to them with a URI?  What you could do, is to 
>> invent an RDF mechanism for "linked document type" which is analogous 
>> to datatypes for literal values. Then you could express a node as e.g.:
>>
>> <http://example.com/mydocument.xml>^^<http://www.w3.org/TR/REC-xml/>
>> or:
>> <http://example.com/mydocument.xml>^^"text/xml"
>>
>> and then have a loose coupling between the RDF engine and the 
>> processing (or not) of the linked resource.  By including a linked 
>> document type, you also enable the use of content negotiation, so 
>> that variant forms can be retrieved from the same URL.
>>
>> This generalized mechanism would mean that other content types (e.g. 
>> JSON) could be supported without a further extension to the RDF 
>> Concepts recommendation being required.
>>
>> Richard
>>
>> [1] http://www.w3.org/TR/rdf-api/
>> [2] http://www.w3.org/TR/rdf11-concepts/
>>
>> -- 
>> *Richard Light*

-- 
*Richard Light*
Received on Thursday, 12 December 2013 14:00:11 UTC
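
The point made in the thread, that an implementation may leave rdf:XMLLiteral and rdf:HTML unrecognised and treat such literals as strings that carry a datatype marker, can be sketched roughly as follows. This is only an illustration and not code from any RDF library: the Literal class and its field names are hypothetical. It assumes that literals are compared by lexical form and datatype IRI, so no XML or HTML parser is involved.

```python
from dataclasses import dataclass

# Datatype IRIs discussed in the thread.
RDF_XMLLITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"
RDF_HTML = "http://www.w3.org/1999/02/22-rdf-syntax-ns#HTML"


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal term: a lexical form plus a datatype IRI."""
    lexical_form: str
    datatype: str


# An implementation that does not recognise the datatype never parses the
# markup. Equality is plain string comparison on both fields.
a = Literal("<b/>", RDF_XMLLITERAL)
b = Literal("<b></b>", RDF_XMLLITERAL)  # same XML content, different string
c = Literal("<b/>", RDF_XMLLITERAL)

print(a == c)  # identical lexical form and datatype
print(a == b)  # differs as a string, so treated as a different literal
```

An implementation that does recognise the datatype would instead compare the parsed values, and that is the step that needs an XML or HTML parser.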