- From: Boris Motik <boris.motik@comlab.ox.ac.uk>
- Date: Fri, 14 Nov 2008 23:38:01 -0000
- To: "'W3C OWL Working Group'" <public-owl-wg@w3.org>
Hello,

At the last teleconf I was tasked to investigate whether we should include the rdf:XMLLiteral datatype in OWL 2. Here are my findings.

There are no fundamental technical problems with including rdf:XMLLiteral in OWL 2. If we choose to do so, we should make the value space of rdf:XMLLiteral disjoint from the value spaces of all other datatypes (including the various string variants). Furthermore, we should not provide any facets on the datatype. Under such a definition, the datatype always has an infinite value space, so it does not cause problems for reasoning.

I am not convinced, however, that this datatype is all that useful. In fact, its definition seems to contain a feature that may pose a significant hurdle to practical usage. The definition of the lexical space from http://www.w3.org/TR/rdf-concepts/#dfn-rdf-XMLLiteral says the following:

    The lexical space is the set of all strings:
    - which are well-balanced, self-contained XML content [XML];
    - for which encoding as UTF-8 [RFC 2279] yields exclusive Canonical XML (with comments, with empty InclusiveNamespaces PrefixList) [XML-XC14N];
    - for which embedding between an arbitrary XML start tag and an end tag yields a document conforming to XML Namespaces [XML-NS]

The specification defines the value space of the datatype as being in a one-to-one relationship with the lexical space.

I believe that the second condition poses a significant hurdle to practical usage of the datatype, as it requires XML lexical values to be canonicalized. This means that, for example, the following literal is syntactically incorrect:

    "<a/>"^^rdf:XMLLiteral

The canonical form of the XML embedded in this literal is <a></a>, so this is what you are supposed to write if you want to produce syntactically valid lexical values of rdf:XMLLiteral. The canonicalization process is quite complex, and most otherwise "reasonable" XML documents are not in canonical form.
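To make the <a/> versus <a></a> point concrete, here is a minimal Python sketch. It rests on two assumptions that are not part of the RDF definition: it uses the C14N 2.0 canonicalizer from the Python standard library (3.8+) as a stand-in for Exclusive Canonical XML, and it only handles fragments with a single root element. The two algorithms agree on this simple case, but this is not a conformant lexical-space check, and the helper names are hypothetical.

```python
from xml.etree import ElementTree as ET  # ET.canonicalize() needs Python 3.8+


def canonical_form(fragment: str) -> str:
    """Return the canonical form of a single-rooted XML fragment.

    Assumption: C14N 2.0 is used here as a stand-in for Exclusive
    Canonical XML (with comments); the two differ in general.
    """
    return ET.canonicalize(xml_data=fragment, with_comments=True)


def is_in_canonical_form(fragment: str) -> bool:
    """Approximate the second lexical-space condition: the string must
    already equal its own canonical form."""
    try:
        return canonical_form(fragment) == fragment
    except ET.ParseError:
        # Not well-formed XML, so it cannot be a valid lexical form.
        return False


print(canonical_form("<a/>"))           # <a></a>
print(is_in_canonical_form("<a/>"))     # False
print(is_in_canonical_form("<a></a>"))  # True
```

Even this rough approximation needs a full canonicalizer, which is the kind of work an implementation would have to do for every rdf:XMLLiteral literal it reads.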
Because of the canonicalization requirement, you cannot use rdf:XMLLiteral to represent most reasonable XML fragments as they are usually written.

Given this situation, I wonder whether we really need this datatype in OWL 2. It would introduce an implementation hurdle (implementations would need to check whether all literals are correctly typed, and to do this they must implement the complex canonicalization process) without an obvious benefit. Furthermore, I wonder whether there is an OWL 1 implementation that correctly implements this datatype (I strongly suspect that there is none). Finally, since the datatype map of OWL 2 is open to extensions, implementations are free to implement this datatype if they really need it.

The latter is just my opinion; undoubtedly you'll let me know what yours is :-)

Regards,

Boris
Received on Friday, 14 November 2008 23:38:44 UTC