Re: An analysis of whether we should include rdf:XMLLiteral into OWL 2 (ACTION-244)

Boris,

Regarding the technical problem you raise: yes, canonicalization may be
an implementation problem. On the other hand, a number of RDF
environments already handle this (e.g., Jena does), and there are,
afaik, public domain XML canonicalization tools around, i.e.,
implementations are not forced to start from scratch. In other words, it
is not as bad as it sounds.
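
As a rough illustration of how little code the check needs once a
canonicalization library is available, here is a minimal Python sketch.
It assumes Python 3.8 or later and uses the standard library's
xml.etree.ElementTree.canonicalize, which implements C14N 2.0 rather
than the Exclusive Canonical XML that the RDF definition refers to. The
two agree on simple cases such as expanding an empty-element tag, but a
real implementation would need an exclusive C14N tool. The helper name
is_canonical_fragment is hypothetical, and it only handles fragments
with a single root element.

```python
from xml.etree.ElementTree import canonicalize


def is_canonical_fragment(fragment: str) -> bool:
    """Return True if `fragment` is already in canonical form.

    Assumption: C14N 2.0 (what the standard library provides) is used
    as a stand-in for Exclusive Canonical XML; this is only an
    approximation of the rdf:XMLLiteral lexical-space condition.
    """
    # Canonicalize with comments preserved, mirroring the "with comments"
    # wording in the rdf:XMLLiteral definition, then compare to the input.
    return canonicalize(fragment, with_comments=True) == fragment


if __name__ == "__main__":
    # The empty-element form is rewritten into a start/end tag pair.
    print(canonicalize("<a/>", with_comments=True))
    print(is_canonical_fragment("<a/>"))
    print(is_canonical_fragment("<a></a>"))
```

An implementation could use such a check either to reject non-canonical
literals or to normalize them on input; which of the two is appropriate
is a separate design decision.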

On the other hand, by not including that datatype, we would create a
backward incompatibility with OWL 1, which explicitly lists
rdf:XMLLiteral as built in [1,2]. I think we should definitely avoid
doing that. Besides, XML Literals are actually used in RDF applications
and OWL 2 should acknowledge that. I understand that implementations may
add their own extensions including this datatype as well, but that sends
a very different message and sets different expectations. For me, these
issues outweigh the extra difficulties that implementations may have
(which, I believe, are probably much smaller than those posed by other
datatypes that we have already accepted).

I have no problem with keeping the value space disjoint from the others
or with the lack of facets; I think that is certainly a safe way to move
forward. I would propose that, with those extra restrictions, we accept
rdf:XMLLiteral as one of the built-in datatypes of OWL 2.

Ivan

[1] http://www.w3.org/TR/2004/REC-owl-semantics-20040210/syntax.html#owl_built_in_datatypes
[2] http://www.w3.org/TR/2004/REC-owl-semantics-20040210/rdfs.html#5.2

Boris Motik wrote:
> Hello,
> 
> At the last teleconf I was tasked to investigate whether we should include the rdf:XMLLiteral datatype into OWL 2. Here are the
> results of my findings.
> 
> There are no principal technical problems with including rdf:XMLLiteral into OWL 2. If we choose to do so, we should make the value
> space of rdf:XMLLiteral disjoint with the value spaces of all other datatypes (and of various string variants as well). Furthermore,
> we should not provide any facets on the datatype. Under such a definition, the datatype always has an infinite value space, so it
> does not cause problems for reasoning.
> 
> 
> I am not convinced, however, that this datatype is all that useful. In fact, the datatype's definition seems to contain a feature
> that may pose a significant hurdle to the practical usage of the datatype. The definition of the lexical space from
> 
> http://www.w3.org/TR/rdf-concepts/#dfn-rdf-XMLLiteral
> 
> says the following:
> 
>   The lexical space
>     is the set of all strings:
>        which are well-balanced, self-contained XML content [XML];
>        for which encoding as UTF-8 [RFC 2279] yields exclusive Canonical XML (with comments, with
>            empty InclusiveNamespaces PrefixList ) [XML-XC14N]; 
>        for which embedding between an arbitrary XML start tag and an end tag yields a document
>            conforming to XML Namespaces [XML-NS]
> 
> It defines the value space of the datatype as being in a one-to-one relationship with the lexical space.
> 
> Now I believe that the second condition actually poses significant hurdles to practical usage of the datatype, as it requires XML
> lexical values to be canonicalized. This means that, for example, the following literal is syntactically incorrect:
> 
> "<a/>"^^rdf:XMLLiteral
> 
> The canonical form of XML embedded in this literal is <a></a>, so this is what you are supposed to write if you want to produce
> syntactically valid lexical values of rdf:XMLLiteral.
> 
> The canonicalization process is quite complex, and most quite "reasonable" XML documents are not in canonical form. This means that
> you cannot use rdf:XMLLiteral to represent most reasonable XML fragments.
> 
> 
> Given this situation, I'm really wondering whether we really need this datatype in OWL 2. It would introduce an implementation hurdle
> (implementations would need to check whether all literals are correctly typed, and to do this they must implement the complex
> canonicalization process) without an obvious benefit. Furthermore, I wonder if there is an OWL 1 implementation that correctly
> implements this datatype (I would strongly suspect that there is none). Finally, since the datatype map of OWL 2 is open to
> extensions, implementations are free to implement this datatype if they really need it.
> 
> The latter is just my opinion; undoubtedly you'll let me know what yours is :-)
> 
> Regards,
> 
>  Boris
> 
> 

-- 

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf

Received on Saturday, 15 November 2008 09:09:54 UTC