[RIF-RDF] (potential) issues regarding correspondence of identifiers from Jos de Bruijn on 2007-09-03 (public-rif-wg@w3.org from September 2007)

From: Jos de Bruijn <debruijn@inf.unibz.it>
Date: Mon, 03 Sep 2007 12:30:36 +0200
To: RIF <public-rif-wg@w3.org>
Message-ID: <46DBE24C.1090209@inf.unibz.it>

Dear all,

I have come across a number of (potential) issues in the correspondence
between identifiers in RDF and identifiers in RIF. The issues are all
discussed in the blue 'discussion' blocks in the 'Syntax' section of the
RDF compatibility document [1].

The issues are the following:

a) RDF URI references vs absolute IRIs
There is a slight difference between the set of RDF URI references and
the set of absolute IRIs, which is the symbol space of rif:iri. In fact,
the set of absolute IRIs is a superset of the set of RDF URI references
(see RFC3987). However, conversion might be necessary for retrieval
purposes; if this is the case, we might need to add a note here.
I do not know whether, or in which cases, such conversions would be
necessary. Any suggestions?
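
As a rough illustration of the kind of conversion that retrieval might
require, the sketch below percent-encodes (as UTF-8) characters that
cannot appear literally in a URI, such as spaces or non-ASCII
characters. The function names and the exact set of characters left
unescaped are assumptions made for illustration; they are not taken
from the RDF or RIF documents.

```python
from urllib.parse import quote

# Assumption: these ASCII characters are left untouched. "%" is included so
# that percent-escapes already present are not escaped a second time.
_SAFE = ":/?#[]@!$&'()*+,;=%"


def rdf_uriref_to_uri(ref: str) -> str:
    """Percent-encode (as UTF-8) every character outside the safe set."""
    return quote(ref, safe=_SAFE, encoding="utf-8")


def needs_conversion(ref: str) -> bool:
    """True if the reference changes under the escaping above."""
    return rdf_uriref_to_uri(ref) != ref
```

Under these assumptions, a reference such as "http://example.org/a b"
would need conversion, whereas "http://example.org/a#b" would not.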

b) RDF plain literals versus XML Schema strings
An open question (for me) is what the exact differences are between the
value spaces of the RDF plain literals without language tags and
xsd:string. The value space of RDF plain literals without language tags
consists of all Unicode strings. In both the current specification of
XML Schema Datatypes and the current working draft of XML Schema 1.1
Datatypes, the value space of the string datatype is restricted to
sequences of Unicode characters excluding the surrogate blocks, FFFE,
and FFFF. There are some further differences between the specifications
of the string datatype in XML Schema 1.0 and XML Schema 1.1: in the
former, the datatype is based on the Char production of XML 1.0; in the
latter, it is based on the Char production of XML 1.1.
An important question is what to do with plain literals that contain
characters not allowed in the lexical space of xsd:string.
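
To make the difference concrete, the following sketch lists the
characters of a string that fall outside the Char production of XML 1.0
or of XML 1.1. The code point ranges are those of the two Char
productions; the helper names are hypothetical, and the sketch does not
claim to capture every constraint that xsd:string places on its values.

```python
# Code point ranges of the Char production in XML 1.0 and XML 1.1.
XML10_CHAR = [(0x9, 0xA), (0xD, 0xD), (0x20, 0xD7FF),
              (0xE000, 0xFFFD), (0x10000, 0x10FFFF)]
XML11_CHAR = [(0x1, 0xD7FF), (0xE000, 0xFFFD), (0x10000, 0x10FFFF)]


def _in_ranges(code_point, ranges):
    """True if the code point lies in one of the inclusive ranges."""
    return any(low <= code_point <= high for low, high in ranges)


def offending_chars(text, ranges):
    """Return the characters of text that the given production excludes."""
    return [ch for ch in text if not _in_ranges(ord(ch), ranges)]
```

For instance, the control character U+0001 is excluded by the XML 1.0
production but allowed by the XML 1.1 production, while U+FFFE and the
surrogates are excluded by both.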

c) typed literals in RDF versus typed constants in RIF
There is a difference in the treatment of ill-typed literals, i.e.,
typed literals (s, u) where the lexical form s is not part of the
lexical space of the datatype identified by u. In RIF, such typed
literals are syntax errors whenever u corresponds to a datatype or
symbol space supported or defined by RIF, whereas in RDF they are not
syntax errors and are interpreted as abstract objects which are not
literals. If u does not identify a datatype or symbol space supported
or defined by RIF, then the symbol is not a syntax error and the
interpretations in RIF and RDF correspond.
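
The three cases above can be sketched as a small check. The table of
"supported" datatypes below is a deliberately tiny, hypothetical
stand-in (only xsd:integer and xsd:boolean), and the check ignores
whitespace normalization; it only illustrates the distinction between
well-typed, ill-typed, and unsupported datatypes.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumption: a minimal stand-in for the datatypes supported by RIF,
# each mapped to a pattern describing its lexical space.
LEXICAL_SPACES = {
    XSD + "integer": re.compile(r"[+-]?[0-9]+"),
    XSD + "boolean": re.compile(r"true|false|1|0"),
}


def is_ill_typed(s, u):
    """Classify the typed literal (s, u).

    Returns True if u is a known datatype and s is outside its lexical
    space, False if s is inside it, and None if u is not a known datatype
    (in which case no lexical check applies).
    """
    pattern = LEXICAL_SPACES.get(u)
    if pattern is None:
        return None
    return pattern.fullmatch(s) is None
```

With this table, ("abc", xsd:integer) is ill-typed, ("-42", xsd:integer)
is not, and a literal with an unknown datatype URI is not checked at all.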

Because of the difference in interpretation of ill-typed literals, the
proposal is to define a corresponding URI for every ill-typed literal.
A canonical definition of this corresponding URI enables the
reconstruction of the original ill-typed literal, e.g. for
round-tripping. Agreement needs to be reached on what this URI looks
like. The proposal is:
http://www.w3.org/2005/rif/rdf-ill-typed-literal/uri-encode("s"^^u),
where uri-encode is a function which appropriately encodes a typed
literal; this function is to be defined.
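
Since uri-encode is still to be defined, the following is only one
possible sketch of it, chosen so that the original literal can be
reconstructed. It percent-encodes the whole string "s"^^u, including
the quotes and the "^^" separator. The function names are hypothetical,
and the decoding step assumes that the datatype URI u never contains
the sequence "^^ itself.

```python
from urllib.parse import quote, unquote

BASE = "http://www.w3.org/2005/rif/rdf-ill-typed-literal/"


def encode_ill_typed(s, u):
    """Build the corresponding URI for the ill-typed literal (s, u)."""
    # safe="" percent-encodes every reserved character, including "/" and "#".
    return BASE + quote('"' + s + '"^^' + u, safe="")


def decode_ill_typed(uri):
    """Recover (s, u) from a URI produced by encode_ill_typed."""
    if not uri.startswith(BASE):
        raise ValueError("not an ill-typed-literal URI")
    text = unquote(uri[len(BASE):])
    # Split at the last '"^^' so that s itself may contain that sequence.
    head, sep, u = text.rpartition('"^^')
    if not sep or not head.startswith('"'):
        raise ValueError("malformed encoded literal")
    return head[1:], u
```

Encoding and then decoding a literal such as
("abc", "http://www.w3.org/2001/XMLSchema#integer") yields the original
pair again, which is the round-tripping property mentioned above.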

Best, Jos

[1]
http://www.w3.org/2005/rules/wg/wiki/Core/RIF-RDF_Compatibility#head-b69c9dca9dcd4e0e42bac0a21ce72c9c05579d09

--
Jos de Bruijn debruijn@inf.unibz.it
http://www.debruijn.net/
----------------------------------------------
As far as the laws of mathematics refer to
reality, they are not certain; and as far as
they are certain, they do not refer to
reality.
-- Albert Einstein

Received on Monday, 3 September 2007 10:30:50 UTC