Re: pfps-04 (why the thread is germane to pfps-04)

Hello Peter,

Many thanks for your very clear and detailed explanations.

At 07:54 03/07/25 -0400, Peter F. Patel-Schneider wrote:

>This question is related to pfps-04 because pfps-04 is concerned with
>equality between XML literals in RDF.
>
>
>The root of this problem is that a complete treatment of XML literals in
>RDF needs a complete theory of equality for them.  This theory of equality
>cannot just determine equality between XML literals in RDF but also has to
>determine equality between XML literals and other objects in the RDF domain
>of discourse, in particular plain RDF literals and the value space for the
>XML Schema string datatype.

[This is a very general concern]
As far as I understand, RDF does not really mention XML Schema datatypes
in any normative way, so how would it normatively specify equivalences
to these datatypes? Also, what about other datatype systems that have
very similar constructs? A lot of datatype systems will have some kind
of 'string' type, and a lot of such systems will have some kind of
numeric types (which you mention below). What about these equivalences?


>Some of these answers can (now) be fairly easily determined from a simple
>perusal of the RDF documents and the canonicalization documents.
>
>Two XML literals are (now) equal in RDF precisely when their Exclusive
>XML Canonicalizations are the same octet sequence.

Okay. The equivalences would stay exactly the same if XML literals
were represented as character sequences rather than as octet
sequences.
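
To illustrate why the choice of representation does not change the
equivalences, here is a minimal sketch. It assumes the canonical forms
are encoded as UTF-8 (which is what canonicalization emits), and it uses
hand-written sample strings as stand-ins for already-canonicalized
literals; the function names are made up for this example. Because UTF-8
maps distinct character sequences to distinct octet sequences, both
comparisons always agree.

```python
def same_as_characters(a: str, b: str) -> bool:
    # Compare the canonical forms as sequences of Unicode characters.
    return a == b


def same_as_octets(a: str, b: str) -> bool:
    # Compare the same canonical forms as UTF-8 octet sequences.
    return a.encode("utf-8") == b.encode("utf-8")


# Hypothetical stand-ins for canonicalized XML literals.
samples = [
    "<br></br>",
    "<b>caf\u00e9</b>",    # precomposed e-acute
    "<b>cafe\u0301</b>",   # e followed by a combining acute accent
    "<b>cafe</b>",
]

# True if the two notions of equality agree on every pair of samples.
agree = all(
    same_as_characters(x, y) == same_as_octets(x, y)
    for x in samples
    for y in samples
)
```

Note that the two spellings of "café" above are different under both
comparisons; neither representation performs Unicode normalization.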


>However other answers are harder to determine.
>
>1/ When is an XML literal equal to a plain RDF literal?  A plain RDF
>literal is a Unicode string (sequence of Unicode characters), so this
>question boils down to whether octets and Unicode characters are disjoint.
>I found it difficult to answer this question, because of hints in the
>exclusive canonicalization document that they are not.

Can you point to the places where you saw such hints? If there are
such hints, then they definitely have to be fixed, and I'll make
sure that this happens.

Apart from that, it is very important to make sure that the plain
string "<br/>" (in XML written as "&lt;br/&gt;") is not the
same as the XML markup "<br/>" (in XML written as "<br/>").
So it is indeed important to make sure this question can easily
be answered.
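
As a small illustration of the difference, the following sketch parses
both spellings with Python's standard xml.etree.ElementTree module. The
wrapper element name "v" is only an assumption made for the example.

```python
import xml.etree.ElementTree as ET

# Plain string "<br/>", written in XML with escaped angle brackets.
escaped = ET.fromstring("<v>&lt;br/&gt;</v>")

# XML markup: an actual empty br element.
markup = ET.fromstring("<v><br/></v>")

# The first wrapper holds five characters of text and no child elements.
escaped_text = escaped.text
escaped_children = len(escaped)

# The second wrapper holds no text and exactly one child element.
markup_text = markup.text
markup_children = len(markup)
markup_child_tag = markup[0].tag
```

After parsing, the first case is character data only and the second is
an element node, so a parser never confuses the two.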

However, I think it is absolutely inappropriate to solve this
problem by saying that one of them is characters and the other
is encoded in octets. If there is no solution here other than
some kind of hack, I think it would be preferable to say
e.g. that characters in plain literals are green, and characters
representing XML literals are red (and add a note to clarify
that green characters and red characters are not the same).
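
The "green versus red" idea amounts to tagging each literal with its
kind, so that equality never holds across kinds even when the characters
match. A minimal sketch follows; the class names PlainLiteral and
XMLLiteral are invented for this illustration and are not taken from any
RDF specification or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlainLiteral:
    # "Green" characters: a plain Unicode string.
    chars: str


@dataclass(frozen=True)
class XMLLiteral:
    # "Red" characters: the canonical form of a piece of XML markup.
    chars: str


plain = PlainLiteral("<br/>")
xml = XMLLiteral("<br/>")

# Dataclass equality only compares instances of the same class,
# so identical characters of different kinds are never equal.
same_kind_equal = PlainLiteral("<br/>") == plain
cross_kind_equal = plain == xml
```

Both kinds are sequences of characters here; the distinction is carried
by the tag, not by an artificial characters-versus-octets split.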


>2/ When is an XML literal equal to an XML Schema string?  This would appear
>to be the same as the previous question, as the value space for the XML
>Schema string datatype is Unicode strings, but there have been some
>comments from those involved in XML Schema that the values for XML Schema
>datatypes are more than just, for example, numbers.  In particular, there
>have been messages to the effect that in XML Schema decimal 1.5 is
>different from float 3E-1,

I guess this should read 15E-1?

>even though the first is defined as 15x10^(-1),
>where 15 is the integer 15 and -1 is the integer -1 (and 10 is the integer
>10), and the second is defined as 3x2^(-1), where 3 is the integer 3 and -1
>is the integer -1 (and 2 is the integer 2).
>
>I could probably dig up references for the above, but it would be
>considerable work.  If anyone is really interested just ask, and I'll get
>around to it soon.

I agree that this is an important question. I don't really need the
references; I can understand both the advantages and the problems of
such a position. I'm surprised to see that this still isn't clear;
I would have assumed that the RDF Core WG and the XML Schema WG
would have clarified this quite some time ago.
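
To make the two possible positions concrete, here is a small sketch
using exact rational arithmetic. It computes 15x10^(-1) and 3x2^(-1) as
given in the quoted text, then compares them once as bare numbers and
once as (datatype, number) pairs. The pair representation is only an
assumption for illustration, not something defined by XML Schema or RDF.

```python
from fractions import Fraction

# decimal 1.5, defined as 15 x 10^(-1)
decimal_value = Fraction(15) * Fraction(10) ** -1

# float 1.5, defined as 3 x 2^(-1)
float_value = Fraction(3) * Fraction(2) ** -1

# Position A: values are just numbers, so the two are equal.
equal_as_numbers = decimal_value == float_value

# Position B: a value also carries its datatype, so they differ.
typed_decimal = ("decimal", decimal_value)
typed_float = ("float", float_value)
equal_as_typed_values = typed_decimal == typed_float
```

Under position A the comparison succeeds, under position B it fails,
which is exactly the question that needs a normative answer.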


Regards,     Martin.

Received on Friday, 25 July 2003 16:21:42 UTC