Re: 2001-09-07#5 Literals from Tim Berners-Lee on 2001-10-23 (w3c-rdfcore-wg@w3.org from October 2001)

From: Tim Berners-Lee <timbl@w3.org>
Date: Tue, 23 Oct 2001 16:30:02 -0400
To: "Pat Hayes" <phayes@ai.uwf.edu>
Cc: <w3c-rdfcore-wg@w3.org>
Message-ID: <008e01c15c01$7e747a90$84001d12@w3.org>
----- Original Message -----
From: "Pat Hayes" <phayes@ai.uwf.edu>
To: "Tim Berners-Lee" <timbl@w3.org>
Cc: <w3c-rdfcore-wg@w3.org>
Sent: Monday, October 22, 2001 7:27 PM
Subject: Re: 2001-09-07#5 Literals


> >Short version: "SHOULD" in literal equality considered dangerous.
> >
> >Long version:
> >
> >From
>
>http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Sep/att-0341/01-lite
r
> >als.txt
> >in
> >http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Sep/0341.html
> >
> >  [4]
> >  When comparing two RDF Literals, their Unicode strings MUST be
> >  equal for the RDF Literals to compare as equal. If both Literals
> >  have language tags, these tags MUST be equal for the Literals to
> >  be considered equal. If two Literals are found with equal Unicode
> >  strings but only one has a language tag, the Literals SHOULD NOT
> >  be considered equal.
> >
> >
> >The idea of any "SHOULD" in the definition of equality for literals for
any
> >language
> >give me the creeps.  A weakness at the bottom of the foundation tends to
> >imply
> >bad things for the rest.  I agent A's assertion that  x=y  can be
translated
> >into
> >a belief by agent B that x may or not equal y depending on how agent A's
> >author
> >read the spec, then I find it difficult to see how we would build
anything
> >else
> >on top of it.  I would suggest to the group come clean one way or the
> >other - if language tags are in, make their comparison mandatory.
> >
> >For example, I could see this becoming a security hole in anything built
> >on top of RDF.  Silly example: I give my name (labeled as French) to pay
for
> >something - the merchant checks it against my card, ignoring the
> >language tag difference, but later I can't be made to pay the bill
> >as the debt was by someone whose name was not my name.
> >[Insert more realistic example here].
>
> I hope that couldn't possibly happen. The 'equality' being discussed
> above is SYNTACTIC equality, ie whether or not those two strings
> should be treated as the very same string. (It would be better to
> call it something else, "identity" or something.) So for example is
> "Paris" in French the same string as "Paris" in English? (Its not
> pronounced the same way, for example.) But whether you say it is or
> it isn't, it still *means* the same French city in French or in
> English.
(The example I use is "chat".  ((we actually had an email
adderss for feiendly team discussion chat@w3.org and had a problem
when Mlle Catherine Chat joined the Paris office and wanted chat@w3.org)))
> Similarly you, the guy with the card debt, might well have
> more than one name. So the only case that might be a problem would be
> where the same character sequence had a different meaning in two
> different languages and some dumb engine thought that the two strings
> were identical. But then it would clearly be the engine's fault for
> making unreasonable assumptions :-)


No, it would be the fault of the standard which said it SHOULD treat
the strings as syntacticly different but didn't have to!


Never mind higher forms of equivalence -- if the whole language is
built on something which has ambiguity in syntactic identity of literals,
then  the whole thing is weak.

Remember that a vast amount, of Semantic Web deployment
will be as glue between existing systems which work in other
languages.  The bank's systems and the calendar system and the
programming langauges and database systems they are written
have ASCII strings or Unicode
strings, but they don't have (lan,g,ustring) pairs as fundamental
data types.  So in most real cases, the name is going to matched
up with something which has no langauge tag.  So the database join
for example, will be using a language-tag-independent notion of
equality.   But then all the RDF-based systems will be built
using language-inclusive identity.  What of the things built on
top? Of DAML equiavalnce as used in cardinality etc?

Tim

> Pat
> --
> ---------------------------------------------------------------------
> IHMC (850)434 8903   home
> 40 South Alcaniz St. (850)202 4416   office
> Pensacola,  FL 32501 (850)202 4440   fax
> phayes@ai.uwf.edu
> http://www.coginst.uwf.edu/~phayes
>
Received on Tuesday, 23 October 2001 16:30:04 UTC