Re: Subjects as Literals from Pat Hayes on 2010-07-02 (semantic-web@w3.org from July 2010)

From: Pat Hayes <phayes@ihmc.us>
Date: Thu, 1 Jul 2010 21:51:07 -0500
To: Peter Ansell <ansell.peter@gmail.com>
Cc: Semantic Web <semantic-web@w3.org>
Message-Id: <77807AB4-21B8-4E71-A2ED-71216C98FD10@ihmc.us>
On Jul 1, 2010, at 5:03 PM, Peter Ansell wrote:

> On 2 July 2010 04:56, Pat Hayes <phayes@ihmc.us> wrote:
>>
>> On Jul 1, 2010, at 5:21 AM, Peter Ansell wrote:
>>
>>> On 1 July 2010 13:14, Pat Hayes <phayes@ihmc.us> wrote:
>>>>
>>>> On Jun 30, 2010, at 8:14 PM, Ross Singer wrote:
>>>>
>>>>> I suppose my questions here would be:
>>>>>
>>>>> 1) What's the use case of a literal as subject statement (besides
>>>>> being an academic exercise)?
>>>>
>>>> A few off the top of my head.
>>>>
>>>> 1. Titles of books, music and other works might have properties such
>>>> as the date they were registered, who owns them, etc.
>>>> 2. Dates may have significant properties such as being the day that
>>>> someone was shot or when war broke out.
>>>> 3. Dates represented as character strings in some known date format
>>>> other than XSD can be asserted to be the same as a 'real' date by
>>>> writing things like
>>>>
>>>> "01-02-1481" sameDateAs "01022010"^^xsd:date .
>>>> "01-02-1481" isDateIn :MuslimCalendar .
>>>>
>>>> I am sure that you can think of many more. In general, allowing
>>>> strings as subjects opens the door to a wide range of uses of RDF to
>>>> 'attach' information to pieces of text. Another example which occurs
>>>> to me: this piece of text is the French translation of that piece of
>>>> text, expressed as a single RDF triple with two literals.
>>>
>>> If you are working with datasets where you just need to know explicit
>>> facts, and not who said anything about the facts, this may be useful.
>>
>> Well, that is a good sketch of the way that RDF is intended to be used.
>> It doesn't have any very advanced machinery for keeping track of who
>> said what about what facts. It just records the facts. (I know it has
>> reification....)
>
> It does depend on how it interprets facts though...

For that, read the RDF semantics document.

>
>>> You will run into issues if you accidentally use the string
>>> "01-02-1481" or you import another set of triples that gave that
>>> string a different meaning, like the barcode of a computer for example
>>> and the implication that the date was the barcode of a computer.
>>
>> Why would you? A given character string may indeed have several
>> properties and interpretations. The word "chat" has one meaning in
>> English and a different one in French, but it's the same string of four
>> letters in both cases.
>
> What is your interpretation of the third triple in this sequence? Does
> someoneElse have an opinion about cats (french semantic meaning) or
> chatting (english semantic meaning), or should the model imply that
> cats and chatting are equivalent? To me it seems like it is not
> logical to choose one interpretation randomly over the other. There is
> no difference between this issue and letting Literals become Subjects,
> as the key motivator for my argument is that an instance of a Literal
> in one triple should not affect other triples where it appears
> (currently just as the Object)
>
> <me> <likesTo> "chat" (Literal1) .
> <frenchFriend> <likes> "chat" (Literal2) .
> <someoneElse> <hasOpinionAbout> "chat" (Literal3) .

Oh dear. OK, we have to back up a little here. This RDF, as written,
seems broken. A literal in RDF (let's stick to simple, untyped
literals for now, as the points all generalize to typed literals) is
a string that refers to itself, i.e. it *means* the actual string. So
the RDF literal "chat" means the character string of four letters
see-aitch-ay-tee. That meaning is fixed by the RDF specification
and cannot be altered. So your first triple says that a relation
called likesTo holds between me and this string "chat". It does NOT
say something analogous to the English sentence "I like to chat",
which does not refer to that string at all, but actually *uses* it as
an English word. Similarly, the French sentence "Mon ami aime le
chat" does not refer to the string "chat" either, but uses it as a
French word. So again, your second triple doesn't mean anything at all
like that French sentence, but instead says that a relation called
"likes" holds between frenchFriend and the string "chat". The same
goes for the third triple.
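
A minimal sketch in Python of the point above, under a toy model rather than any real RDF library: the `IRI` and `PlainLiteral` classes, the `denote` function and both "readings" are hypothetical names introduced only for illustration. An interpretation is free to assign IRIs to whatever it likes, but a plain literal always refers to its own character string:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    """Hypothetical stand-in for an IRI reference such as <me>."""
    value: str


@dataclass(frozen=True)
class PlainLiteral:
    """Hypothetical stand-in for a simple, untyped literal such as "chat"."""
    lexical: str


def denote(term, iri_interpretation):
    """Return what a term refers to under a given interpretation.

    IRIs refer to whatever the interpretation assigns to them.
    A plain literal refers to its own character string, and the
    interpretation has no say in the matter.
    """
    if isinstance(term, PlainLiteral):
        return term.lexical
    return iri_interpretation[term]


# Two made-up interpretations that disagree about the IRIs.
english_reading = {IRI("me"): "an English speaker", IRI("likesTo"): "enjoys doing"}
french_reading = {IRI("me"): "un francophone", IRI("likesTo"): "aime faire"}

chat = PlainLiteral("chat")
print(denote(chat, english_reading))
print(denote(chat, french_reading))
```

Both calls return the same four-letter string, because neither reading is consulted when the term is a literal; only the IRIs can shift meaning from one interpretation to the next.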

>
> You are seeming to say that instance of "chat" (Literal3) in the third
> triple should inherit information about frenchFriend likes "chat"
> (noun) (Literal2) but also that me likesTo "chat" (verb) (Literal1).

Yes, because (although you apparently did not intend this) the RDF
specs themselves require, as a normative constraint, that these
literals all refer to the same thing, viz. the string "chat".
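
To make that concrete, here is a rough Python sketch that treats an RDF graph as a plain set of triples. The `iri`, `lit` and `nodes` helpers are hypothetical and only tag terms as tuples; no RDF library is assumed. Since the three triples contain no blank nodes, merging the three one-triple graphs is just set union:

```python
def iri(name):
    """Tag a name as an IRI reference (toy representation)."""
    return ("iri", name)


def lit(text):
    """Tag a string as a plain literal (toy representation)."""
    return ("literal", text)


# Each triple held as its own one-triple graph.
g1 = {(iri("me"), iri("likesTo"), lit("chat"))}
g2 = {(iri("frenchFriend"), iri("likes"), lit("chat"))}
g3 = {(iri("someoneElse"), iri("hasOpinionAbout"), lit("chat"))}

# No blank nodes are involved, so the merge is the set union.
merged = g1 | g2 | g3


def nodes(graph):
    """The nodes of a graph are its subjects and objects."""
    return {s for (s, p, o) in graph} | {o for (s, p, o) in graph}


literal_nodes = {n for n in nodes(merged) if n[0] == "literal"}
subjects_linked_to_chat = sorted(s[1] for (s, p, o) in merged if o == lit("chat"))

print(len(literal_nodes))
print(subjects_linked_to_chat)
```

The merged graph has three triples but only one literal node, and all three subjects point at it. Equal literals are the same node by construction, so no separate "merging" step ever happens.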

> If Literals should be merged between triples an RDF processor will
> have to accept both interpretations of triple 3 concurrently, while I
> would just accept all three triples separately as three separated
> graphs without logical confusion because the concept of a shared
> Literal wasn't in my interpretation of RDF.

Then your interpretation of RDF is broken, I am afraid.

>
> Even if you go back to the bnode model for the first two triples, as
> shown below, all of the triples are merged in some way, and you can't
> semantically distinguish bnode1 from bnode2 to decide which version of
> "chat" is the correct interpretation for the opinion statement. As in
> the case above, if you aren't sharing literals between triples, then
> Literal3 is semantically separated from both of the other literals,
> and there is no confusion about what each of the statements mean.
>
> <me> <likesTo> _:bnode1 .
> _:bnode1 sameas "chat" (Literal1).
> <frenchFriend> <likes> _:bnode2 .
> _:bnode2 sameas "chat" (Literal2) .
> <someoneElse> <hasOpinionAbout> "chat" (Literal3) .
>
> The fact that RDF has language annotations should have no effect on
> this argument, as it is incidental, and the argument could map to any
> string literals that don't have types and one could find examples of
> semantic inconsistencies that an algorithm could never fix
> consistently.
>
>>> If you are going to work at the web level then you will get a new set
>>> of issues surrounding what literals should actually be merged.
>>
>> Merged? You can merge two literals only if they are the exact same
>> literal, in which case they are already 'merged' in the RDF graph model.
>
> Sorry for any confusion. I was under the impression that literals were
> not merged, so the triples using the same literal were not actually
> related in a single conceptual graph. I don't agree that it is the
> right way, re the issues that it brings, but if that is the way the
> specification says it works then we have to stick with it. If RDF is
> really just supposed to be a general data description model like JSON
> then we wouldn't have to worry about the semantic conflict between
> shared Literals anyway.
>
> That changes the entire conversation if RDF was *always* designed that
> Literals should be merged based on co-occurrence. If it is already that
> way then some people have been misinterpreting it. It wouldn't
> actually require a material change to the specification for Literals
> to become Subjects as well as Objects if they are already used for
> chaining Triples together.
>
> Why not do it today so the false interpretation (non-Literal-merging)
> of the RDF specification doesn't keep filtering through and people can
> start fixing their legacy RDF software? Then any actual RDF users
> could start telling customers that they need to be very careful about
> what words they type into their software as the database may develop
> errors if they use the same word twice in different contexts. It may
> be very upsetting for users to later figure out that they suddenly had
> opinions about cats after a french friend joined the community because
> they liked chatting in the past.

The mistake here is to presume that simple character strings in RDF
are being used as though they were words. But this is such a basic
error that I doubt anyone who makes it is going to be able to use
RDF successfully in any case.
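
One common way to avoid the confusion, sketched here in Python under stated assumptions: the IRIs `Chatting`, `Cat` and `label` are invented for this example, as are the `iri`, `lit` and `things_labelled` helpers, and no RDF library is assumed. The things being talked about get their own IRIs, and the character string appears only as a label attached to each of them:

```python
def iri(name):
    """Tag a name as an IRI reference (toy representation)."""
    return ("iri", name)


def lit(text):
    """Tag a string as a plain literal (toy representation)."""
    return ("literal", text)


graph = {
    # What is liked is a thing named by an IRI, not a character string.
    (iri("me"), iri("likesTo"), iri("Chatting")),
    (iri("frenchFriend"), iri("likes"), iri("Cat")),
    # The same string can label both things without identifying them.
    (iri("Chatting"), iri("label"), lit("chat")),
    (iri("Cat"), iri("label"), lit("chat")),
}


def things_labelled(graph, text):
    """Return every subject carrying the given string as its label."""
    return {s for (s, p, o) in graph if p == iri("label") and o == lit(text)}


print(sorted(name for (_, name) in things_labelled(graph, "chat")))
```

Here the literal "chat" is still a single shared node, but nothing follows about cats from a fondness for chatting, because the two triples about liking have different IRIs as their objects.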

Pat

>
> Cheers,
>
> Peter
>
>

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Friday, 2 July 2010 02:52:10 UTC