Re: PRIMER: draft data model section from Bill de hOra on 2001-10-17 (w3c-rdfcore-wg@w3.org from October 2001)

From: Bill de hOra <bdehora@interx.com>
Date: Wed, 17 Oct 2001 03:29:22 +0100
To: <w3c-rdfcore-wg@w3.org>
Message-ID: <002b01c156b3$894cc8d0$01000001@MITCHUM>
Hi Pat,

lots of questions...


Pat Hayes:
>>
You say there:
"Clearly in RDF processing, literals are opaque to RDF (i.e. if a
literal
looks like a URI or more RDF, it is treated as a literal, not as a URI
or more RDF). However an application using RDF triples and able to
properly manipulate Dublin Core in your example would be entitled to
draw the inference that the literal names an entity."

I want to take issue with this, because to me it seems to contradict 
itself. As you say, to an RDF processor, literals are opaque. But you 
then say that an 'application using RDF triples' would be entitled to 
draw an inference which requires (?) NOT treating literals as opaque 
if they refer to the Dublin core. Is this application supposed to be 
an RDF application or not? If it is, then it is not entitled to make 
that inference, since there is nothing in the RDF that enables it to 
draw it. If on the other hand it is able to draw that conclusion, 
then it evidently is not using information (only) from the RDF 
triples, so why should an account of the RDF meanings of the triples 
be concerned with this information?
>>

When I say opaque, I mean a parser cannot treat the literal as anything
other than a literal. What the application/engine/person does with the
triple is its own business. I appreciate that a clean delineation
between parsers and engines isn't always the case. 

A strawman. With my A. Haxor hat on, I read the DC creator description
in order to write some code for a dodgy KM module which tries lifts a
literal from bytes/string/Unicode/whatever we decide literals are, into
denoting a person that has a URI denoting them. So the code's job is to
bind dc creator values to RDF  resources. Someone might say 'Bill, what
on earth is this code doing? You should get the assumptions out of the
code and write them down'. I could point to my code and say, 'But it
works, the worst that happens is that it says it couldn't find anyone,
and anyway, Dublin Core didn't write their assumptions down either, it's
not my fault we use it'. 

Maybe that's not an RDF application, and that's a kind of realpolitik
hacking we wish to discourage or someday make unnecessary, but I
conjecture there will lots of abductive code like this for some years to
come for the express purpose of scraping over existing data: it's the
stuff of business models. Old data never dies, it just gets integrated.


Pat:
>>
I think this is particularly tricky as an example since the part of 
the Dublin core that you quote is written in English, so presumably 
is not fully understandable by any software yet written by anyone.
>>

I think any examples working out of natural language are tricky. Are
there better examples we could use for the Primer?


Pat:
>> 
>Bill:
>[Indeed most of the examples I've seen of literal values have literals
>naming an entity in a common-sense way. Yes that's folkware semantics
>but there's nothing wrong with common-sense just because computers
don't
>have any.

There is if you are touting the formalism as something for computers 
to use. In fact it is dishonest and foolish, and would be valid 
grounds for having ones tenure case dismissed at any reputable 
university department.
>>

I'm tempted to quote Groucho Marx. I was referring to common-sense as an
aid to following the example in the primer, not attributing magic powers
to RDF. In the meantime someone's going to have to explain to me how we
get from common-sense examples to RDF triples in this primer without
some form of conjuring. There _is_ magic dust being sprinkled over
Frank's examples and the ones in the M&S. What exactly is wrong with
doing this as long as we point that out? 


Pat:
>> 
>Bill:
>In any case it's up to the modeller (in this case, the primer)
>to make the semantics explicit. Surely that's what RDF is for?]
>
>Anyway. There is _nothing_ in RDF to make us think that when a property
>has a literal value the property means the literal itself is the entity
>and not what the value might denote

True, but what it can denote is severely restricted. The model theory 
at present assumes that literals have a *fixed value*. So whatever a 
literal denotes, it has to denote it once and for all, globally. It 
can't denote one thing in the Dublin core and something else 
somewhere else.
>>

I'm not suggesting that a literal should denote different things. I'm
suggesting its reasonable for a parser to treat a literal as a blob and
an application not to. As far as I'm concerned a parser's job is to get
literals into working memory as literals; what's done in working memory
with those literals, is another matter. Maybe they get served up to a
person who realises the literal is actually someone they know, maybe
there's a bug in the application code that gets the literal lifted,
maybe someday someone will specify DC creator more formally that will
allow machinery to infer that a literal hanging off dc:creator can stand
for a person. Lifting literals via some kind of (I don't know,
abductive?) inference is something people are going to want to do with
RDF.


Pat:
>>
Now, that is a very simple and restrictive assumption, and Peter 
Patel-Schneider wants us to relax it to the extent of allowing 
datatype schemas for literals. The kind of examples he uses are where 
"070801" might mean either a date or a string or an integer, and one 
can determine from things like the rdfs:range information which 
datatype scheme is supposed to be applied to each literal occurrence. 
That is the scheme we have now worked out in full detail so that it 
COULD be incorporated into the MT for RDFS, if we want to do so, 
though right now its just kind of waiting to be put there. But even 
with this extension, the range of things that a literal can refer to 
is somewhat restricted to be the kinds of things that one finds in 
datatyping schemes. I'm not sure if having a literal like "creator" 
denote a particular human being would count as a datatyping scheme, 
but it sure doesn't sound like one to me.
>>

I'm confused here. Are you saying that you can't soundly get from a
literal to a thing in the world? If not, was there ever any point in
basing DC creator on RDF?


Pat:
>>
>In Frank's example, without clear semantics for 'creator' there isn't
>enough information to make a decision on what's being implied.

Right, and indeed that semantic content cannot be stated in RDF. I 
don't think we should pretend that it can if in fact it cannot.
>>

Does that invalidate Frank's and the M&S' examples? If it doesn't then I
don't understand why and if it does how are we going to bootstrap this
primer for oikes like myself? 


Pat:
>>
I think we should be very careful not to give the impression that we 
are saying that RDF can represent natural language meanings. It would 
be irresponsible, or just plain silly, to claim that 
existential-conjunctive logic can represent every meaning expressible 
in natural language. (If you think it can, then try publishing that 
claim in a refereed journal.)
>>

I know enough not to claim that. But that seems to make using any
natural language example to pluck out a triple in the primer RDF snake
oil. 

regards,
Bill

.. 
InterX 
bdehora at interx.com 
+44(0)20-8817-4039 
www.interx.com
Received on Tuesday, 16 October 2001 22:32:01 UTC