Re: summary of reification semantics issues (material for discussion). from Tim Berners-Lee on 2003-03-24 (w3c-rdfcore-wg@w3.org from March 2003)

From: Tim Berners-Lee <timbl@w3.org>
Date: Sun, 23 Mar 2003 19:45:36 -0500
To: Brian McBride <bwm@hplb.hpl.hp.com>
Cc: pat hayes <phayes@ai.uwf.edu>, Graham Klyne <GK@NineByNine.org>, Frank Manola <fmanola@mitre.org>, RDF Core <w3c-rdfcore-wg@w3.org>
Message-Id: <ED68D596-5D91-11D7-BCE4-000393914268@w3.org>
Summary: Still no evidence of de-re semantics in use, some evidence of 
de-dicto;
maybe we should re-charter to remove reification from the RDF M&S.

On Wednesday, Mar 19, 2003, at 15:02 US/Eastern, Brian McBride wrote:

> At 11:29 19/03/2003 -0500, Tim Berners-Lee wrote:
>
> [...]
>
>
>>> Consider the following 2 graphs:
>>>
>>> Graph G1:
>>>
>>> _:s rdf:type rdf:Statement .
>>> _:s rdf:subject subj .
>>> _:s rdf:predicate pred .
>>> _:s rdf:object object .
>>> _:s foo:saidBy fred .
>>> _:s foo:saidIn doc1 .
>>>
>>> Graph G2:
>>>
>>> _:s rdf:type rdf:Statement .
>>> _:s rdf:subject subj .
>>> _:s rdf:predicate pred .
>>> _:s rdf:object object .
>>> _:s foo:saidBy john .
>>> _:s foo:saidIn doc2 .
>>>
>>> Merge the two graphs and then determine who said what, where.  If 
>>> the _:s nodes in each graph denote a statement (as opposed to a 
>>> stating), it is identified by its subject predicate and object 
>>> properties which would allow the two _:s nodes in each graph to be 
>>> merged.
>>
>> Yes, giving
>> _:s rdf:type rdf:Statement .
>> _:s rdf:subject subj .
>> _:s rdf:predicate pred .
>> _:s rdf:object object .
>> _:s foo:saidBy john .
>> _:s foo:saidIn doc1 .
>> _:s foo:saidIn doc2 .
>
> and also
>
> _:s foo:saidBy fred .
>
>
>
>>> The WG concluded that if reified statements denoted triples, rather 
>>> than occurrences of triples, the scenario above would lead to many 
>>> modelling errors and further confusion.
>>
>> I don't follow how they concluded that,
>
> because it is no longer possible to determine who said what where; the 
> association between (fred and doc1) and (john and doc2) has been lost.


Ah.   I see.  I had missed a triple. I understand the semantics of this 
now.
Yes, the group's system is consistent. No, it isn't one I would have chosen.
I do think a stating is a more complicated concept to use as a basis,
and it can be constructed using statements, whereas to do the reverse
as Pat suggested is messy.  But I understand now the group's intended
semantics.  Thank you.
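
To make the merge concrete, here is a minimal Python sketch. It is only an illustration, not anything taken from the WG documents. It assumes, purely for the example, that a reified node is identified by its (subject, predicate, object) triple, so the two _:s nodes collapse into one. The name merge_by_triple and the plain-string identifiers are hypothetical.

```python
from collections import defaultdict

# Each graph: the reified triple plus the properties hung on its _:s node.
G1 = {"triple": ("subj", "pred", "object"),
      "props": [("foo:saidBy", "fred"), ("foo:saidIn", "doc1")]}
G2 = {"triple": ("subj", "pred", "object"),
      "props": [("foo:saidBy", "john"), ("foo:saidIn", "doc2")]}

def merge_by_triple(*graphs):
    """Merge reified nodes, ASSUMING a node is identified by its triple."""
    merged = defaultdict(set)
    for g in graphs:
        merged[g["triple"]].update(g["props"])
    return dict(merged)

merged = merge_by_triple(G1, G2)
props = merged[("subj", "pred", "object")]
print(sorted(props))
```

The single merged node carries all four properties, and nothing in it records that fred goes with doc1 rather than doc2.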

But the WG's argument that statements lead to modelling errors
is nonsense, as if one really needs to create an instance of a 
statement
one can write

_:s rdf:type rdf:StatING .
_:s rdf:subject subj .
_:s rdf:predicate pred .
_:s rdf:object object .
_:s foo:saidBy john .
_:s foo:saidIn doc1 .

or rather, using statements too

_:t rdf:type rdf:StatING .
_:t foo:ofStatement _:s .
_:s rdf:subject subj .
_:s rdf:predicate pred .
_:s rdf:object object .
_:t foo:saidBy john .
_:t foo:saidIn doc1 .

Proof steps have similar things to _:t, defined by a statement given by 
the step, the rule followed, and the variable bindings.  There may be 
many things like statings which are
relationships to a triple, and I am not convinced that statings are 
fundamental or special.
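
A rough Python sketch of the "statings built on statements" pattern follows. It is an illustration only: the class and field names are hypothetical stand-ins for rdf:StatING, foo:ofStatement, foo:saidBy and foo:saidIn, and the fred/doc1 and john/doc2 pairings are taken from graphs G1 and G2 quoted earlier.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Statement:
    subj: str
    pred: str
    obj: str

@dataclass(frozen=True)
class Stating:
    of_statement: Statement  # plays the role of foo:ofStatement
    said_by: str             # foo:saidBy
    said_in: str             # foo:saidIn

# One shared statement, two separate statings of it.
s = Statement("subj", "pred", "object")
statings = {Stating(s, "fred", "doc1"), Stating(s, "john", "doc2")}

def who_said_where(statings, statement):
    """List (speaker, document) pairs for every stating of a statement."""
    return sorted((t.said_by, t.said_in)
                  for t in statings if t.of_statement == statement)

pairs = who_said_where(statings, s)
print(pairs)
```

The statement node is shared, yet each stating keeps its own speaker and document together, so the association survives a merge.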

((But in fact one doesn't need to use statings in practice:

_:s rdf:subject subj .
_:s rdf:predicate pred .
_:s rdf:object object .
_:s foo:saidIn doc1 .
_:t foo:creator john .

from which the above more complex things about statings can be deduced, 
and which allows the reuse of _:s many times, and is more compact.))

So I am not making the argument that the WG's system is nonsense.

But I do not agree with the WG when it "concluded that if reified 
statements denoted triples, rather than occurrences of triples, the 
scenario above would lead to many modelling errors and further 
confusion."



>>  as the example above suffers from no confusion that I can see. The 
>> triple is stated by two files.
>> (Maybe I have misunderstood the way the WG uses
>> "statement" and "stating".  I assumed a statement means the abstract 
>> triple, and a stating is
>> the fact that that triple occurs somewhere.)  Here "saidIn" expressed 
>> a stating,
>
> I have no idea what that last phrase means.

[I meant: The "stating", as I understand it, is the fact that a given 
triple occurred in a given context. The "_:s foo:saidIn doc1" arc above, 
if we use statements, expresses this relationship.  It doesn't provide 
a node to hang other things from.]

(((Has anyone defined identity for these statings?
A statement is simple, in that two statements are the same statement if 
they have the same parts (subj, pred and obj).
Two statINGs, though, are the same if they have the same subj, pred, 
obj and ??document???.   Is identity of statings defined?
What happens if they are in different representations of the same 
document?
What is a document? What happens if two URIs refer to the same 
document?)))
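
The identity question can be sketched in Python. This is a hedged illustration, not a proposed definition: it assumes a stating is keyed by (statement, document) and that a document is named by a URI string, which is exactly the part left open above. The example.org URIs are hypothetical.

```python
from typing import NamedTuple

class Statement(NamedTuple):
    subj: str
    pred: str
    obj: str

class Stating(NamedTuple):
    statement: Statement
    document: str  # ASSUMPTION: a document is identified by a URI string

# Statements compare equal whenever their three parts are equal.
a = Statement("subj", "pred", "object")
b = Statement("subj", "pred", "object")
same_statement = (a == b)

# Two hypothetical URIs that happen to name the same document.
t1 = Stating(a, "http://example.org/doc1")
t2 = Stating(b, "http://example.org/doc1.rdf")
same_stating = (t1 == t2)

print(same_statement, same_stating)
```

Under naive URI comparison the two statings come out distinct even though the statement and the underlying document are the same, which is why the definition of "document" matters.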

>
>>  by
>> relating the document to a triple. Works fine as far as I can see -- 
>> and useful, to boot.
>
> If you are saying that there are other rational consistent choices the 
> WG could have made, then, speaking for myself, I do not dispute that.  
> I myself, would have made a different choice.  So it seems would you.  
> But that is not the point.
>
> I am having trouble understanding what position you are arguing for, 
> and on what grounds you are making the argument.  So far your argument 
> seems to consist of "I don't understand the WG's position so it must 
> be wrong".


No, it was "I don't see a use case for the WG's position".
That was my lack of understanding, and you must pardon my slowness.
One has now been given.  I don't find it compelling as an argument for
statements as opposed to statings.

> My first objective is to ensure that you at least understand why the 
> WG did what it did.  It seems to me that would put you in a stronger 
> position to disagree with it.

Well, if there was no explanation which made sense at all, that would 
have been an argument, too.

>
>>> I hope this example goes some way to persuading you that the WG is 
>>> not entirely off its trolley in making the proposal that it has.
>>
>> I can't say it does. Maybe we have all our terms backward or 
>> something. or maybe I have missed
>> something obvious above. If there is a modeling problem, then can you 
>> derive something ridiculous?
>
> Have I managed to be clearer this time?

Yes, and you and anyone reading this thread deserves a medal for 
patience.

>
>>> Concerning B, you note the current proposal is unsuitable for the 
>>> ways it has been used in cwm.  That may be so, and therein may lie a 
>>> clue that the representation of rules was not what it was designed 
>>> for.
>>>
>>> The WG was aware of issues such as the "{ }" mechanism in cwm, the 
>>> desire to represent graphs within graphs and the notion of contexts.
>>> It decided that this area was beyond the scope of its current 
>>> charter and has recorded an issue for consideration by a future WG:
>>
>> Indeed.  I don't expect the group to put in {} or the equivalent at 
>> this stage,
>> I was really explaining that my attempts to use reifications in the 
>> current
>> style of the spec didn't work, and I abandoned it - as implementation 
>> experience.
>
> Again, I am confused about the nature of this implementation 
> experience.

I needed quoting, and found that reification did not provide it.
The lack of quoting of the objects of rdf:subject etc  means that the 
reification cannot be used to implement quoting.  Quoting seemed to be 
the use case.
That doesn't mean reification is bad, it just means I was trying to 
figure out
what it is good for.

>  Please don't take this as being rude, but it's the best way I can 
> think of to explain my misunderstanding.  If I take a hammer and try 
> to use it to drive a screw, I may well find this "implementation 
> experience" unsatisfactory, but is that the fault of the hammer?

Ok, I'm looking for the nails.
All I find is people using the hammer to drive in screws.....
but I'm still looking.

>>> As for C, dropping reification all together.  Reification does cause 
>>> confusion and the WG did consider this option, but we do know that 
>>> people use the current reification machinery.
>>
>> Apart from test cases, do we have some axioms or some evidence of 
>> what it is supposed to mean? Pointers?
>
> The formal semantic constraints on its meaning are given in the 
> semantics document:
>
>   http://www.w3.org/TR/rdf-mt/#Reif
>
> which also notes that further semantic constraints can be applied by 
> other layers.  The primer provides an informal guide to its use:
>
>   http://www.w3.org/TR/rdf-primer/#reification
>

Yes.   More or less the same english text. It seems to translate into

{ ?t rdf:subject ?s; rdf:predicate ?p; rdf:object ?o.} =>
{ log:forSome s2, p2, o2.
   s2 = ?s.  p2 =?p.  o2 = ?o.
    a soc:Work; log:semantics log:includes { ?s2 ?p2 ?o2}.
}

> You will find it used in the RDF schema for P3P:
>
>   http://www.w3.org/TR/p3p-rdfschema/#Appendix_RDF-XML

This uses rdf:predicate as a predicate, but I see no evidence that this 
is linked to the reification using rdf:ID or rdf:bagID.

> Here partial reification is used to assert that web sites collect 
> statements of a particular form, as in:
>
>   http://www.w3.org/TR/p3p-rdfschema/#Example-2-2


I had trouble finding the use of reification, but I may need help with 
that.
The examples seemed to use vocabulary like p3p:data and p3p:statement
but the schema seems to define something else.  The Property 
rdf:predicate
is used to indicate the sort of information collected, but I couldn't 
find where
reification is used.  There is no use of it which relates to this 
question.
I'm looking for something which needs de-re semantics of reification.

"This Note has been written to meet the requirement that P3P 1.0 must 
have an RDF schema. It is not intended to be a normative specification. 
Instead, it represents a suggestion by the authors of one possible RDF 
schema for P3P. At the time of writing, the schema described here has 
not benefited from implementation experience."



>
>>>  The note on the RDF schema for P3P for example uses it (though I 
>>> doubt anyone uses the note) and in the jena project we know that 
>>> people use it because not only do we get support calls, but folks 
>>> asked for us to ensure we kept the Jena 1 optimisations supported in 
>>> Jena 2.
>>
>> optimizations? Got a pointer to the details of this? The user's email?
>
>   https://sourceforge.net/mailarchive/message.php?msg_id=2355386
>
> is the best I can do and on its own isn't terribly compelling.  The 
> sourceforge archive
>
>   https://sourceforge.net/mailarchive/forum.php?forum_id=8988
>
> unfortunately loses attachments :( something we didn't know at the 
> time.  On the client lists there are various discussions, e.g. a bug 
> report
>
>   http://groups.yahoo.com/group/jena-dev/message/2105
>
> Have a browse around and you'll get a feel for what comes up.

I will ... I'm offline now.  I'll look that up

> As an aside, experience with these other lists and archiving systems 
> illustrates just what a great job the systeam do with the mailing 
> lists and archives.
>

Thanks!  I have passed that on.

Let me give you another result of my investigations - I asked Eric 
Miller if he knew of people who actually used reification.
It seems the OCLC  Cooperative Online Resource Catalog  system (now 
defunct) did use reification big time.   They really want to keep track 
of which catalogue source actually said what.  They use bagids a la
<rdf:Description rdf:about="&someresource;" rdf:bagID="cataloger">
    <xx:yy>foo</xx:yy>
</rdf:Description>

They actually use the reified triples and actually rebuild data -- 
triples -- from the reifications, filtering on the cataloguer.
So, I popped the [superman] question: Suppose you had the Library of 
Congress cataloging something as written by someone named "Mark Twain", 
and the Popacatapetal High School Library cataloging it as written by 
someone named "Samuel Clemens".
Is it ok that the processing, on finding out that they were the same 
person, concluded that the LoC had catalogued the creator as "Samuel 
Clemens"?  No, apparently that would have been regarded as an error - a 
serious error. And yet it would follow directly from the
lack of quotation of the subject in the reification system:

s1 rdf:subject  <#book14>.
s1 rdf:predicate dc:creator .
s1 rdf:object  <#SamClem>.
<#SamClem> owl:sameIndividualAs <#MarkTwain>.
{ ?x ?p ?y. ?y owl:sameIndividualAs ?z} => { ?x ?p ?z}.
______________________________________
s1 rdf:object  <#MarkTwain>.
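
The derivation above can be replayed with a small Python sketch. It is an illustration under stated assumptions: triples are plain tuples of strings, and apply_substitution is a hypothetical helper that applies only the one rule shown, repeated to a fixed point. It is not how cwm or any real reasoner is implemented.

```python
SAME = "owl:sameIndividualAs"

facts = {
    ("s1", "rdf:subject", "#book14"),
    ("s1", "rdf:predicate", "dc:creator"),
    ("s1", "rdf:object", "#SamClem"),
    ("#SamClem", SAME, "#MarkTwain"),
}

def apply_substitution(triples):
    """Apply { ?x ?p ?y. ?y sameIndividualAs ?z } => { ?x ?p ?z } until stable."""
    closed = set(triples)
    while True:
        aliases = [(y, z) for (y, p, z) in closed if p == SAME]
        new = {(x, p, z)
               for (x, p, y) in closed
               for (alias, z) in aliases
               if y == alias}
        if new <= closed:
            return closed
        closed |= new

closure = apply_substitution(facts)
derived = closure - facts
print(sorted(derived))
```

Because the object of rdf:object is an ordinary, unquoted term, the rule rewrites it like any other object, and the reification now claims something the cataloguer never wrote.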


Eric actually says, "I believe as chair of the early RDF M&S group that 
what the group meant by reification was really quoting". Eric Miller 
2003-03-20T18:30
So while I turned up an example of reification, it is a counter-example 
to the current spec.

>
>>> The WG compromised and decided try to marginalise reification to 
>>> "just another bit of vocabulary" as far as it can.
>>
>> The trouble is, a parser is required to output it when someone puts 
>> an ID on a statement.
>> And putting an ID on a statement may seem, to the uninitiated, to be 
>> a perfectly
>> reasonable thing to do.
>
> This sort of statement is very hard to deal with.  What trouble?  I 
> just don't know what you are trying to say.
>

Trouble? 1. Making RDF  parsers have to do all this reification stuff 
makes RDF much
more complicated.  RDF is supposed to be a mind-bogglingly simple lower
layer.  This sort of thing puts people off adopting it, clutters code, 
(not appreciated
on cellphones and embedded devices) and confuses people.

2. For example, the problem above.  People expect it to do something 
different,
and therefore run into trouble.
Like CORC, I found I needed quoting.  Like CORC, at first I assumed 
that that was what I got from reification.   I am still looking for a 
real example of someone who uses it as it is.

>
>>> It is not part of the concepts document and is mentioned in a low 
>>> key way in schema.  It has to be acknowledged that its special 
>>> treatment in the syntax means that it is singled out to some extent. 
>>>  But then, various interesting alternative approaches to RDF syntax 
>>> are gaining traction.
>>>
>>> Hopefully, careful explanation in the primer will minimise further 
>>> confusion.
>>
>> I would prefer to see it removed from parser conformance requirements 
>> to RDF M&S - or it will become much more difficult to weed out later.
>
> Again I'm not sure what that means - we are not updating M&S so how 
> can we 'remove' something from it - but it may be easy to meet, though 
> in a trivial sense.

Hmmm... maybe we should be rechartering to remove it.

> There are no parser conformance requirements.  We do not even define a 
> notion of parser as we define no processing model.

Very properly, too. What is defined is a set of test cases which 
define equivalent XML and N-Triples files.  This is used as the benchmark 
for parser interop.   I've been putting cwm's parsers through them, and 
they have been very useful.

> We define a grammar, and we illustrate that grammar with test cases.  
> This is a device for augmenting the specs with specific, machine 
> testable examples.  The reification syntax is part of the grammar as 
> it was in M&S.  Unless we remove it, it seems sensible to provide 
> machine checkable examples so that those who do support it, support it 
> properly.

Yes... the only logical thing is to remove it, and it would be easier 
earlier than later, but would involve of course changing RDF M&S.

Tim
Received on Sunday, 23 March 2003 19:58:51 UTC