Re: Another try. from Dan Brickley on 2012-03-01 (public-rdf-wg@w3.org from March 2012)

From: Dan Brickley <danbri@danbri.org>
Date: Thu, 1 Mar 2012 15:18:15 +0100
To: Pat Hayes <phayes@ihmc.us>
Cc: RDF-WG Group <public-rdf-wg@w3.org>
Message-ID: <CAFfrAFpim6h=t_FGgrsPy-vVRbFxJPKShs3051ceVjXR7mRsKA@mail.gmail.com>
On 29 February 2012 05:16, Pat Hayes <phayes@ihmc.us> wrote:
> On Feb 28, 2012, at 10:37 AM, Dan Brickley wrote:
>> On 28 February 2012 16:54, Pat Hayes <phayes@ihmc.us> wrote:
>>> First, it was abundantly clear from the very beginning of the RDF WG activity that RDF/S (and DAML/OIL and subsequently OWL) were understood to be timeless logical languages.
>>
>> Put it this way...
>>
>> The first public RDF Working Draft, http://www.w3.org/TR/WD-rdf-syntax-971002/, has this example:
>>
>> <?namespace href="http://docs.r.us.com/bibliography-info" as="bib"?>
>> <?namespace href="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
>> <RDF:serialization>
>>  <RDF:assertions href="http://www.bar.com/some.doc">
>>    <bib:author>
>>      <RDF:resource>
>>        <bib:name>John Smith</bib:name>
>>        <bib:email>john@smith.com</bib:email>
>>        <bib:phone>+1 (555) 123-4567</bib:phone>
>>      </RDF:resource>
>>    </bib:author>
>>  </RDF:assertions>
>> </RDF:serialization>
>>
>> While the document's author may be eternally the same, John's name,
>> email and phone are likely more volatile. While the schema's author
>> could've baked temporally-qualifying observations into the prose of
>> the property definition (e.g. 'current or former...'), in practice few
>> do this.
>
> Hmm. OK, but why is this, I wonder? Several possible answers:
> 1. Users might feel that many properties, while liable to change, are in fact stable enough that it's worth treating them as timeless. (People's names don't change very often, and the cases that do arise (marriage, usually) are of a well-known and fairly predictable sort.)
> 2. Users assume that the information is being recorded at a known time and will be timestamped somehow, and that the necessary updating will be done semi-automatically (e.g. figuring out your current age from the recorded age and the date it was recorded).
> 3. Like 2, but users just don't care about the information getting old and decaying, because nothing important is going to be inferred from it. (FOAF age as opposed to age recorded by the SSA.)
> 4. Users just don't think about the issue at all, and aren't even thinking about time-relative information versus stable timeless information.
>
> Any insight into which of these (or any others) is closest to the truth?

I think you're in the right area here. Lots of sites are
database-backed. So on the inside a site might know my exact date of
birth; on the outside it gives a vaguer age-in-years whenever a page
is requested. So I think because of the global near-instant
availability of information fresh from source, concerns about copies
going stale are often ignored. If you want the latest version, just
read it from the Web. But considerations vary a lot between domains,
as you point out. It's one thing for Myspace to call me 39 when I'm 40
already, ... quite another when we're dealing with complex
evidence-sharing amongst scientists. If I had to pick a single reason,
... it's that people just didn't think a lot about this, it didn't
cause enough people big enough problems (yet), and present-tense
properties can be convenient in other ways.
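
A minimal sketch of the database-backed case above, assuming a site that
stores an exact date of birth and derives age-in-years on each request.
The dates and the function name are hypothetical, chosen only to
reproduce the "39 when I'm 40" situation.

```python
from datetime import date


def age_in_years(born: date, today: date) -> int:
    """Whole years elapsed between `born` and `today`."""
    years = today.year - born.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (born.month, born.day):
        years -= 1
    return years


# Hypothetical record: the site keeps the exact date of birth internally.
born = date(1972, 1, 15)

# A copy of the page taken at one moment freezes the derived value...
copied_on = date(2012, 1, 10)
stale_age = age_in_years(born, copied_on)

# ...whereas re-reading from the source recomputes it at request time.
requested_on = date(2012, 3, 1)
fresh_age = age_in_years(born, requested_on)

print(stale_age, fresh_age)
```

The stored fact (date of birth) is stable; only the derived,
present-tense property (age) goes stale when it is copied.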

> BTW, I can attest that what one might call 'hard' users of ontological data, eg in bioinformatics and health sciences, are very much concerned about this issue and are getting tied in knots over it.

Yup, definitely. See e.g.
http://www.w3.org/2011/prov/wiki/WorkingDrafts and nearby.

>> <?namespace href="http://www.nist.gov/RDFschema" as="NIST"?>
>> <?namespace href="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
>> <RDF:serialization>
>>  <RDF:assertions href="John_Smith">
>>    <NIST:weight>
>>      <RDF:resource id="weight_001">
>>        <NIST:units href="#pounds"/>
>>        <RDF:PropValue>200</RDF:PropValue>
>>      </RDF:resource>
>>    </NIST:weight>
>>  </RDF:assertions>
>> </RDF:serialization>
>>
>> ... not to mention his weight.  (And I wish I weighed now what I
>> weighed in 1997.)
>>
>> Anyway, this pretty much set the tone for everything that followed, in
>> terms of RDF-in-practice.
>>
>> I know perfectly well you could've made us a lovely temporal logic or
>> whatever instead; but the RDF Core job wasn't to do that, but to come
>> up with a more formal story that covered as much as possible of
>> RDF-in-practice. Which it did, except for the aspect that people
>> stubbornly keep defining and using properties for stuff that changes,
>> even if the small print says not to. We went as far as we could without
>> creating another place to stuff information. It seems now we're
>> considering doing just that.
>
> Put another way, our technology has already created such a place for us, and we are considering making it official.

Yes!

Maybe this is the wrong thread to ask in, but I'm so backlogged I'm
not sure where else. A question: if we do give ourselves this extra
place for qualifiers, ... do you see any scope that property/value
pairs could live there too, rather than solely using it for
timestamps?

This came up in the schema.org discussions recently: the suggestion
was that almost any simple relationship (e.g. between an actor and a
movie they star in) could usefully be (lowercase) reified. This
situation fragments the RDF vocabulary design world, since the schema
designer has to guesstimate in advance which properties will be worth
qualifying, and come up with an intermediate-entity-based design
instead.

Examples:
1.) If you look at DBpedia (driven from Wikipedia), you find
something like
<http://dbpedia.org/page/The_Sixth_Sense>  dbp:starring
<http://dbpedia.org/resource/Bruce_Willis> .

...whereas the Freebase folk
(http://www.freebase.com/view/en/the_sixth_sense and
http://rdf.freebase.com/rdf/en.the_sixth_sense ... last time I looked
carefully) have represented this stuff using an extra level of
indirection. Neither is wrong/right. Often it's OK to know who starred
in the movie; at other times, you want to know the character name
etc.

It's very common to feel this pain of "ok, there's a basic
relationship here, but also some extra stuff..." and RDF doesn't do a
lot to help vocab and app designers.

2.) Addressbook formats
It's standard for addressbooks to allow users to keep notes about one
of their contacts and phone numbers. But consider this in RDF, and the
workflow of 'who decides what, when':

In my iPhone right now, I brought up the "Pat Hayes" record and hit
'edit' on phone number and email address. In both cases, it gave a
list of commonly expected fields, but also let me (usefully!) type in
custom values too. See
http://www.flickr.com/photos/danbri/6797635740/in/photostream

...this is another example where secondary aspects like 'mobile / home
/ work / main / home fax / pager / [Add Custom Label], ...' are quite
naturally read as annotations on a basic central property,
'phoneNumber'. And this is common in e.g. XML representations. With
RDF, you have to choose up front whether this will be a 'phoneNumber'
property pointing to some kind of intermediate entity that is
decorated with info (home, fax, custom-stuff-here), ... or whether to
point straight to the number. And in the latter case, whether to use
the most standard, well-known property, or to replace it with the
perhaps more informative (and obscure) sub-property instead.
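
The three design choices above can be sketched with plain tuples
standing in for triples. This is only an illustration: every name
(ex:pat, ex:phoneNumber, ex:phone, ex:value, ex:label,
ex:mobilePhoneNumber) and the phone number itself are made up, and no
real vocabulary or RDF library is implied.

```python
# Triples are plain (subject, predicate, object) tuples; all names and
# the number below are hypothetical.
NUMBER = "+1-555-0100"

# Design A: point straight at the number with the well-known property.
direct = {("ex:pat", "ex:phoneNumber", NUMBER)}

# Design B: point at an intermediate entity that carries the extra notes.
intermediate = {
    ("ex:pat", "ex:phone", "_:p1"),
    ("_:p1", "ex:value", NUMBER),
    ("_:p1", "ex:label", "mobile"),
}

# Design C: use a more specific (and more obscure) sub-property.
sub_property = {("ex:pat", "ex:mobilePhoneNumber", NUMBER)}
SUB_PROPERTY_OF = {"ex:mobilePhoneNumber": "ex:phoneNumber"}


def flatten_intermediate(triples):
    """Recover Design A triples from Design B, dropping the labels."""
    values = {s: o for s, p, o in triples if p == "ex:value"}
    return {
        (s, "ex:phoneNumber", values[o])
        for s, p, o in triples
        if p == "ex:phone" and o in values
    }


def generalise(triples):
    """Recover Design A triples from Design C via the sub-property map."""
    return {(s, SUB_PROPERTY_OF.get(p, p), o) for s, p, o in triples}


print(flatten_intermediate(intermediate) == direct)
print(generalise(sub_property) == direct)
```

Both richer designs can be reduced to the simple one, but a consumer
has to know in advance which shape the publisher picked, which is the
fragmentation described above.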

If we open up a place in RDF to stuff date qualifiers at fine grain,
my expectation is that we'll see a kind of gold rush, and we'll have
our customers saying, "OK, so I can now somehow qualify each of my RDF
claims by datestamp? That's lovely... can I keep some other notes
there too please? Pretty please?". And if it works in the tools they
care about, they'll probably just do it anyway, somehow.
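
To make the "gold rush" concrete, here is a rough sketch that assumes
the extra place is simply a bag of property/value pairs attached to
each statement. The Statement class, the ex: names and the qualifier
values are all hypothetical illustrations, not a proposal for how the
WG would actually define such a slot.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    subject: str
    predicate: str
    obj: str
    # The hypothetical "extra place": free-form property/value pairs.
    qualifiers: dict = field(default_factory=dict)

    def triple(self):
        """The plain triple, ignoring whatever sits in the extra place."""
        return (self.subject, self.predicate, self.obj)


statements = [
    Statement(
        "ex:The_Sixth_Sense", "ex:starring", "ex:Bruce_Willis",
        {"ex:asOf": "2012-03-01", "ex:character": "Malcolm Crowe"},
    ),
    Statement(
        "ex:pat", "ex:phoneNumber", "+1-555-0100",
        {"ex:asOf": "2012-03-01", "ex:label": "mobile"},
    ),
]

# Consumers that only understand triples still get the basic relationship.
plain = [s.triple() for s in statements]

# Everything in the slot that is not a datestamp is "other notes".
non_date = sorted(
    {k for s in statements for k in s.qualifiers if k != "ex:asOf"}
)
print(plain)
print(non_date)
```

Once the slot exists, nothing in the data structure distinguishes a
datestamp from a character name or a phone label, so the line has to
be drawn by convention rather than by the mechanism.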

Where should we draw the line? I really don't know... but I am
convinced this is a major frustration with RDF in many usage areas.

cheers,

Dan
Received on Thursday, 1 March 2012 14:18:50 UTC