- From: Dan Brickley <danbri@danbri.org>
- Date: Thu, 1 Mar 2012 15:18:15 +0100
- To: Pat Hayes <phayes@ihmc.us>
- Cc: RDF-WG Group <public-rdf-wg@w3.org>
On 29 February 2012 05:16, Pat Hayes <phayes@ihmc.us> wrote:
> On Feb 28, 2012, at 10:37 AM, Dan Brickley wrote:
>> On 28 February 2012 16:54, Pat Hayes <phayes@ihmc.us> wrote:
>>> First, it was abundantly clear from the very beginning of the RDF WG
>>> activity that RDF/S (and DAML/OIL and subsequently OWL) were
>>> understood to be timeless logical languages.
>>
>> Put it this way...
>>
>> The first public RDF Working Draft, http://www.w3.org/TR/WD-rdf-syntax-971002/
>>
>> <?namespace href="http://docs.r.us.com/bibliography-info" as="bib"?>
>> <?namespace href="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
>> <RDF:serialization>
>>   <RDF:assertions href="http://www.bar.com/some.doc">
>>     <bib:author>
>>       <RDF:resource>
>>         <bib:name>John Smith</bib:name>
>>         <bib:email>john@smith.com</bib:email>
>>         <bib:phone>+1 (555) 123-4567</bib:phone>
>>       </RDF:resource>
>>     </bib:author>
>>   </RDF:assertions>
>> </RDF:serialization>
>>
>> While the document's author may be eternally the same, John's name,
>> email and phone are likely more volatile. While the schema's author
>> could've baked temporally-qualifying observations into the prose of
>> the property definition (e.g. 'current or former...'), in practice few
>> do this.
>
> Hmm. OK, but why is this, I wonder? Several possible answers.
>
> 1. Users might feel that many properties, while liable to change, in
>    fact are stable enough that it's worth treating them as timeless.
>    (People's names don't change very often, and the cases that do arise
>    (marriage, usually) are of a well-known and kind of predictable
>    sort.)
> 2. Users assume that the information is being recorded at a known time
>    and will be timestamped somehow, and the necessary updating will be
>    done semiautomatically (e.g. figuring out your current age from the
>    recorded age and the date it was recorded.)
> 3. Like 2, but users just don't care about the information getting old
>    and decaying, because nothing important is going to be inferred from
>    it. (FOAF age as opposed to age recorded by the SSA.)
> 4. Users just don't think about the issue at all, and aren't even
>    thinking about time-relative information versus stable timeless
>    information.
>
> Any insight into which of these (or any others) is closest to the truth?

I think you're in the right area here. Lots of sites are
database-backed. So on the inside a site might know my exact date of
birth; on the outside it gives a vaguer age-in-years whenever a page is
requested. So I think because of the global near-instant availability
of information fresh from source, concerns about copies going stale are
often ignored. If you want the latest version, just read it from the
Web.

But considerations vary a lot between domains, as you point out. It's
one thing for Myspace to call me 39 when I'm 40 already, ... quite
another when we're dealing with complex evidence-sharing amongst
scientists.

If I had to pick a single reason, ... it's that people just didn't
think a lot about this, it didn't cause enough people big enough
problems (yet), and present-tense properties can be convenient in other
ways.

> BTW, I can attest that what one might call 'hard' users of ontological
> data, e.g. in bioinformatics and health sciences, are very much
> concerned about this issue and are getting tied in knots over it.

Yup, definitely. See e.g. http://www.w3.org/2011/prov/wiki/WorkingDrafts
and nearby.

>> <?namespace href="http://www.nist.gov/RDFschema" as="NIST"?>
>> <?namespace href="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
>> <RDF:serialization>
>>   <RDF:assertions href="John_Smith">
>>     <NIST:weight>
>>       <RDF:resource id="weight_001">
>>         <NIST:units href="#pounds"/>
>>         <RDF:PropValue>200</RDF:PropValue>
>>       </RDF:resource>
>>     </NIST:weight>
>>   </RDF:assertions>
>> </RDF:serialization>
>>
>> ... not to mention his weight. (And I wish I weighed now what I
>> weighed in 1997.)
>>
>> Anyway, this pretty much set the tone for everything that followed, in
>> terms of RDF-in-practice.
>>
>> I know perfectly well you could've made us a lovely temporal logic or
>> whatever instead; but the RDF Core job wasn't to do that, but to come
>> up with a more formal story that covered as much as possible of
>> RDF-in-practice. Which it did, except for the aspect that people
>> stubbornly keep defining and using properties for stuff that changes,
>> even if the smallprint says not to. We went as far as we could without
>> creating another place to stuff information. It seems now we're
>> considering doing just that.
>
> Put another way, our technology has already created such a place for
> us, and we are considering making it official.

Yes!

Maybe this is the wrong thread to ask in, but I'm so backlogged I'm not
sure where else. A question: if we do give ourselves this extra place
for qualifiers, ... do you see any scope that property/value pairs
could live there too, rather than solely using it for timestamps?

This came up in the schema.org discussions recently: the suggestion was
that almost any simple relationship (e.g. between an actor and a movie
they star in) could usefully be (lowercase) reified. This situation
fragments the RDF vocabulary design world, since the schema designer
has to guesstimate in advance which properties will be worth
qualifying, and come up with an intermediate-entity-based design
instead.

Examples:

1.) DBpedia vs. Freebase

So if you look at DBpedia (driven from Wikipedia), you find something
like

<http://dbpedia.org/page/The_Sixth_Sense> dbp:starring <http://dbpedia.org/resource/Bruce_Willis> .

...whereas the Freebase folk
(http://www.freebase.com/view/en/the_sixth_sense and
http://rdf.freebase.com/rdf/en.the_sixth_sense ... last time I looked
carefully) have represented this stuff using an extra level of
indirection.

Neither is wrong/right. Often it's OK to know who starred in the movie;
at other times, you want to know the character name etc. It's very
common to feel this pain of "ok, there's a basic relationship here, but
also some extra stuff..." and RDF doesn't do a lot to help vocab and
app designers.

2.) Addressbook formats

It's standard for addressbooks to allow users to keep notes about their
contacts and phone numbers. But consider this in RDF, and the workflow
of 'who decides what, when':

In my iPhone right now, I brought up the "Pat Hayes" record and hit
'edit' on phone number and email address. In both cases, it gave a list
of commonly expected fields, but also lets me (usefully!) type in
custom values too. See
http://www.flickr.com/photos/danbri/6797635740/in/photostream

...this is another example where secondary aspects like 'mobile / home
/ work / main / home fax / page / [Add Custom Label], ...' are quite
naturally read as annotations on a basic central property,
'phoneNumber'. And such is common in e.g. XML representations.

With RDF, you have to choose up front whether this will be a
'phoneNumber' property pointing to some kind of intermediate entity
that is decorated with info (home, fax, custom-stuff-here), ... or
whether to point straight to the number. And in the latter case,
whether to use the most standard, well-known property, or to replace it
with a perhaps more informative (and obscure) sub-property instead.

If we open up a place in RDF to stuff date qualifiers at fine grain, my
expectation is that we'll see a kind of gold rush, and we'll have our
customers saying, "OK, so I can now somehow qualify each of my RDF
claims by datestamp? That's lovely... can I keep some other notes there
too please? Pretty please?". And if it works in the tools they care
about, they'll probably just do it anyway, somehow.

Where should we draw the line? I really don't know... but I am
convinced this is a major frustration with RDF in many usage areas.

cheers,

Dan
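
As an illustration of the address-book trade-off described in the message above, here is a minimal sketch of the two modelling choices, with triples written as plain Python tuples. It is not taken from the original thread: every identifier (ex:pat, ex:phoneNumber, ex:value, ex:label, _:phone1) is hypothetical, and the phone number is simply the one from the 1997 Working Draft example.

```python
# Illustrative only: triples as plain (subject, predicate, object) tuples.
# All names (ex:pat, ex:phoneNumber, ex:value, ex:label) are hypothetical.

def direct_design(person, number):
    """Point straight at the number: simple, but nowhere to hang a label."""
    return [(person, "ex:phoneNumber", number)]


def intermediate_design(person, number, label, node_id="_:phone1"):
    """Route through an intermediate node that can carry annotations."""
    return [
        (person, "ex:phoneNumber", node_id),
        (node_id, "ex:value", number),
        (node_id, "ex:label", label),
    ]


def phone_numbers(triples, person):
    """Return a person's numbers, whichever of the two designs was used."""
    numbers = []
    for s, p, o in triples:
        if s == person and p == "ex:phoneNumber":
            # If the object is an intermediate node, follow it to ex:value;
            # otherwise treat the object itself as the number.
            values = [v for (n, q, v) in triples if n == o and q == "ex:value"]
            numbers.extend(values or [o])
    return numbers


direct = direct_design("ex:pat", "+1 (555) 123-4567")
indirect = intermediate_design("ex:pat", "+1 (555) 123-4567", "mobile")
```

Note that phone_numbers has to understand both shapes to answer the same question, which is one way of seeing the fragmentation described above: the choice is made up front by the schema designer, and every consumer then has to cope with it.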
Received on Thursday, 1 March 2012 14:18:50 UTC