- From: Dan Brickley <danbri@danbri.org>
- Date: Thu, 13 Oct 2011 15:21:13 +0100
- To: Pat Hayes <phayes@ihmc.us>
- Cc: Richard Cyganiak <richard@cyganiak.de>, RDF Working Group WG <public-rdf-wg@w3.org>
On 13 October 2011 14:29, Pat Hayes <phayes@ihmc.us> wrote:
> On Oct 13, 2011, at 6:10 AM, Richard Cyganiak wrote:
>
> Indeed, and that was DELIBERATE. A contextual logic (in the sense you are using it) simply does not work as a Web logic. For some discussion of this point, see http://www.ihmc.us/users/phayes/IKL/GUIDE/GUIDE.html#LogicForInt . In fact, a contextual logic does not work for ontologies in general. If the truth of an assertion depends on the context in which it is asserted, and if this context is not available when it is read, then it is USELESS. Or maybe worse than useless.

Are you suggesting it is really practical and feasible for every assertion to be so explicit as to never need a 'best-before' date? Particularly in such a nuance-free language as RDF, I find this hard to believe. We can go down the slippery slope towards only ever describing events, since their descriptions don't go stale, but in an open world (where relevant facts may always be missing), the utility of having a big pile of event descriptions is often questionable.

>> Many of our problems stem from that.
>>
>> I'll give examples.
>>
>> :G2010 {:alice :age 29.}
>> :G2011 {:alice :age 30.}
>>
>> Individually, each of those graphs is true (at a certain point in time). If taken together, an inconsistency is inferred (assuming :age is a functional property):
>>
>> :alice :age 29, 30.
>>
>> By merging the two graphs, we have discarded the contextual information.
>
> In RDF, that "contextual information" was never there in the first place. This is BAD RDF.

You may as well call the Web "bad"; but it's not going away. Nor is simple factual data published in Web pages --- a big use case for our stuff.

Practical example: (repeating something just aired during the F2F/telecon)

* in early FOAF stuff we tried to urge people towards decontextualised data that won't go stale. So for example here, to describe date of birth / events, rather than 'age'.
* FOAF now has age?
Why --- because Peter Mika asked for it, because he was involved with sites (e.g. MySpace) that are publishing the 'age' of users in HTML.
* Should we be mailing MySpace and telling them to publish date/year of birth instead of age? Maybe it'd be good for The Youth to be forced to do more mental arithmetic? But standards != advocacy; we can't fix the world from a committee.
* with the rise of RDFa (and microformats, microdata etc) many factual assertions will come from such (database-driven) sites.

So "bad RDF" is perhaps not the most helpful perspective here. Is there any value in going from sites publishing stuff like

  <p>Dan is 39</p>

to

  <p typeof="Person"><a href="http://danbri.org/" rel="homepage" property="firstName">Dan</a> is <span property="age">39</span></p>

? ... I think so. But it puts work onto the consumer of the data: we need to remember where we got it. And maybe a whole pile of other info too. Anyone doing data aggregation is familiar with such requirements, even if they are hard to express in logical languages. This doesn't make either bad; but we have work to do bridging between the logical and data-hacking perspectives.

And maybe this also puts some work onto the RDF community: that we should make some experiments (yes, research + hacking, not standards) around annotating properties, to indicate that our property 'age' is more """volatile""" than our property 'dateOfBirth'. And perhaps even specifically that 'age' goes stale relatively quickly (in whatever level of detail suits application demands). For some as-yet-undocumented notion of """volatile""".

>> This shows that the graph merge operation is *not truth-preserving* – not *valid* in the formal sense – *if* the merged graphs have different contexts.
>
> No, it shows that they don't have contexts. Graph merging is truth preserving, precisely because RDF is *not* a contextual logic.
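
For readers who want the quoted merge example spelled out, here is a minimal sketch in plain Python (no RDF library). The tuple representation of triples and the helper names `merge` and `functional_violations` are assumptions made for illustration; they are not part of any RDF toolkit or spec.

```python
from collections import defaultdict

# Each graph is modelled as a set of (subject, predicate, object) tuples.
G2010 = {(":alice", ":age", 29)}
G2011 = {(":alice", ":age", 30)}


def merge(*graphs):
    """Merge graphs by plain set union.

    This is adequate only because the example has no blank nodes;
    a real RDF merge must keep blank nodes from different graphs apart.
    """
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def functional_violations(graph, functional_props):
    """Map (subject, predicate) to its values wherever a property that is
    assumed to be functional ends up with more than one distinct value."""
    values = defaultdict(set)
    for s, p, o in graph:
        if p in functional_props:
            values[(s, p)].add(o)
    return {key: vals for key, vals in values.items() if len(vals) > 1}


merged = merge(G2010, G2011)
# Treating :age as functional is the assumption made in the quoted example.
conflicts = functional_violations(merged, {":age"})
```

Each graph on its own passes the check; only the merged set reports a conflict for `:alice`, because the union keeps no record of which graph each triple came from.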
RDF is not a contextual logic; it is and should remain a simple-minded language that can be used to make fairly basic assertions about a/the world. RDF's cartoon universe has no notion of time nor change. However the people using RDF have to build systems that bridge this simplified perspective back into our real lives, software applications, ever-changing datasets etc., where time and change are constantly messing with us. This is (as I think Richard articulated quite nicely) at the heart of our problem.

RDF's worldview is super-super-simplified. To live with this simplicity, we need some tricks, techniques and so on. What we have to figure out is which of those tricks and techniques are (something like) data-hacking folklore and which can be specified using the other instruments of W3C committee-dom, namely testcases, computer languages, semantics specs and so on.

It will do us no good at all to just stand here and say "don't use properties like 'age' ...". What we can say is "if you use properties like 'age', ... consider managing and sharing your data with the following conventions.".

This theme btw underpins some of my concerns with Sandro's advocacy for a simple "I got these triples from this IRI" version of WebArch-for-SemWeb. In too many real-world scenarios, we'll want to keep a whole packet of information telling us where a bunch of data came from. And it might have come from the same basic IRI several times under varying circumstances. (Specs like http://www.w3.org/TR/HTTP-in-RDF10/ are a good start at keeping that "how I transacted with the Web, and what I got back" data diary.)

All this doesn't mean that data can only ever be considered in contexts. Just that we need to get better, much better, at providing all kinds of hints to help application developers and consuming apps flatten things down from contextualised and quoted representations, into simple flat truthy assertions.
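
A rough sketch of what one such flattening might look like, in plain Python. Everything here is an assumption made for illustration: the `Observation` record layout, the `VOLATILE` set standing in for the as-yet-undocumented property annotation, the one-year staleness window, and the "most recent retrieval wins" policy. None of it is a specified convention.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """A triple plus a little of the 'where did I get this' packet."""
    subject: str
    predicate: str
    value: object
    source: str      # IRI the data was fetched from (hypothetical below)
    retrieved: date  # when it was fetched


# Hypothetical marker for properties whose values go stale quickly.
VOLATILE = {"age"}


def flatten(observations, today, max_age_days=365):
    """Collapse contextualised observations into flat triples.

    Non-volatile properties are kept as-is. For volatile properties,
    stale observations are dropped and only the most recently retrieved
    value per (subject, predicate) survives.
    """
    flat = set()
    latest = {}
    for ob in observations:
        if ob.predicate not in VOLATILE:
            flat.add((ob.subject, ob.predicate, ob.value))
            continue
        if (today - ob.retrieved).days > max_age_days:
            continue  # too old to trust for a volatile property
        key = (ob.subject, ob.predicate)
        if key not in latest or ob.retrieved > latest[key].retrieved:
            latest[key] = ob
    flat.update((ob.subject, ob.predicate, ob.value) for ob in latest.values())
    return flat


SRC = "http://example.org/alice"  # hypothetical source IRI
observations = [
    Observation(":alice", "age", 29, SRC, date(2010, 10, 13)),
    Observation(":alice", "age", 30, SRC, date(2011, 10, 13)),
    Observation(":alice", "dateOfBirth", "1981-05-01", SRC, date(2010, 10, 13)),
]

flat_2011 = flatten(observations, today=date(2011, 10, 13))
flat_2013 = flatten(observations, today=date(2013, 1, 1))
```

With these sample records, flattening as of October 2011 keeps a single age value alongside the date of birth, while flattening as of 2013 keeps only the date of birth. A different application could reasonably pick a different window or policy.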
We will make different flattenings under different circumstances, depending on risk scenarios, data availability and other worldly constraints. This is natural and healthy, and leaves RDF as simple propositional content while admitting that there is (e.g. via SPARQL) a rich set of data management practices around it that absolutely do need to deal pragmatically with time, change and provenance.

cheers,

Dan
Received on Thursday, 13 October 2011 14:21:52 UTC