Re: Graphs and Being and Time from Dan Brickley on 2011-02-23 (public-rdf-wg@w3.org from February 2011)

From: Dan Brickley <danbri@danbri.org>
Date: Wed, 23 Feb 2011 22:54:10 +0100
To: Pat Hayes <phayes@ihmc.us>
Cc: public-rdf-wg@w3.org
Message-ID: <AANLkTim1UJPO5fjb=8bt46d2rx=CdJEAGM1ogW9iUuzN@mail.gmail.com>
On 23 February 2011 22:46, Pat Hayes <phayes@ihmc.us> wrote:
>
> On Feb 23, 2011, at 2:43 PM, Dan Brickley wrote:
>
>> On 23 February 2011 21:13, Pat Hayes <phayes@ihmc.us> wrote:
>> [...]
>>> What we need is the notion of a 'graph token' (or some other terminology: see below for more on terminology), meaning an actual representation of an RDF graph.
>> [...]
>>
>> I like this thinking. A natural term here might be 'document'.
>
> Hmmm. Maybe, but I actually avoided that term deliberately, partly to avoid getting into debates about whether or not some chunk of a quad store counted as a document (for example) and partly because it seems reasonable to allow some kinds of document to contain several graph tokens, or even to allow a notion of graph token that is spread over several documents. I suspect, reading on, that you have in mind a more general notion of document than I was thinking of (my paradigm "document" is an RDF/XML file), in which case we are probably in agreement over everything except terminology.

If we can avoid trying to define 'document', we'll save a lot of time,
true... but for lots of developers there's a concrete notion
roughly around 'data item', 'record', 'file', 'document' that this is
close to.
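
A minimal sketch of the abstract graph vs. graph token distinction,
assuming a made-up one-triple-per-line syntax, made-up ex:/foaf: names
and no blank nodes (all hypothetical, purely for illustration). Two
different byte sequences (tokens, or 'documents') encode the same
abstract graph, and only the tokens can be hashed:

```python
import hashlib

# A triple is (subject, predicate, object); an abstract graph is a set of triples.
Triple = tuple[str, str, str]


def parse_lines(text: str) -> frozenset[Triple]:
    """Toy parser for a hypothetical one-triple-per-line syntax.

    Assumes whitespace-separated terms, '#' comment lines and no blank
    nodes, so plain set equality is enough to compare graphs.
    """
    triples = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        s, p, o = line.split(None, 2)
        triples.add((s, p, o))
    return frozenset(triples)


# Two concrete tokens: same claims, different bytes.
token_a = 'ex:dan foaf:givenName "Dan"\nex:dan foaf:knows ex:pat\n'
token_b = '# same claims, other order\nex:dan foaf:knows ex:pat\nex:dan foaf:givenName "Dan"\n'

graph_a = parse_lines(token_a)
graph_b = parse_lines(token_b)

# The abstract graphs are identical...
same_graph = graph_a == graph_b

# ...but the tokens are distinct byte sequences with distinct digests.
digest_a = hashlib.sha256(token_a.encode("utf-8")).hexdigest()
digest_b = hashlib.sha256(token_b.encode("utf-8")).hexdigest()
same_bytes = digest_a == digest_b

print("same abstract graph:", same_graph)
print("same token bytes:   ", same_bytes)
```

Properties such as a format, a digest or a byte length attach to the
token, while set equality is a question about the graph it encodes.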

>> I've
>> tended to approach things along lines that "documents [in certain
>> social/technical/protocol settings] express claims about the world".
>> Documents have concrete formats, and security-related characteristics
>> e.g. they can be hashed. And beyond this, document types have
>> constraints that relate to communication needs rather than abstract
>> models of the world. In pure RDF/OWL we can say in general that people
>> have 'given names'; however when thinking about document types, things
>> are more imperative --- something is a properly shaped ShippingOrder
>> if it is a document that carries the right kinds of claim, e.g.
>> including some claims about the "given name" of its creator. A common
>> developer frustration with practical RDF and OWL is that it is so
>> passive and declarative and non-committal: our schemas and ontologies
>> are essentially dictionaries
>
>  Dictionaries??? (Crosses self to ward off the devil speaking.)

metaphorically ... they describe the pieces we use to make sentences,
without telling us anything more about what to say.

When people looked at RDF after immersion in the more uptight world of
DTDs and XML schemas, this was ... strange. 'Dictionary' was the best
I could come up with to give this oddity a more familiar form.
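
A rough sketch of the contrast, assuming a hypothetical vocabulary and
a hypothetical ex:ShippingOrder document type (none of these names come
from a real schema). The 'dictionary' check only asks whether known
words are used; the document-type check asks whether the right pattern
of claims is present:

```python
Triple = tuple[str, str, str]

# Hypothetical vocabulary: the "dictionary" only lists the words available.
VOCABULARY = {"rdf:type", "ex:creator", "foaf:givenName"}


def uses_known_terms(graph: set[Triple]) -> bool:
    """Dictionary-style check: every predicate is a known word."""
    return all(p in VOCABULARY for _, p, _ in graph)


def is_shipping_order(graph: set[Triple], doc: str) -> bool:
    """Document-type check (illustrative rule, not a standard):
    the document is typed ex:ShippingOrder and at least one of its
    creators has a foaf:givenName claim.
    """
    if (doc, "rdf:type", "ex:ShippingOrder") not in graph:
        return False
    creators = {o for s, p, o in graph if s == doc and p == "ex:creator"}
    named = {s for s, p, _ in graph if p == "foaf:givenName"}
    return bool(creators & named)


complete = {
    ("ex:order1", "rdf:type", "ex:ShippingOrder"),
    ("ex:order1", "ex:creator", "ex:dan"),
    ("ex:dan", "foaf:givenName", '"Dan"'),
}
# Omitting data is never an error as far as the vocabulary is concerned.
sparse = {("ex:order1", "rdf:type", "ex:ShippingOrder")}

for name, graph in (("complete", complete), ("sparse", sparse)):
    print(name, uses_known_terms(graph), is_shipping_order(graph, "ex:order1"))
```

Both graphs pass the vocabulary check; only the first counts as a
properly shaped shipping order under the assumed rule.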

>> , and don't force data publishers to stick
>> to any particular content structures. You can always omit data, always
>> add data, ... RDF only really cares about the truth and not
>> contradicting yourself. Document-centric schema languages, by
>> contrast, have lots more ways of screwing up: missing data, missing
>> patterns of data, ... failings of communication rather than failings
>> of description. So these are legitimate, well grounded developer
>> expectations which RDF at the moment just bypasses. Instead of
>> relatively predictable doc format or OO structures, we offer only a
>> chaotic bundle of triples; unconstrained by any human or machine
>> protocol except in that they are gently nudged towards being truthful.
>>
>> As Edd Dumbill put it (http://times.usefulinc.com/#13:13 via
>> http://danbri.org/words/page/27?sioc_type=user&sioc_id=22 )
>>
>> "Processing RDF is therefore a matter of poking around in this graph.
>> Once a program has read in some RDF, it has a ball of spaghetti on its
>> hands. You may like to think of RDF in the same way as a hashtable
>> data structure — you can stick whatever you want in there, in whatever
>> order you want."
>>
>> I think this goes to the heart of developer frustration with RDF.
>
> OK, so this is an interesting and quite deep issue that you are raising here, about RDF having an unexpectedly 'loose' organization, almost a total lack of organization. Without commenting either way on this issue, I would like to emphasize that it is NOT what I was talking about.

Yup, sorry it's "things I had bottled up and your comments dislodged..."...

>>
>> We have had an awkward narrative gap since 1997 regarding how RDF
>> relates to XML and other document-oriented schema systems. By making
>> the notion of a document more explicit in the RDF specs, I think we
>> get a bit closer to bridging that gap and explaining how the two
>> approaches can be complementary. RDF lets us define abstract
>> structures for modeling the world; specific uses of RDF in document
>> formats encode a set of claims about the world. And other non-RDF doc
>> formats often do the same, often using schema languages that focus on
>> the document side rather than the abstract model. Following this line
>> of thinking we can rebuild some of the machinery of DTDs and XML
>> schema over the top of RDF's machinery, by defining types of RDF
>> document in terms of the patterns of claims it encodes.
>
> Maybe. But even further from what I was talking about :-)
>
> But this might be awfully relevant to the JSON syntax decisions, seems to me (?)

In that developers are hard to please. They also like the scruffy,
unconstrained side to JSON :)

>> Ok that's a bit much to 'standardise' but it's my sense for where
>> we're collectively heading anyway. Just as practical RDF systems
>> always invent some way of keeping track of where claims/triples came
>> from, they often have some ad hoc mechanisms for poking around in some
>> RDF graph instance to see what kind of a description it is: is it an
>> addressbook description, info about a calendar, some chunk of a
>> thesaurus or perhaps some claims about a geographical point of
>> interest? By making the notion of document explicit in RDF, we open
>> things up for these unarticulated but useful concepts to be used in
>> code, documentation and data. Without it, we have this kind of awkward
>> situation where we have documents that try to ignore the 'bunch of
>> bytes' side of their character and pretend to be purely mathematical
>> entities. RDF in practice has these two sides to its personality, and
>> I think we'll do better explaining them if we name and describe them
>> in the specs...
>
> Like I say, maybe. But I was only wanting to get the basic, simple distinction made between an abstract RDF graph and a concrete actual token of that same RDF graph. I guess however that this might impact your goal here in that it would make sense to describe a graph token/resource/document as being written in some concrete "document syntax", which simply would not make sense for an abstract RDF graph.

Yes, regardless of what people think of my ponderings here, that basic
distinction is well worth making. Not meaning to muddy the waters,

Dan



> Pat
>
>>
>> cheers,
>>
>> Dan
>>
>
> ------------------------------------------------------------
> IHMC                                     (850)434 8903 or (650)494 3973
> 40 South Alcaniz St.           (850)202 4416   office
> Pensacola                            (850)202 4440   fax
> FL 32502                              (850)291 0667   mobile
> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>
Received on Wednesday, 23 February 2011 21:54:43 UTC