Re: Data modelling from Erik.Wilde@emc.com on 2012-06-19 (public-ldp-wg@w3.org from June 2012)

From: <Erik.Wilde@emc.com>
Date: Tue, 19 Jun 2012 15:45:01 -0400
To: <andy.seaborne@epimorphics.com>, <public-ldp-wg@w3.org>
Message-ID: <CC061CE7.7143%erik.wilde@emc.com>
hello andy.

On 2012-06-19 3:54 , "Andy Seaborne" <andy.seaborne@epimorphics.com> wrote:
>Interesting point about SOA. I think the submission is already in this
>place though because there is a vocabulary for the BPRs and especially
>the BPCs.  By using dcterms:modified or dcterms:title, we need to be
>clear what is the subject resource.

yes, absolutely agreed. when looking at atom as a pattern for this, the
subject is the entry, which is a representation (as maintained by the
collection owner) of the described thing. in atompub, for some cases, the
idea of a "media resource" has been added, because in that case, the
described thing may be PDF or GIF or MP3, and then the idea is that the
entry is the protocol-level description of the thing, and the thing itself
also is accessible as potentially something different. if that happens (in
atompub), you may get both "edit" and "edit-media" as link relations in an
entry, because you can either update the description (edit), or update the
described media resource (edit-media). this duality has been built into
atompub so that there can be a protocol that allows interactions with both
the description and the described resource.
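
as a rough illustration of that duality, here is a minimal python sketch of a client pulling both link relations out of a media link entry. the entry itself, the example.org URLs, and the helper name edit_links are made up for this example; only the atom namespace and the 'edit' / 'edit-media' relation names come from atom/atompub.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# hypothetical media link entry; all example.org URLs are assumptions
ENTRY_XML = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
  <title>bridge photo</title>
  <updated>2012-06-19T15:45:01Z</updated>
  <content type="image/gif" src="http://example.org/media/bridge.gif"/>
  <link rel="edit" href="http://example.org/entries/bridge.atom"/>
  <link rel="edit-media" href="http://example.org/media/bridge.gif"/>
</entry>"""


def edit_links(entry_xml):
    """Return the 'edit' and 'edit-media' hrefs found in an atom entry."""
    entry = ET.fromstring(entry_xml)
    links = {}
    for link in entry.findall(ATOM_NS + "link"):
        rel = link.get("rel", "alternate")  # atom defaults a missing rel to 'alternate'
        if rel in ("edit", "edit-media"):
            links[rel] = link.get("href")
    return links


links = edit_links(ENTRY_XML)
# links["edit"] is where the description gets updated,
# links["edit-media"] is where the described media resource gets updated
```

a client that finds only 'edit' is dealing with a plain entry; one that finds both can choose which of the two resources to update.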

neither atom nor atompub has a standardized way for an entry to relate to
something outside of the protocol realm, and
http://tools.ietf.org/html/draft-wilde-describes-link-00 could be one way
of doing this. in the atom/atompub design space, it would just be one more
possible link relation clients could understand (and based on the current
wording of 'describes', it is defined as an identifier and not as a link
that has to be dereferenceable).

>In geospatial databases, at least in ones I've looked at, there is
>conflation of the record about the thing and the thing itself, indeed
>the language is usually to avoid the real world item at all.  It works
>because there is also an assumption that each physical object modelled
>has one and only one record about it.  The fact that some fields of the
>record are about the metadata and some the object itself does not matter
>(much).

this boils down to the whole httprange-14 question, and let's hope we're
not running into some requirement that will mean we have to solve the
issue. personally and philosophically, i tend to think that claiming
you're talking about "a thing itself" instead of "just a description" is
not something you can do; you're just pushing things into a different
layer. identity is established by convention, and as you say, as long as
these systems handle things in a way that works (for most users, at
least), that convention seems to be good enough.

>But this has limitations when you consider linking across two databases.
>  How can an application take an identifier for the "Bridge of Sighs"
>[1] in one geo DB (say, of global places) and use it to link to another
>(say, UK specific).  How can it also say that X thinks it is in Venice
>and Y thinks it is Cambridge [2] or Oxford [3] (which isn't even called
>the Bridge of Sighs!).
>[1] http://en.wikipedia.org/wiki/Bridge_of_Sighs
>[2] http://en.wikipedia.org/wiki/Bridge_of_Sighs_%28Cambridge%29
>[3] http://en.wikipedia.org/wiki/Bridge_of_Sighs_%28Oxford%29

you're right that they cannot do that until they agree on a convention on
a shared identifier space. if they do that, 'describes' would be one way
of doing it, allowing relationships that "connect" at the identities of
described resources. you then join graphs by matching URIs of 'describes'
targets.
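
a minimal python sketch of that join, assuming both databases already agreed to use the wikipedia URIs from your example as the shared identifier space. the record URIs under global.example and uk.example and the function name join_on_describes are hypothetical; each pair stands for "this record 'describes' that target".

```python
from collections import defaultdict

# hypothetical (record URI, 'describes' target URI) pairs from two services
GLOBAL_DB = [
    ("http://global.example/places/1",
     "http://en.wikipedia.org/wiki/Bridge_of_Sighs"),
    ("http://global.example/places/2",
     "http://en.wikipedia.org/wiki/Bridge_of_Sighs_%28Cambridge%29"),
]
UK_DB = [
    ("http://uk.example/bridges/7",
     "http://en.wikipedia.org/wiki/Bridge_of_Sighs_%28Cambridge%29"),
    ("http://uk.example/bridges/8",
     "http://en.wikipedia.org/wiki/Bridge_of_Sighs_%28Oxford%29"),
]


def join_on_describes(*sources):
    """Group record URIs by 'describes' target; keep targets with several records."""
    by_target = defaultdict(list)
    for source in sources:
        for record_uri, described_uri in source:
            # targets are compared as plain strings, never dereferenced
            by_target[described_uri].append(record_uri)
    return {target: records
            for target, records in by_target.items()
            if len(records) > 1}


shared = join_on_describes(GLOBAL_DB, UK_DB)
# only the cambridge bridge is described by a record in both databases
```

the records themselves keep their own URIs, so X and Y can still disagree in what they say; the join only states that both are talking about the same identified thing.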

>"atom:updated" recognizes the issue that the time may be the entry or
>the feed.  They are closely linked.  But in the wider use of an LDP,
>e.g. the stocks example of the submission, the record about the resource
>and the resource itself are not so closely linked.

linkage between a described/represented resource and how it shows up in a
service (the service surface of the description you can get) is entirely
up to the service; i don't think that the platform itself should make any
claims about that.

>There are many ways to address this - there may well be others as well
>(my list got longer as I wrote it!)
>1/ specifically defined predicates, so that we have ":recordModified" and
>":resourceModified".  This feels like it will get messy; it does
>preclude reusing other vocabularies not designed for this.

this indeed might get messy. i am not quite clear what keeps us from
simply not claiming to talk about "the things themselves" to start with.
if a service decides that it wants to refer to "a thing itself",
'describes' or similar mechanisms can be used. what the platform should do
(in my opinion) is to provide patterns for how clients interact with a
service.

>2/ weak predicates, like the atom approach ":changed".  Atom is in a
>slightly different position in that the feed and the entry are more
>closely linked but if we are federating then I'm not sure this will stand up.

you refer to "updated" here, right? one of the main design goals of atom
was to allow feeds to be aggregated, filtered, processed, and otherwise
pipelined in a multi-layered scenario, and the main vehicles for this have
been entry identity and time stamps. this may not be what's required in all
cases, but i think it holds up fairly well in federated scenarios. but i
suspect we might have different ideas about what "federated" means.

>3/ having two URIs - one for the record, one for the resource. They can
>be in the same message body.

that's what atompub does for media resources, because there has to be
interaction support for them ('edit-media' links), and what could be done
more generally (without interaction support, just based on identity) with
a 'describes' relation, if publishers want to do it.

cheers,

dret.
Received on Tuesday, 19 June 2012 19:45:42 UTC