Re: implied datasets

* [2011-05-23 14:46:47 +0100] Leigh Dodds <leigh.dodds@talis.com> wrote:

] I'm not sure that the dataset is "imaginary", but what you're doing
] seems eminently sensible to me. I've been working on a little project
] that I hope to release shortly that aims to facilitate this kind of
] linking, especially where those non-URI identifiers, or Literal Keys
] [1] are used to build patterned URIs.

The thing is, as with Hugh's suggestion, as a curator of datasets I
have little control or influence over how the dataset authors choose
to do this. I have noticed a common pattern though (urn:issn, for
example), and I think encouraging patterns like this is helpful.

] It may be more natural to think of these more as services though than
] datasets. i.e. a service that accepts some keys as input and returns a
] set of assertions. In this case the assertions would be links to other
] datasets.

This is a bit different. I was thinking of an implied dataset that 
would have no links outwards at all. 

] Subsets if they only asserted sameAs links, but I think you're
] suggesting that this may be too strict. I think there's potentially a
] whole set of related "predicate based services" [2] that provide
] useful indexes of existing datasets, or expose additional annotations
] of extra sources.

So this would be a separation of edge-labelled graphs into a bunch
of perhaps more manageable basic (V,E) graphs. An interesting way
of chopping things up.
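
As a rough sketch of that chopping-up (the triples and the names in
them are made up for illustration, not taken from any real dataset),
one can group triples by predicate so that each predicate yields its
own plain edge list:

```python
from collections import defaultdict

def split_by_predicate(triples):
    """Group (subject, predicate, object) triples into one plain
    (V, E) edge list per predicate."""
    graphs = defaultdict(list)
    for s, p, o in triples:
        graphs[p].append((s, o))
    return dict(graphs)

# Hypothetical triples, purely for illustration.
triples = [
    ("ex:a", "owl:sameAs", "ex:b"),
    ("ex:c", "owl:sameAs", "ex:b"),
    ("ex:a", "dc:language", "eng"),
]

by_predicate = split_by_predicate(triples)
for predicate, edges in by_predicate.items():
    print(predicate, edges)
```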

The reason I think sameAs is too strict, aside from people putting
sameAs when they really mean similarTo, can be shown by another
library example. Broadly there seem to be two strategies for
representing things like books, the flat BIBO style and the more
elaborate FRBR/WEMI style. So if I have two datasets, one in each,
I might have something like,

<ds1:flc> a bibo:Book;
  dc:title "The Feynman Lectures on Computation";
  dc:creator [ foaf:name "Richard Feynman" ];
  dc:language "eng";
  owl:sameAs <urn:isbn:0738202967>.

<ds2:flc> a frbr:Manifestation;
  frbr:manifestationOf [
    a frbr:Expression;
    dc:language "en";
    frbr:expressionOf [
       a frbr:Work;
       dc:title "The Feynman Lectures on Computation";
       dc:creator [ foaf:name "Richard Feynman" ]
    ]
  ];
  owl:sameAs <urn:isbn:0738202967>.

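Both the authors have done something prima facie reasonable with the
sameAs but if you actually follow it transitively you get into
trouble: <ds1:flc> and <ds2:flc> are each the same as the ISBN URI,
hence the same as each other, so one resource ends up being both a
bibo:Book and a frbr:Manifestation.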
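
One way to see this, as a minimal sketch only: treat owl:sameAs as an
equivalence relation, compute its classes with a small union-find, and
merge the properties of everything in the same class. The triples are
an abridged, flattened version of the two descriptions above, and the
blank node label _:e is made up for illustration.

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"

# Abridged, flattened version of the two descriptions above.
# "_:e" is a made-up label for the blank-node Expression.
triples = [
    ("ds1:flc", "rdf:type", "bibo:Book"),
    ("ds1:flc", "dc:title", "The Feynman Lectures on Computation"),
    ("ds1:flc", "dc:language", "eng"),
    ("ds1:flc", SAME_AS, "urn:isbn:0738202967"),
    ("ds2:flc", "rdf:type", "frbr:Manifestation"),
    ("ds2:flc", "frbr:manifestationOf", "_:e"),
    ("_:e", "rdf:type", "frbr:Expression"),
    ("_:e", "dc:language", "en"),
    ("ds2:flc", SAME_AS, "urn:isbn:0738202967"),
]

def same_as_classes(triples):
    """Return a find() function mapping each node to the
    representative of its sameAs equivalence class."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)
    return find

def merged_description(triples, node):
    """Collect every property asserted about any node that is
    sameAs-equivalent to the given node."""
    find = same_as_classes(triples)
    root = find(node)
    merged = defaultdict(set)
    for s, p, o in triples:
        if p != SAME_AS and find(s) == root:
            merged[p].add(o)
    return dict(merged)

merged = merged_description(triples, "ds1:flc")
print(sorted(merged["rdf:type"]))
```

The merged resource carries both types, with the title and language
attached to it directly as well as a link to an Expression that has
its own, differently coded, language.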

This also goes to what Glenn was saying. These datasets are obviously
related in a meaningful way, there may well be useful ways for someone
who studies them to draw links between them but it isn't as simple as
saying they both have things of the same type. In fact what type
assertions are appropriate to clarify the relationship between these
datasets is the type of analysis that I would want to facilitate, not
try to do up front. What I can say is they both have references (that
may or may not be strictly believable) to this funny
non-dereferenceable URI (or equivalently, string literal of a certain
kind).
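
A minimal sketch of what I mean by the implied dataset, assuming each
dataset has already been reduced to (dataset, identifier) pairs; the
function name, the pair format and the example.org reference are all
hypothetical. It records only which datasets mention which identifier
and asserts nothing about types or sameness:

```python
from collections import defaultdict

def implied_dataset(references, prefix="urn:isbn:"):
    """Map each identifier starting with the given prefix to the set
    of datasets that refer to it. No outward links are produced."""
    index = defaultdict(set)
    for dataset, identifier in references:
        if identifier.startswith(prefix):
            index[identifier].add(dataset)
    return dict(index)

# (dataset, identifier) pairs; the example.org entry is a placeholder.
references = [
    ("ds1", "urn:isbn:0738202967"),
    ("ds2", "urn:isbn:0738202967"),
    ("ds2", "http://example.org/something-else"),
]

shared = implied_dataset(references)
print(shared)
```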

Cheers,
-w

-- 
William Waites                <mailto:ww@styx.org>
http://river.styx.org/ww/        <sip:ww@styx.org>
F4B3 39BF E775 CF42 0BAB  3DF0 BE40 A6DF B06F FD45

Received on Monday, 23 May 2011 21:29:23 UTC