dataset branching (a la git)

hi,

when trying to meticulously classify the animals in pictures from a recent
trip to eastern indonesia, i realized that it is very hard, if not impossible,
to branch datasets with ease.
while this might sound negligible at first sight, it might well be the
reason for the giant global graph to develop a culture of duplicating and
linking, with the end effect of being very close to where we came from (many
sql databases).

what i mean will hopefully become clear with a simple example :

the "manta birostris" (giant oceanic manta ray) is classified

here wikipedia.org as
Kingdom: Animalia
Phylum: Chordata
Class: Chondrichthyes
Subclass: Elasmobranchii
Order: Myliobatiformes
Suborder: Myliobatidae
Family: Mobulidae
Genus: Manta
Species: Manta birostris

here http://www.catalogueoflife.org/col/browse/tree/id/18879368 as
Kingdom: Animalia
Phylum: Chordata
Class: Elasmobranchii
Order: Myliobatiformes
Family: Myliobatidae
Genus: Manta
Species: Manta birostris

here http://www.marinespecies.org/aphia.php?p=browser&id=105755#ct as
Kingdom: Animalia
Phylum: Chordata
Subphylum: Vertebrata
Superclass: Gnathostomata
Superclass: Pisces (Unreviewed)
Class: Elasmobranchii (Unreviewed)
Subclass: Neoselachii (Unreviewed)
Infraclass: Batoidea (Unreviewed)
Order: Rajiformes
Family: Myliobatidae (Unreviewed)
Subfamily: Mobulinae
Genus: Manta
Species: Manta birostris

here http://data.gbif.org/species/2419163/ as
Kingdom: Animalia
Phylum: Chordata
Class: Elasmobranchii
Order: Myliobatiformes
Family: Myliobatidae
Genus: Manta
Species: Manta birostris

even if, in theory, we triplified all these datasets and linked them, it
would still be very hard to find out what different people think about what
is actually the same being.
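
to make "what different people think" a bit more concrete, here is a minimal python sketch (plain dicts, no triple store) of the comparison i would like to be easy. the taxon names are copied from the four listings above, trimmed to the ranks all four share; the source labels and the function name are made up for illustration.

```python
# classifications as listed above, trimmed to the ranks all four sources share
# (the source keys are just illustrative labels)
CLASSIFICATIONS = {
    "wikipedia": {"Class": "Chondrichthyes", "Order": "Myliobatiformes",
                  "Family": "Mobulidae", "Genus": "Manta"},
    "catalogueoflife": {"Class": "Elasmobranchii", "Order": "Myliobatiformes",
                        "Family": "Myliobatidae", "Genus": "Manta"},
    "marinespecies": {"Class": "Elasmobranchii", "Order": "Rajiformes",
                      "Family": "Myliobatidae", "Genus": "Manta"},
    "gbif": {"Class": "Elasmobranchii", "Order": "Myliobatiformes",
             "Family": "Myliobatidae", "Genus": "Manta"},
}


def disagreements(classifications):
    """Return {rank: {taxon: [sources]}} for every rank where sources differ."""
    by_rank = {}
    for source, ranks in classifications.items():
        for rank, taxon in ranks.items():
            by_rank.setdefault(rank, {}).setdefault(taxon, []).append(source)
    # keep only the ranks with more than one competing taxon
    return {rank: taxa for rank, taxa in by_rank.items() if len(taxa) > 1}


if __name__ == "__main__":
    for rank, taxa in disagreements(CLASSIFICATIONS).items():
        print(rank, taxa)
```

with the data above, class, order and family all show up as disputed, while genus does not.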

now:

my thinking was to create a flat list of uris for => all <= these
classifications and create branches (graphs) with the hierarchies. but it
is not as simple as it sounds, because i cannot make the sparql engine
follow a branch at certain uris and then rejoin the master graph again by
whatever means. neither can i do such things on the data level.

i was thinking about something like [1], on a triple (quad) level.
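
a rough python sketch of the behaviour i have in mind, in case it helps: quads are plain tuples, and a lookup follows the branch graph wherever the branch says something about a uri and falls back to (rejoins) the master graph everywhere else. this is only an illustration under my own assumptions, not something a sparql engine does today: the graph names, the "parent" predicate and both function names are hypothetical, plain names stand in for uris, and the branch is a simplified version of the marinespecies.org hierarchy above.

```python
MASTER = "urn:example:graph:master"  # hypothetical graph names
WORMS = "urn:example:graph:worms"

# (subject, predicate, object, graph); plain names stand in for uris
QUADS = [
    # master: the hierarchy as given by catalogueoflife / gbif
    ("Manta birostris", "parent", "Manta", MASTER),
    ("Manta", "parent", "Myliobatidae", MASTER),
    ("Myliobatidae", "parent", "Myliobatiformes", MASTER),
    ("Myliobatiformes", "parent", "Elasmobranchii", MASTER),
    ("Elasmobranchii", "parent", "Chordata", MASTER),
    ("Chordata", "parent", "Animalia", MASTER),
    # branch: only the statements that differ (simplified marinespecies view)
    ("Myliobatidae", "parent", "Rajiformes", WORMS),
    ("Rajiformes", "parent", "Batoidea", WORMS),
    ("Batoidea", "parent", "Neoselachii", WORMS),
    ("Neoselachii", "parent", "Elasmobranchii", WORMS),
]


def resolve(quads, subject, predicate, branch, master=MASTER):
    """Objects for (subject, predicate): the branch wins if it says anything,
    otherwise fall back to the master graph."""
    in_branch = [o for s, p, o, g in quads
                 if (s, p, g) == (subject, predicate, branch)]
    if in_branch:
        return in_branch
    return [o for s, p, o, g in quads
            if (s, p, g) == (subject, predicate, master)]


def lineage(quads, start, branch, predicate="parent"):
    """Walk up the hierarchy from start, diverging into the branch where it
    has statements and rejoining master where it has none."""
    path, seen, current = [start], {start}, start
    while True:
        parents = resolve(quads, current, predicate, branch)
        if not parents or parents[0] in seen:  # top reached, or a cycle
            return path
        current = parents[0]
        seen.add(current)
        path.append(current)


if __name__ == "__main__":
    print(lineage(QUADS, "Manta birostris", MASTER))
    print(lineage(QUADS, "Manta birostris", WORMS))
```

the point of the overlay is that the branch only stores what differs; everything below myliobatidae and above elasmobranchii is still read from master, so nothing has to be duplicated.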

questions:

1. is the problem described so that it is at least semi-understandable (or
should i come up with some triples as an example)?
2. has this problem already been dealt with and i was only missing that day?
(please provide a link)
3. has this problem already been solved and i was only missing that day?
(please provide a link)
4. do you think it is worth dealing with?
    (i personally think so [think: scaling cooperation])
5. would it be of enough interest to create a wg?

any pointers and thoughts highly appreciated
wkr turnguard


[1] http://nvie.com/posts/a-successful-git-branching-model/

| Jürgen Jakobitsch,
| Software Developer
| Semantic Web Company GmbH
| Mariahilfer Straße 70 / Neubaugasse 1, Top 8
| A - 1070 Wien, Austria
| Mob +43 676 62 12 710 | Fax +43.1.402 12 35 - 22

COMPANY INFORMATION
| web       : http://www.semantic-web.at/
| foaf      : http://company.semantic-web.at/person/juergen_jakobitsch
PERSONAL INFORMATION
| web       : http://www.turnguard.com
| foaf      : http://www.turnguard.com/turnguard
| g+        : https://plus.google.com/111233759991616358206/posts
| skype     : jakobitsch-punkt
| xmlns:tg  = "http://www.turnguard.com/turnguard#"

Received on Thursday, 2 October 2014 20:03:15 UTC