- From: Kingsley Idehen <kidehen@openlinksw.com>
- Date: Thu, 02 Oct 2014 19:02:03 -0400
- To: Jürgen Jakobitsch <j.jakobitsch@semantic-web.at>
- CC: public-lod@w3.org
- Message-ID: <542DD96B.7080009@openlinksw.com>
On 10/2/14 6:19 PM, Jürgen Jakobitsch wrote:
> ok - i guess i should come up with an example :
>
> what i want to achieve is for example that people can rewrite part of
> a dataset and be able to get their version of the complete dataset.

Okay.

>
> i.e. (java code)
>
> i clone a whole repository, change one single line in one java file
> and still be able to compile the whole project.
>
> i.e. (rdf code)
>
> master data (in graph http://graphs.net/master) (a flat list)
>
> <http://s.org/a> <http://p.net/label> "europe" .
> <http://s.org/b> <http://p.net/label> "central europe" .
> <http://s.org/c> <http://p.net/label> "austria" .
> <http://s.org/d> <http://p.net/label> "carinthia" .
> <http://s.org/e> <http://p.net/label> "klagenfurt" .
> <http://s.org/f> <http://p.net/label> "st.martin" .

Nanotation [1] markers for generating sample data from this post, if required further on in the discussion.

## Nanotation Start ##
</document1>
<#europe> <#label> "europe" .
<#centralEurope> <#label> "central europe" .
<#austria> <#label> "austria" .
<#carinthia> <#label> "carinthia" .
<#klagenfurt> <#label> "klagenfurt" .
<#stMarting> <#label> "st.martin" .
## Nanotation End ##

>
> person A (in graph http://graphs.net/persons/a) (= a branch with a
> hierarchy)
> (note : person A is at time T1 not an expert and doesn't know about
> "carinthia" being an austrian state)
>
> <http://s.org/a> skos:narrower <http://s.org/b> .
> <http://s.org/b> skos:narrower <http://s.org/c> .
> <http://s.org/c> skos:narrower <http://s.org/e> .
>

## Nanotation Start ##
</document2>
<#europe> skos:narrower <#uk>.
<#centralEurope> skos:narrower <#bulgaria> .
<#austria> skos:narrower <#vienna> .
## Nanotation End ##

> person B (in graph http://graphs.net/persons/b) (= a branch with a
> [better] hierarchy)
> (note : person B is an expert on austrian geography and knows about
> "carinthia" being an austrian state)
>
> <http://s.org/a> skos:narrower <http://s.org/b> .
> <http://s.org/b> skos:narrower <http://s.org/c> .
> <http://s.org/c> skos:narrower <http://s.org/d> .
> <http://s.org/d> skos:narrower <http://s.org/e> .
>

## Nanotation Start ##
## Vienna and Carinthia conflict
</document3>
<#austria> skos:narrower <#carinthia> .
## Nanotation End ##

> what happened becomes clear when you take one step back and realize that
> all the relations (skos:narrower) have been duplicated.
>
> now say person C is a senior expert on the municipalities and boroughs
> in the city of "klagenfurt".
> person C agrees with the graph from person B but wants to extend it.
> in this simple example person => could <=
> simply add triples in http://graphs.net/persons/c beginning with
> <http://s.org/e> skos:narrower <http://s.org/f> .
>
> and i could do a
>
> SELECT
> FROM <http://graphs.net/master>
> FROM <http://graphs.net/persons/b>
> FROM <http://graphs.net/persons/c>
>
> to get complete and happy result.

SELECT *
FROM </document1>
FROM </document2>
FROM </document3>
WHERE {
  ?s ?p ?o .
  FILTER ( !( ?s = <#austria> && ?p = skos:narrower && ?o = <#vienna> ) )
}

OR

## Using NOT FROM extension we've implemented
SELECT *
NOT FROM </document2>
WHERE { ?s ?p ?o . }

There are other options.

>
> now, besides copying triples like
> <http://s.org/a> skos:narrower <http://s.org/b> .
> <http://s.org/b> skos:narrower <http://s.org/c> .
> this example works when appending to the end of the hierarchy.
>
> what you cannot simply do is for example replace a triple in a branch
> (graph)

But you can filter out a named graph. Of course there's more: I could even generate live data from the Nanotations embedded in this post, but that's a last resort. I have a similar example of triples created via nanotation-laced tweets that might demonstrate this shuffling in and out of named graphs used in a SPARQL processing pipeline [2][3][4][5][6][7].
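To make the named-graph shuffling concrete without a live endpoint, here is a minimal Python sketch. It is only an illustration: plain sets stand in for a quad store, the data is the Nanotation sample from this post, and the helper names (`select_from`, `select_not_from`, `without_triple`) are hypothetical, not part of any SPARQL engine or of Virtuoso.

```python
# Sketch only: plain Python sets stand in for a quad store; this is not a SPARQL engine.
# Helper names below are hypothetical and exist only for this illustration.

LABEL = "<#label>"
NARROWER = "skos:narrower"

# Named graphs keyed by document IRI, filled with the Nanotation sample data.
GRAPHS = {
    "</document1>": {
        ("<#europe>", LABEL, "europe"),
        ("<#centralEurope>", LABEL, "central europe"),
        ("<#austria>", LABEL, "austria"),
        ("<#carinthia>", LABEL, "carinthia"),
        ("<#klagenfurt>", LABEL, "klagenfurt"),
        ("<#stMarting>", LABEL, "st.martin"),
    },
    "</document2>": {
        ("<#europe>", NARROWER, "<#uk>"),
        ("<#centralEurope>", NARROWER, "<#bulgaria>"),
        ("<#austria>", NARROWER, "<#vienna>"),
    },
    "</document3>": {
        ("<#austria>", NARROWER, "<#carinthia>"),
    },
}


def select_from(graphs, *names):
    """Merge the chosen named graphs, roughly like SELECT * FROM <g1> FROM <g2> ..."""
    merged = set()
    for name in names:
        merged |= graphs[name]
    return merged


def select_not_from(graphs, *excluded):
    """Merge every graph except the excluded ones (a rough analogue of NOT FROM)."""
    return select_from(graphs, *(name for name in graphs if name not in excluded))


def without_triple(triples, unwanted):
    """Drop one specific triple from a merged result (a rough analogue of a FILTER)."""
    return {t for t in triples if t != unwanted}


if __name__ == "__main__":
    everything = select_from(GRAPHS, "</document1>", "</document2>", "</document3>")
    vienna = ("<#austria>", NARROWER, "<#vienna>")
    for triple in sorted(without_triple(everything, vienna)):
        print(*triple, ".")
```

The point of the sketch is that nobody copies triples: each contributor's statements stay in their own graph, and the "version" of the dataset you see is decided at query time by which graphs you include, exclude, or filter.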
>
> say person D agrees with person B mostly, only "Central Europe" is no
> political entity and therefore doesn't have to do anything in the
> hierarchy.
>
> person D could actually only copy the graph and adjust the triples
> accordingly (but that is again copying)
>
>
> now this copying i don't like.
>
> let's come back to the initial example of a biological classification.
> i just triplified the catalogoflife.org <http://catalogoflife.org>
> downloadable dataset and currently have 1775844 entities and with a
> couple of different opinions from
> a couple of different scientists this soon goes into billions of triples.
>
> ;-) i still should think about how to express the problem that i see but
> i need to start somewhere and writing such things down really helps
> sometimes..
>
> wkr j
>
>

Hopefully, this illustrates your fundamental quest?

Links:

[1] http://bit.ly/blog-post-about-nanotation
[2] http://linkeddata.uriburner.com/c/9GDYGU3 -- Everything
[3] http://linkeddata.uriburner.com/fct/rdfdesc/usage.vsp?g=https%3A%2F%2Ftwitter.com%2Fhashtag%2FNoSilo%23this -- all the named graphs contributing to the SPARQL solution behind this page
[4] http://linkeddata.uriburner.com/c/9CJLOKIL -- same page with a specific named graph (internal document DB id/name) designated as the data source
[5] http://linkeddata.uriburner.com/fct/rdfdesc/usage.vsp?g=https%3A%2F%2Ftwitter.com%2Fhashtag%2FNoSilo%23this -- shows the designated named graph data source (hatched in the UI)
[6] http://linkeddata.uriburner.com/fct/rdfdesc/usage.vsp?g=https%3A%2F%2Ftwitter.com%2Fhashtag%2FNoSilo%23this -- two named graphs specifically designated as data sources
[7] http://linkeddata.uriburner.com/c/9CT5GRUZ -- effect of the two named graphs specifically designated as data sources.
Kingsley

> 2014-10-02 23:42 GMT+02:00 Kingsley Idehen <kidehen@openlinksw.com>:
>
> On 10/2/14 4:02 PM, Jürgen Jakobitsch wrote:
>> hi,
>>
>> when trying to classify the animals on pictures from a recent
>> trip to eastern indonesia
>> meticulously, i realized that it is very hard if not impossible to
>> branch datasets with ease.
>> while this might sound ignorable at first sight it might as well
>> be the reason for the giant global graph to develop a culture of
>> duplicating and linking with the end effect of being very close
>> to where we came from (many sql databases).
>>
>> what i mean will hopefully become clear with a simple example :
>>
>> the "manta birostris" (giant oceanic manta ray) is classified
>>
>> here wikipedia.org <http://wikipedia.org> as
>> Kingdom: Animalia
>> Phylum: Chordata
>> Class: Chondrichthyes
>> Subclass: Elasmobranchii
>> Order: Myliobatiformes
>> Suborder: Myliobatidae
>> Family: Mobulidae
>> Genus: Manta
>> Species: Manta birostris
>>
>> here http://www.catalogueoflife.org/col/browse/tree/id/18879368 as
>> Kingdom: Animalia
>> Phylum: Chordata
>> Class: Elasmobranchii
>> Order: Myliobatiformes
>> Family: Myliobatidae
>> Genus: Manta
>> Species: Manta birostris
>>
>> here http://www.marinespecies.org/aphia.php?p=browser&id=105755#ct as
>> Kingdom: Animalia
>> Phylum: Chordata
>> Subphylum: Vertebrata
>> Superclass: Gnathostomata
>> Superclass Pisces (Unreviewed)
>> Class: Elasmobranchii (Unreviewed)
>> Subclass: Neoselachii (Unreviewed)
>> Infraclass: Batoidea (Unreviewed)
>> Order: Rajiformes
>> Family: Myliobatidae (Unreviewed)
>> Subfamily: Mobulinae
>> Genus: Manta
>> Species: Manta birostris
>>
>> here http://data.gbif.org/species/2419163/ as
>> Kingdom: Animalia
>> Phylum: Chordata
>> Class: Elasmobranchii
>> Order: Myliobatiformes
>> Family: Myliobatidae
>> Genus: Manta
>> Species: Manta birostris
>>
>> if only in theory we would triplify all these datasets and link
>> them it still would be very hard
>> to find out what different
>> people think about the actually same being.
>>
>> now:
>>
>> my thinking was to create a flat list of uris for => all <= these
>> classifications and create branches (graphs) with the
>> hierarchies. but it is not as simple as it sounds because i
>> cannot make the sparql engine follow a branch at certain uris and
>> then rejoin the master graph again by whatever means.
>
> You mean that you can't de-reference a SPARQL query pattern
> variable as part of a SPARQL query processing pipeline?
>
>> neither can i do such things on data level.
>
> If the data is in 5-star Linked Open Data form you have the data
> network in place. Then it's about a SPARQL query that crawls the
> data-network. Ultimately, each entity description document SHOULD
> end up being an internal triples/quad store document identifier
> (a/k/a named graph IRI).
>
> Naturally, what I describe above is how Virtuoso will behave if
> you include input:grab pragmas in your SPARQL.
>>
>> i was thinking about like so [1] on a triple (quad) level.
>>
>> questions:
>>
>> 1. is the problem described so that it is at least
>> semi-understandable (or should i come up with some triples as
>> example)
>
> I think so, but not 100% certain :)
>
>> 2. has this problem already been dealt with and i was only
>> missing that day (please provide a link)
>
> Sorta, in some other conversations about LOD cloud crawling and
> SPARQL.
>
>> 3. has this problem already been solved and i was only missing
>> that day (please provide a link)
>> 4. do you think it is worth dealing with
>> (i personally think so [think: scaling cooperation ])
>> 5.
>> would it be of enough interest to create a wg
>>
>> any pointers and thoughts highly appreciated
>> wkr turnguard
>>
>

--
Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this
Received on Thursday, 2 October 2014 23:02:28 UTC