- From: Kingsley Idehen <kidehen@openlinksw.com>
- Date: Thu, 02 Oct 2014 19:02:03 -0400
- To: Jürgen Jakobitsch <j.jakobitsch@semantic-web.at>
- CC: public-lod@w3.org
- Message-ID: <542DD96B.7080009@openlinksw.com>
On 10/2/14 6:19 PM, Jürgen Jakobitsch wrote:
> ok - i guess i should come up with an example :
>
> what i want to achieve is for example that people can rewrite part of
> a dataset and be able to get their version of the complete dataset.
Okay.
>
> i.e. (java code)
>
> i clone a whole repository, change one single line in one java file
> and still be able to compile the whole project.
>
> i.e. (rdf code)
>
> master data (in graph http://graphs.net/master) (a flat list)
>
> <http://s.org/a> <http://p.net/label> "europe" .
> <http://s.org/b> <http://p.net/label> "central europe" .
> <http://s.org/c> <http://p.net/label> "austria" .
> <http://s.org/d> <http://p.net/label> "carinthia" .
> <http://s.org/e> <http://p.net/label> "klagenfurt" .
> <http://s.org/f> <http://p.net/label> "st.martin" .
Here are Nanotation [1] markers for generating sample data from this post, if
required later in the discussion.
## Nanotation Start ##
</document1>
<#europe> <#label> "europe" .
<#centralEurope> <#label> "central europe" .
<#austria> <#label> "austria" .
<#carinthia> <#label> "carinthia" .
<#klagenfurt> <#label> "klagenfurt" .
<#stMartin> <#label> "st.martin" .
## Nanotation End ##
>
> person A (in graph http://graphs.net/persons/a) (= a branch with a
> hierarchy)
> (note : person A is at time T1 not an expert and doesn't know about
> "carinthia" being an austrian state)
>
> <http://s.org/a> skos:narrower <http://s.org/b> .
> <http://s.org/b> skos:narrower <http://s.org/c> .
> <http://s.org/c> skos:narrower <http://s.org/e> .
>
## Nanotation Start ##
</document2>
<#europe> skos:narrower <#uk>.
<#centralEurope> skos:narrower <#bulgaria> .
<#austria> skos:narrower <#vienna> .
## Nanotation End ##
> person B (in graph http://graphs.net/persons/b) (= a branch with a
> [better] hierarchy)
> (note : person B is an expert on austrian geography and knows about
> "carinthia" being an austrian state)
>
> <http://s.org/a> skos:narrower <http://s.org/b> .
> <http://s.org/b> skos:narrower <http://s.org/c> .
> <http://s.org/c> skos:narrower <http://s.org/d> .
> <http://s.org/d> skos:narrower <http://s.org/e> .
>
## Nanotation Start ##
## Vienna and Carinthia conflict
</document3>
<#austria> skos:narrower <#carinthia> .
## Nanotation End ##
> what happened becomes clear when you take one step back and realize that
> all the relations (skos:narrower) have been duplicated.
>
> now say person C is a senior expert on the municipalities and boroughs
> in the city of "klagenfurt".
> person C agrees with the graph from person B but wants to extend it.
> in this simple example person C => could <=
> simply add triples in http://graphs.net/persons/c beginning with
> <http://s.org/e> skos:narrower <http://s.org/f> .
>
> and i could do a
>
> SELECT
> FROM <http://graphs.net/master>
> FROM <http://graphs.net/persons/b>
> FROM <http://graphs.net/persons/c>
>
> to get a complete and happy result.
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT *
FROM </document1>
FROM </document2>
FROM </document3>
WHERE { ?s ?p ?o .
FILTER ( !( ?s = <#austria> && ?p = skos:narrower && ?o = <#vienna> ) )
}
OR
## Using NOT FROM extension we've implemented
SELECT *
NOT FROM </document2>
WHERE { ?s ?p ?o . }
There are other options.
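For anyone who wants to try the two options above without a SPARQL engine, here is a minimal Python sketch. It is only an illustration under stated assumptions: each named graph is modelled as a plain set of (subject, predicate, object) tuples, the graph names and triples mirror the Nanotation snippets in this post, and the merge() helper is hypothetical, not part of any library or of Virtuoso. The "exclude" argument merely imitates the effect of the NOT FROM extension.

```python
# Hypothetical stand-ins for the three Nanotation documents in this post.
SKOS_NARROWER = "skos:narrower"
LABEL = "#label"

DATASET = {
    "/document1": {
        ("#europe", LABEL, "europe"),
        ("#centralEurope", LABEL, "central europe"),
        ("#austria", LABEL, "austria"),
        ("#carinthia", LABEL, "carinthia"),
        ("#klagenfurt", LABEL, "klagenfurt"),
        ("#stMartin", LABEL, "st.martin"),
    },
    "/document2": {
        ("#europe", SKOS_NARROWER, "#uk"),
        ("#centralEurope", SKOS_NARROWER, "#bulgaria"),
        ("#austria", SKOS_NARROWER, "#vienna"),
    },
    "/document3": {
        ("#austria", SKOS_NARROWER, "#carinthia"),
    },
}


def merge(dataset, include=None, exclude=()):
    """Union the triples of the chosen named graphs into one default graph."""
    names = dataset.keys() if include is None else include
    merged = set()
    for name in names:
        if name not in exclude:
            merged |= dataset[name]
    return merged


# Option 1: merge all three graphs, then drop the one conflicting triple.
conflict = ("#austria", SKOS_NARROWER, "#vienna")
option1 = merge(DATASET) - {conflict}

# Option 2: leave out the whole of /document2 (the NOT FROM idea).
option2 = merge(DATASET, exclude={"/document2"})
```

Option 1 keeps the other /document2 triples and removes only the Vienna statement, while option 2 drops everything that graph contributed. Which one fits depends on whether the disagreement is with a single statement or with the whole branch.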
>
> now, besides copying triples like
> <http://s.org/a> skos:narrower <http://s.org/b> .
> <http://s.org/b> skos:narrower <http://s.org/c> .
> this example works when appending to the end of the hierarchy.
>
> what you cannot simply do is for example replace a triple in a branch
> (graph)
But you can filter out a named graph.
Of course there's more. I could even generate live data from the
Nanotations embedded in this post, but that's a last resort. I have a
similar example of triples created via nanotation-laced tweets that might
demonstrate this shuffling in and out of named graphs used in a SPARQL
processing pipeline [2][3][4][5][6][7].
>
> say person D agrees with person B mostly, only "Central Europe" is no
> political entity and therefore has no place in the
> hierarchy.
>
> person D could actually only copy the graph and adjust the triples
> accordingly (but that is again copying)
>
>
> now this copying i don't like.
>
> let's come back to the initial example of a biological classification.
> i just triplified the catalogoflife.org
> downloadable dataset and currently have 1775844 entities, and with a
> couple of different opinions from
> a couple of different scientists this soon goes into billions of triples.
>
> ;-) i still should think about how to express the problem that i see but
> i need to start somewhere and writing such things down really helps
> sometimes..
>
> wkr j
>
>
Hopefully, this addresses your fundamental question?
Links:
[1] http://bit.ly/blog-post-about-nanotation
[2] http://linkeddata.uriburner.com/c/9GDYGU3 -- Everything
[3]
http://linkeddata.uriburner.com/fct/rdfdesc/usage.vsp?g=https%3A%2F%2Ftwitter.com%2Fhashtag%2FNoSilo%23this
-- all the named graphs contributing to the SPARQL solution behind this page
[4] http://linkeddata.uriburner.com/c/9CJLOKIL -- same page with a
specific named graph (internal document DB id/name) designated as the
data source
[5]
http://linkeddata.uriburner.com/fct/rdfdesc/usage.vsp?g=https%3A%2F%2Ftwitter.com%2Fhashtag%2FNoSilo%23this
-- shows the designated named graph data source (hatched in the UI)
[6]
http://linkeddata.uriburner.com/fct/rdfdesc/usage.vsp?g=https%3A%2F%2Ftwitter.com%2Fhashtag%2FNoSilo%23this
-- two named graphs specifically designated as data sources
[7] http://linkeddata.uriburner.com/c/9CT5GRUZ -- effect of the two
named graphs specifically designated as data sources .
Kingsley
> 2014-10-02 23:42 GMT+02:00 Kingsley Idehen <kidehen@openlinksw.com>:
>
> On 10/2/14 4:02 PM, Jürgen Jakobitsch wrote:
>> hi,
>>
>> when trying to classify the animals on pictures from a recent
>> trip to eastern indonesia
>> meticulously, i realized that it is very hard if not impossible to
>> branch datasets with ease.
>> while this might sound ignorable at first sight it might as well
>> be the reason for the giant global graph to develop a culture of
>> duplicating and linking with the end effect of being very close
>> to where we came from (many sql databases).
>>
>> what i mean will hopefully become clear with a simple example :
>>
>> the "manta birostris" (giant oceanic manta ray) is classified
>>
>> here wikipedia.org as
>> Kingdom:Animalia
>> Phylum:Chordata
>> Class:Chondrichthyes
>> Subclass:Elasmobranchii
>> Order:Myliobatiformes
>> Suborder:Myliobatidae
>> Family:Mobulidae
>> Genus:Manta
>> Species:Manta birostris
>>
>> here http://www.catalogueoflife.org/col/browse/tree/id/18879368 as
>> Kingdom: Animalia
>> Phylum: Chordata
>> Class: Elasmobranchii
>> Order: Myliobatiformes
>> Family: Myliobatidae
>> Genus: Manta
>> Species: Manta birostris
>>
>> here http://www.marinespecies.org/aphia.php?p=browser&id=105755#ct as
>> Kingdom: Animalia
>> Phylum: Chordata
>> Subphylum: Vertebrata
>> Superclass: Gnathostomata
>> Superclass Pisces (Unreviewed)
>> Class: Elasmobranchii (Unreviewed)
>> Subclass: Neoselachii (Unreviewed)
>> Infraclass: Batoidea (Unreviewed)
>> Order: Rajiformes
>> Family: Myliobatidae (Unreviewed)
>> Subfamily: Mobulinae
>> Genus: Manta
>> Species: Manta birostris
>>
>> here http://data.gbif.org/species/2419163/ as
>> Kingdom: Animalia
>> Phylum: Chordata
>> Class: Elasmobranchii
>> Order: Myliobatiformes
>> Family: Myliobatidae
>> Genus: Manta
>> Species: Manta birostris
>>
>> if only in theory we would triplify all these datasets and link
>> them it still would be very hard to find out what different
>> people think about the actually same being.
>>
>> now:
>>
>> my thinking was to create a flat list of uris for => all <= these
>> classifications and create branches (graphs) with the
>> hierarchies. but it is not as simple as it sounds because i
>> cannot make the sparql engine follow a branch at certain uris and
>> then rejoin the master graph again by whatever means.
>
> You mean that you can't de-reference a SPARQL query pattern
> variable as part of a SPARQL query processing pipeline?
>
>> neither can i do such things on data level.
>
> If the data is in 5-star Linked Open Data form you have the data
> network in place. Then it's about a SPARQL query that crawls the
> data-network. Ultimately, each entity description document SHOULD
> end up being an internal triples/quad store document identifier
> (a/k/a named graph IRI).
>
> Naturally, what I describe above is how Virtuoso will behave if
> you include input:grab pragmas in your SPARQL.
>>
>> i was thinking about like so [1] on a triple (quad) level.
>>
>> questions:
>>
>> 1. is the problem described so that it is at least
>> semi-understandable (or should i come up with some triples as
>> example)
>
> I think so, but not 100% certain :)
>
>> 2. has this problem already been dealt with and i was only
>> missing that day (please provide a link)
>
> Sorta, in some other conversations about LOD cloud crawling and
> SPARQL.
>
>> 3. has this problem already been solved and i was only missing
>> that day (please provide a link)
>> 4. do you think it is worth dealing with
>> (i personally think so [think: scaling cooperation ])
>> 5. would it be of enough interest to create a wg
>>
>> any pointers and thoughts highly appreciated
>> wkr turnguard
>>
>
>
> --
> Regards,
>
> Kingsley Idehen
> Founder & CEO
> OpenLink Software
> Company Web:http://www.openlinksw.com
> Personal Weblog 1:http://kidehen.blogspot.com
> Personal Weblog 2:http://www.openlinksw.com/blog/~kidehen <http://www.openlinksw.com/blog/%7Ekidehen>
> Twitter Profile:https://twitter.com/kidehen
> Google+ Profile:https://plus.google.com/+KingsleyIdehen/about
> LinkedIn Profile:http://www.linkedin.com/in/kidehen
> Personal WebID:http://kingsley.idehen.net/dataspace/person/kidehen#this
>
>
--
Regards,
Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this
Received on Thursday, 2 October 2014 23:02:28 UTC