Re: owl:sameAs [recipe] from Hugh Glaser on 2009-07-29 (public-lod@w3.org from July 2009)

From: Hugh Glaser <hg@ecs.soton.ac.uk>
Date: Wed, 29 Jul 2009 12:44:14 +0100
To: Kingsley Idehen <kidehen@openlinksw.com>
CC: Eric Lease Morgan <eric_morgan@infomotions.com>, "public-lod@w3.org" <public-lod@w3.org>
Message-ID: <EMEW3|5bdc9abed5d82c1c8e0c7bb4d0b5bc21l6SCiP02hg|ecs.soton.ac.uk|C873%hg@ecs.so>
On 29/07/2009 12:35, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:

> Hugh Glaser wrote:
>> On 28/07/2009 14:46, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:
>> 
>>  
>>> Hugh Glaser wrote:
>>>    
>>>> Good stuff.
>>>> However, I don't think that Named Graphs are the answer.
>>>> I get my Linked Data by resolving URIs over http.
>>>> If I ask your Linked Data Space (I hope that is the right use of your
>>>> terminology) for something like
>>>> curl -H "Accept: application/rdf+xml" http://dbpedia.org/resource/London
>>>> and follow the redirect don't I still get the non-wikipedia data with the
>>>> wikipedia data?
>>>> Or am I not understanding something?
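For anyone following along, here is a minimal sketch of the same request as the curl command above, using Python's standard library. It only builds the request object and sends nothing; the helper name `rdf_request` is hypothetical.

```python
from urllib.request import Request

def rdf_request(uri: str) -> Request:
    """Build a GET request that asks for RDF/XML via content negotiation."""
    return Request(uri, headers={"Accept": "application/rdf+xml"})

req = rdf_request("http://dbpedia.org/resource/London")

# Passing `req` to urllib.request.urlopen would send it; urlopen follows
# HTTP redirects by default, much like adding -L to the curl command.
print(req.get_header("Accept"))
```

Whatever document comes back after the redirect is what the question above is about: the client asked for a URI, not for a particular graph.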
>>>> 
>>>>      
>>> The link chain shouldn't be broken. Named Graphs should have zero impact
>>> on HTTP URIs.
>>>    
>> That is what I thought.
>> So how is the linkage data kept separate when I do URI resolution?
>> Cheers
>>  
> Hugh,
> 
> The Linkage isn't what needs separating.
> 
> It's when you make a data set that is 100% entity-to-entity link
> triples (i.e., a linkset or linkbase) that it needs separating (as good
> practice) from the main KB. Remember, there are times when the main KB
> and the source of cross links to external entities are produced by
> separate parties. Thus, the linksets end up in their own Named Graphs,
> purely for organization and maintenance. This kind of partitioning
> allows the use of SPARUL scoped to Named Graphs when fixing triple
> statement errors (e.g. owl:sameAs triples), for instance.
That's great - I think you agree with me. :-)
I think that when you say Named Graph you mean a different URI scheme for
linkage information.
Cheers
Hugh
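
A rough sketch of the kind of graph-scoped fix Kingsley describes, for illustration only. It builds an update string that removes one owl:sameAs triple from one Named Graph and leaves every other graph alone. Assumptions: it uses SPARQL 1.1 Update syntax (the standardized successor to SPARUL) rather than the 2009 SPARUL syntax, the graph IRI http://example.org/graph/linkset is made up, and the function name is hypothetical.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

def delete_link_update(graph_iri: str, subject: str, obj: str) -> str:
    """Build an update that deletes one owl:sameAs triple from one named graph."""
    return (
        f"DELETE DATA {{ GRAPH <{graph_iri}> {{ "
        f"<{subject}> <{OWL_SAME_AS}> <{obj}> . }} }}"
    )

update = delete_link_update(
    "http://example.org/graph/linkset",  # hypothetical linkset graph IRI
    "http://infomotions.com/etexts/authors/resource/thomas-more",
    "http://dbpedia.org/resource/Thomas_More",
)
print(update)
```

The string would still have to be sent to a store's update endpoint; that part depends on the store and is left out.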
> 
> Kingsley
> 
> 
> 
>>> I think Alan is saying: put what is best described as a linkbase dump in
>>> a separate Named Graph. Doing this shouldn't break the tapestry inherent
>>> in the HTTP URIs (the data conductors). We have tons of data in
>>> <http://lod.openlinksw.com> partitioned across named graphs, and none of
>>> that breaks the "follow-your-nose" pattern. Remember, I am a stickler
>>> for keeping the HTTP URIs of entities in full scope of user agents :-)
>>> 
>>> The only time you might have an issue is when performing SPARQL queries,
>>> where explicitly identifying the Named Graph in the FROM Clause may aid
>>> performance (and even here this depends on the indexing in place in
>>> the RDF DBMS instance; these days, re. Virtuoso, that doesn't even
>>> matter since the default indexing scheme has been changed).
>>> 
>>> Kingsley
>>>    
>>>> Best
>>>> Hugh
>>>> 
>>>> 
>>>> On 28/07/2009 11:17, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:
>>>> 
>>>> Hugh Glaser wrote:
>>>> 
>>>>      
>>>>> For the record ( © Alan!).
>>>>> I consider it bad practice to keep the knowledge about linking in the same
>>>>> KB as the substantive knowledge you are representing.
>>>>> You need two KBs: one for the knowledge you are publishing, and one for
>>>>> the linkage you are working on.
>>>>> These have very different provenance, maintenance patterns, etc.
>>>>> And you can include a link from URIs that you generate to the linkage KB.
>>>>> 
>>>>>   
>>>>>        
>>>> For terminology consolidation purposes, what you call a KB is a
>>>> "Linked Data Space" in my parlance :-)
>>>> 
>>>> Yes, the partitioning suggested above is very important. Thus, you need
>>>> purpose-specific Linked Data Spaces (hosting many Named Graphs) if you
>>>> seek to make things a little clearer to data consumers and their agents.
>>>> 
>>>>      
>>>>> In fact, this would then help Alan's problem about sameAs: he could
>>>>> simply decide not to get your view of the linkage, whereas with sameAs
>>>>> in the resources he has no choice but to accept your view, and even
>>>>> your predicate, when he resolves a URI or queries the SPARQL endpoint.
>>>>> 
>>>>> And I do agree with you about minting URIs to your local stuff, including
>>>>> authors; it is error-prone to try to re-use things like dbpedia for this,
>>>>> on any scale. And this is why you need to tackle the linkage problem as a
>>>>> separate engineering activity.
>>>>> 
>>>>> Best
>>>>> Hugh
>>>>> 
>>>>> (Of course I do have some software and architecture that supports separate
>>>>> linkage KBs (our CRS) so I would say this, but nevertheless I think it is
>>>>> the correct engineering approach, however it is done. Separation of
>>>>> Concerns.)
>>>>> 
>>>>>   
>>>>>        
>>>> Note, we've partitioned DBpedia in such a way that you now have a Graph
>>>> IRI for each data set within this particular Linked Data Space.
>>>> 
>>>> Kingsley
>>>> 
>>>>      
>>>>> On 28/07/2009 02:23, "Eric Lease Morgan" <eric_morgan@infomotions.com>
>>>>> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> On Jul 25, 2009, at 5:09 AM, Bill Roberts wrote:
>>>>> 
>>>>> 
>>>>>   
>>>>>        
>>>>>> Regarding linking to external resources, what it seems you want
>>>>>> to do is to identify the dc:creator of the book, hence say that
>>>>>> the creator is the person whose name was Thomas More. You could
>>>>>> create your own URI and if you are managing a whole bunch of data
>>>>>> about books and authors, then there could be reasons to do that,
>>>>>> but in general if there is a satisfactory existing URI, it is
>>>>>> preferable to use it. Dbpedia seems to have become the de facto
>>>>>> standard...
>>>>>> 
>>>>>>     
>>>>>>          
>>>>> Okay, then how's this for a recipe to create rich linked data of
>>>>> electronic books and authors within my own site as well as to the
>>>>> outside world:
>>>>> 
>>>>>    1. Mint URIs pointing to representations of local etexts
>>>>>    2. Mint URIs pointing to representations of authors of local etexts
>>>>> 
>>>>>    3. In resources of etexts, include owl:sameAs links to DBpedia
>>>>>       resources
>>>>>    4. In resources of etexts, point to local URIs of authors
>>>>> 
>>>>>    5. In resources of authors, include owl:sameAs links to DBpedia
>>>>>       resources
>>>>>    6. In resources of authors, include owl:creatorOf links to local
>>>>>       etexts
>>>>> 
>>>>>    7. For extra credit, do the same thing for subjects/keywords
>>>>> 
>>>>> For example, the following resource descriptions:
>>>>> 
>>>>> <!-- etext #1; points to local author and remote title -->
>>>>> <rdf:RDF
>>>>>    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>    xmlns:dcterms="http://purl.org/dc/terms/"
>>>>>    xmlns:owl="http://www.w3.org/2002/07/owl#">
>>>>>    <rdf:Description
>>>>>      rdf:about="http://infomotions.com/etexts/id/more-utopia-221">
>>>>>      <!-- a property attribute would yield a literal, so the link is a child element -->
>>>>>      <owl:sameAs rdf:resource="http://dbpedia.org/resource/Utopia_(book)" />
>>>>>      <dcterms:title>Utopia</dcterms:title>
>>>>>      <dcterms:creator
>>>>>        rdf:resource="http://infomotions.com/etexts/authors/resource/thomas-more" />
>>>>>    </rdf:Description>
>>>>> </rdf:RDF>
>>>>> 
>>>>> 
>>>>> <!-- etext #2; points to local author and remote title -->
>>>>> <rdf:RDF
>>>>>    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>    xmlns:dcterms="http://purl.org/dc/terms/"
>>>>>    xmlns:owl="http://www.w3.org/2002/07/owl#">
>>>>>    <rdf:Description
>>>>>      rdf:about="http://infomotions.com/etexts/id/more-reality-404">
>>>>>      <!-- a property attribute would yield a literal, so the link is a child element -->
>>>>>      <owl:sameAs rdf:resource="http://dbpedia.org/resource/Reality_(book)" />
>>>>>      <dcterms:title>Reality</dcterms:title>
>>>>>      <dcterms:creator
>>>>>        rdf:resource="http://infomotions.com/etexts/authors/resource/thomas-more" />
>>>>>    </rdf:Description>
>>>>> </rdf:RDF>
>>>>> 
>>>>> 
>>>>> <!-- author; points to local etexts and remote author -->
>>>>> <rdf:RDF
>>>>>    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>    xmlns:owl="http://www.w3.org/2002/07/owl#">
>>>>>    <rdf:Description
>>>>>      rdf:about="http://infomotions.com/etexts/authors/resource/thomas-more">
>>>>>      <!-- a property attribute would yield a literal, so the link is a child element -->
>>>>>      <owl:sameAs rdf:resource="http://dbpedia.org/resource/Thomas_More" />
>>>>>      <owl:creatorOf
>>>>>        rdf:resource="http://infomotions.com/etexts/id/more-utopia-221" />
>>>>>      <owl:creatorOf
>>>>>        rdf:resource="http://infomotions.com/etexts/id/more-reality-404" />
>>>>>    </rdf:Description>
>>>>> </rdf:RDF>
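
To make the "two KBs" suggestion concrete against this recipe, here is a minimal sketch that splits a handful of the triples above into a substantive set and a linkage set. It is an illustration under assumptions, not anyone's actual implementation: triples are plain Python tuples, literals are bare strings, and the function name `partition` is hypothetical.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DCT = "http://purl.org/dc/terms/"
BASE = "http://infomotions.com/etexts/"

# A few triples taken from the resource descriptions in the recipe.
triples = [
    (BASE + "id/more-utopia-221", DCT + "title", "Utopia"),
    (BASE + "id/more-utopia-221", DCT + "creator",
     BASE + "authors/resource/thomas-more"),
    (BASE + "id/more-utopia-221", OWL_SAME_AS,
     "http://dbpedia.org/resource/Utopia_(book)"),
    (BASE + "authors/resource/thomas-more", OWL_SAME_AS,
     "http://dbpedia.org/resource/Thomas_More"),
]

def partition(triples):
    """Split triples into (substantive, linkage) by whether the predicate is owl:sameAs."""
    linkage = [t for t in triples if t[1] == OWL_SAME_AS]
    substantive = [t for t in triples if t[1] != OWL_SAME_AS]
    return substantive, linkage

substantive, linkage = partition(triples)
print(len(substantive), "substantive,", len(linkage), "linkage")
```

The two lists could then be published separately, whether as separate KBs or as separate Named Graphs, so that a consumer can take the etext data without also taking the publisher's view of the links.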
>>>>> 
>>>>> --
>>>>> Eric Lease Morgan
>>>>> 
>>>> --
>>>> 
>>>> 
>>>> Regards,
>>>> 
>>>> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
>>>> President & CEO
>>>> OpenLink Software     Web: http://www.openlinksw.com
>>>> 
Received on Wednesday, 29 July 2009 11:45:14 UTC