Re: owl:sameAs [recipe] from Kingsley Idehen on 2009-07-29 (public-lod@w3.org from July 2009)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Wed, 29 Jul 2009 15:41:11 +0100
To: Hugh Glaser <hg@ecs.soton.ac.uk>
CC: Eric Lease Morgan <eric_morgan@infomotions.com>, "public-lod@w3.org" <public-lod@w3.org>
Message-ID: <4A705F87.4010508@openlinksw.com>
Hugh Glaser wrote:
>
> On 29/07/2009 12:35, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:
>
>   
>> Hugh Glaser wrote:
>>     
>>> On 28/07/2009 14:46, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:
>>>
>>>  
>>>       
>>>> Hugh Glaser wrote:
>>>>    
>>>>         
>>>>> Good stuff.
>>>>> However, I don't think that Named Graphs are the answer.
>>>>> I get my Linked Data by resolving URIs over http.
>>>>> If I ask your Linked Data Space (I hope that is the right use of your
>>>>> terminology) for something like
>>>>> curl -H "Accept: application/rdf+xml" http://dbpedia.org/resource/London
>>>>> and follow the redirect don't I still get the non-wikipedia data with the
>>>>> wikipedia data?
>>>>> Or am I not understanding something?
>>>>>
>>>>>      
>>>>>           
>>>> The link chain shouldn't be broken. Named Graphs should have zero impact
>>>> on HTTP URIs.
>>>>    
>>>>         
>>> That is what I thought.
>>> So how is the linkage data kept separate when I do URI resolution?
>>> Cheers
>>>  
>>>       
>> Hugh,
>>
>> The Linkage isn't what needs separating.
>>
>> It's when you make a data set that is 100% entity-to-entity link
>> triples (i.e., a linkset or linkbase) that it needs separating (as good
>> practice) from the main KB.  Remember, there are times when the main KB
>> and the source of cross links to external entities are produced by
>> separate parties. Thus, the linksets end up in their own Named Graphs,
>> purely for organization and maintenance. This kind of partitioning
>> allows the use of SPARUL scoped to Named Graphs when fixing triple
>> statement errors (e.g. owl:sameAs triples), for instance.
>>     
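To make the "scoped fix" point concrete, here is a rough sketch in plain
Python rather than SPARUL. It is only a toy model: the store is a dict
keyed by Graph IRI, the example.org IRIs and the helper name
replace_in_graph are made up for illustration, and the misspelled DBpedia
URI is deliberately wrong so there is something to repair. A real RDF
DBMS would do this with an update statement scoped to the graph.

```python
# Toy quad store (an assumption for illustration): graph IRI -> set of triples.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
KB_GRAPH = "http://example.org/graph/kb"            # hypothetical main KB graph
LINKSET_GRAPH = "http://example.org/graph/linkset"  # hypothetical linkset graph
LONDON = "http://example.org/id/london"             # hypothetical local entity URI


def replace_in_graph(store, graph_iri, old_triple, new_triple):
    """Swap one triple for another inside a single named graph only."""
    graph = store[graph_iri]
    if old_triple not in graph:
        return False  # nothing to fix in this graph; other graphs are not consulted
    graph.remove(old_triple)
    graph.add(new_triple)
    return True


store = {
    KB_GRAPH: {(LONDON, "http://purl.org/dc/terms/title", "London")},
    # Deliberately misspelled target, standing in for a bad owl:sameAs link.
    LINKSET_GRAPH: {(LONDON, OWL_SAME_AS, "http://dbpedia.org/resource/Londn")},
}

bad = (LONDON, OWL_SAME_AS, "http://dbpedia.org/resource/Londn")
good = (LONDON, OWL_SAME_AS, "http://dbpedia.org/resource/London")

# The repair touches the linkset graph only; the main KB graph is left alone.
fixed = replace_in_graph(store, LINKSET_GRAPH, bad, good)
```

The design point is simply that the party maintaining the linkset can
correct it without ever touching (or needing write access to) the main KB.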
> That's great - I think you agree with me. :-)
>   
Yes.
> As I think when you say Named Graph you mean a different URI scheme for
> linkage information.
>   
No, I mean a Named Collection of Triples :-)
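
As a rough illustration of what I mean (a sketch only, in plain Python,
with a made-up describe helper and hypothetical example.org IRIs; this is
not how any particular RDF DBMS implements it): each Named Graph is just a
name attached to a set of triples, and looking up an entity URI can still
draw on every graph unless you explicitly scope the request to one of them.

```python
# Toy model (an assumption for illustration): graph IRI -> set of triples.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
KB_GRAPH = "http://example.org/graph/kb"            # hypothetical main KB graph
LINKSET_GRAPH = "http://example.org/graph/linkset"  # hypothetical linkset graph
LONDON = "http://example.org/id/london"             # hypothetical local entity URI

store = {
    KB_GRAPH: {(LONDON, "http://purl.org/dc/terms/title", "London")},
    LINKSET_GRAPH: {(LONDON, OWL_SAME_AS, "http://dbpedia.org/resource/London")},
}


def describe(store, subject, from_graph=None):
    """Return triples about `subject`, optionally restricted to one named graph."""
    graphs = [store[from_graph]] if from_graph else store.values()
    return sorted(t for g in graphs for t in g if t[0] == subject)


# Unscoped lookup: the entity description spans both graphs.
everything = describe(store, LONDON)

# Scoped lookup, loosely analogous to naming a graph in a FROM clause.
kb_only = describe(store, LONDON, from_graph=KB_GRAPH)
```

The partitioning is bookkeeping on the storage side; the entity URI stays
the same whichever graph a given triple happens to live in.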

Kingsley
> Cheers
> Hugh
>   
>> Kingsley
>>
>>
>>
>>     
>>>> I think Alan is saying: put what is best described as a linkbase dump in
>>>> a separate Named Graph. Doing this shouldn't break the tapestry inherent
>>>> in the HTTP URIs (the data  conductors). We have tons of data in
>>>> <http://lod.openlinksw.com> partitioned across named graphs, and none of
>>>> that breaks the "follow-your-nose" pattern. Remember, I am a stickler
>>>> for keeping the HTTP URIs of entities in full scope of user agents :-)
>>>>
>>>> The only time you might have an issue is when performing SPARQL, where
>>>> explicitly identifying the Named Graph in the FROM Clause may aid
>>>> performance (and even here this depends on the indexing in place re.
>>>> the RDF DBMS instance; these days, re. Virtuoso, that doesn't even
>>>> matter since the default indexing scheme has been changed).
>>>>
>>>> Kingsley
>>>>    
>>>>         
>>>>> Best
>>>>> Hugh
>>>>>
>>>>>
>>>>> On 28/07/2009 11:17, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:
>>>>>
>>>>> Hugh Glaser wrote:
>>>>>
>>>>>      
>>>>>           
>>>>>> For the record ( © Alan!).
>>>>>> I consider it bad practice to keep the knowledge about linking in the same
>>>>>> KB as the substantive knowledge you are representing.
>>>>>> You need two KBs: one for the knowledge you are publishing, and one for
>>>>>> the linkage you are working on.
>>>>>> These have very different provenance, maintenance patterns, etc.
>>>>>> And you can include a link from URIs that you generate to the linkage KB.
>>>>>>
>>>>>>   
>>>>>>        
>>>>>>             
>>>>> For terminology consolidation purposes, what you call a KB is a
>>>>> "Linked Data Space" in my parlance :-)
>>>>>
>>>>> Yes, the partitioning suggested above is very important. Thus, you need
>>>>> purpose-specific Linked Data Spaces (hosting many Named Graphs) if you
>>>>> seek to make things a little clearer to data consumers and their agents.
>>>>>
>>>>>      
>>>>>           
>>>>>> In fact, this would then help Alan's problem about sameAs: he could
>>>>>> simply decide not to get your view of the linkage, whereas with sameAs
>>>>>> in the resources he has no choice but to accept your view, and even
>>>>>> your predicate, when he resolves a URI or queries the SPARQL.
>>>>>>
>>>>>> And I do agree with you about minting URIs to your local stuff, including
>>>>>> authors; it is error-prone to try to re-use things like dbpedia for this,
>>>>>> on any scale. And this is why you need to tackle the linkage problem as a
>>>>>> separate engineering activity.
>>>>>>
>>>>>> Best
>>>>>> Hugh
>>>>>>
>>>>>> (Of course I do have some software and architecture that supports separate
>>>>>> linkage KBs (our CRS) so I would say this, but nevertheless I think it is
>>>>>> the correct engineering approach, however it is done. Separation of
>>>>>> Concerns.)
>>>>>>
>>>>>>   
>>>>>>        
>>>>>>             
>>>>> Note, we've partitioned DBpedia in such a way that you now have a Graph
>>>>> IRI for each data set within this particular Linked Data Space.
>>>>>
>>>>> Kingsley
>>>>>
>>>>>      
>>>>>           
>>>>>> On 28/07/2009 02:23, "Eric Lease Morgan" <eric_morgan@infomotions.com>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Jul 25, 2009, at 5:09 AM, Bill Roberts wrote:
>>>>>>
>>>>>>
>>>>>>   
>>>>>>        
>>>>>>             
>>>>>>> Regarding linking to external resources, what it seems you want
>>>>>>> to do is to identify the dc:creator of the book, hence say that
>>>>>>> the creator is the person whose name was Thomas More. You could
>>>>>>> create your own URI and if you are managing a whole bunch of data
>>>>>>> about books and authors, then there could be reasons to do that,
>>>>>>> but in general if there is a satisfactory existing URI, it is
>>>>>>> preferable to use it. Dbpedia seems to have become the de facto
>>>>>>> standard...
>>>>>>>
>>>>>>>     
>>>>>>>          
>>>>>>>               
>>>>>> Okay, then how's this for a recipe to create rich linked data of
>>>>>> electronic books and authors within my own site as well as to the
>>>>>> outside world:
>>>>>>
>>>>>>    1. Mint URIs pointing to representations of local etexts
>>>>>>    2. Mint URIs pointing to representations of authors of local etexts
>>>>>>
>>>>>>    3. In resources of etexts, include owl:sameAs links to DBpedia resources
>>>>>>    4. In resources of etexts, point to local URIs of authors
>>>>>>
>>>>>>    5. In resources of authors, include owl:sameAs links to DBpedia resources
>>>>>>    6. In resources of authors, include owl:creatorOf links to local etexts
>>>>>>
>>>>>>    7. For extra credit, do the same thing for subjects/keywords
>>>>>>
>>>>>> For example, the following resource descriptions:
>>>>>>
>>>>>> <!-- etext #1; points to local author and remote title -->
>>>>>> <rdf:RDF
>>>>>>    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>    xmlns:dcterms="http://purl.org/dc/terms/"
>>>>>>    xmlns:owl="http://www.w3.org/2002/07/owl#">
>>>>>>    <rdf:Description
>>>>>>      rdf:about="http://infomotions.com/etexts/id/more-utopia-221">
>>>>>>      <owl:sameAs rdf:resource="http://dbpedia.org/resource/Utopia_(book)" />
>>>>>>      <dcterms:title>Utopia</dcterms:title>
>>>>>>      <dcterms:creator rdf:resource="http://infomotions.com/etexts/authors/resource/thomas-more" />
>>>>>>    </rdf:Description>
>>>>>> </rdf:RDF>
>>>>>>
>>>>>>
>>>>>> <!-- etext #2; points to local author and remote title -->
>>>>>> <rdf:RDF
>>>>>>    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>    xmlns:dcterms="http://purl.org/dc/terms/"
>>>>>>    xmlns:owl="http://www.w3.org/2002/07/owl#">
>>>>>>    <rdf:Description
>>>>>>      rdf:about="http://infomotions.com/etexts/id/more-reality-404">
>>>>>>      <owl:sameAs rdf:resource="http://dbpedia.org/resource/Reality_(book)" />
>>>>>>      <dcterms:title>Reality</dcterms:title>
>>>>>>      <dcterms:creator rdf:resource="http://infomotions.com/etexts/authors/resource/thomas-more" />
>>>>>>    </rdf:Description>
>>>>>> </rdf:RDF>
>>>>>>
>>>>>>
>>>>>> <!-- author; points to local etexts and remote author -->
>>>>>> <rdf:RDF
>>>>>>    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>    xmlns:owl="http://www.w3.org/2002/07/owl#">
>>>>>>    <rdf:Description
>>>>>>      rdf:about="http://infomotions.com/etexts/authors/resource/thomas-more">
>>>>>>      <owl:sameAs rdf:resource="http://dbpedia.org/resource/Thomas_More" />
>>>>>>      <owl:creatorOf rdf:resource="http://infomotions.com/etexts/id/more-utopia-221" />
>>>>>>      <owl:creatorOf rdf:resource="http://infomotions.com/etexts/id/more-reality-404" />
>>>>>>    </rdf:Description>
>>>>>> </rdf:RDF>
>>>>>>
>>>>>> --
>>>>>> Eric Lease Morgan
>>>>> --
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
>>>>> President & CEO
>>>>> OpenLink Software     Web: http://www.openlinksw.com
>>>>>


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com
Received on Wednesday, 29 July 2009 14:41:56 UTC