Re: owl:sameAs [recipe]

From: Pat Hayes <phayes@ihmc.us>
Date: Wed, 29 Jul 2009 11:28:46 -0500
Cc: Hugh Glaser <hg@ecs.soton.ac.uk>, Eric Lease Morgan <eric_morgan@infomotions.com>, "public-lod@w3.org" <public-lod@w3.org>
Message-Id: <83CA2094-A09E-4CD5-B804-72B02F03B4F9@ihmc.us>
To: Kingsley Idehen <kidehen@openlinksw.com>

On Jul 29, 2009, at 9:41 AM, Kingsley Idehen wrote:

> Hugh Glaser wrote:
>>
>> On 29/07/2009 12:35, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:
>>
>>
>>> Hugh Glaser wrote:
>>>
>>>> On 28/07/2009 14:46, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:
>>>>
>>>>
>>>>> Hugh Glaser wrote:
>>>>>
>>>>>> Good stuff.
>>>>>> However, I don't think that Named Graphs are the answer.
>>>>>> I get my Linked Data by resolving URIs over http.
>>>>>> If I ask your Linked Data Space (I hope that is the right use of your
>>>>>> terminology) for something like
>>>>>> curl -H "Accept: application/rdf+xml" http://dbpedia.org/resource/London
>>>>>> and follow the redirect, don't I still get the non-Wikipedia data with
>>>>>> the Wikipedia data?
>>>>>> Or am I not understanding something?
>>>>>>
>>>>>>
>>>>> The link chain shouldn't be broken. Named Graphs should have zero
>>>>> impact on HTTP URIs.
>>>>>
>>>> That is what I thought.
>>>> So how is the linkage data kept separate when I do URI resolution?
>>>> Cheers
>>>>
>>> Hugh,
>>>
>>> The Linkage isn't what needs separating.
>>>
>>> It's a data set made up 100% of entity-to-entity link triples
>>> (i.e., a linkset or linkbase) that needs separating (as good
>>> practice) from the main KB. Remember, there are times when the
>>> main KB and the source of cross-links to external entities are
>>> produced by separate parties. Thus, the linksets end up in their own
>>> Named Graphs, purely for organization and maintenance. This kind of
>>> partitioning allows the use of SPARUL scoped to Named Graphs when
>>> fixing triple statement errors (e.g. owl:sameAs triples), for instance.
>>>
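A rough Python sketch of that kind of graph-scoped fix, for what it's worth. It only stands in for a SPARUL delete against one named graph; the graph IRIs, the `delete_from_graph` helper, and the premise that the second link is wrong are all assumptions made up for illustration, not anything Virtuoso or DBpedia provides.

```python
from typing import Dict, Set, Tuple

# A triple is (subject, predicate, object); plain IRI strings keep the sketch simple.
Triple = Tuple[str, str, str]
# A dataset maps a graph IRI to the set of triples held in that named graph.
Dataset = Dict[str, Set[Triple]]

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DCTERMS_TITLE = "http://purl.org/dc/terms/title"
KB_GRAPH = "http://example.org/graphs/kb"          # hypothetical graph IRI
LINKS_GRAPH = "http://example.org/graphs/linkset"  # hypothetical graph IRI


def delete_from_graph(dataset: Dataset, graph_iri: str, triple: Triple) -> int:
    """Remove one triple from a single named graph; other graphs are untouched."""
    graph = dataset.get(graph_iri, set())
    if triple in graph:
        graph.remove(triple)
        return 1
    return 0


title_triple = ("http://infomotions.com/etexts/id/more-utopia-221",
                DCTERMS_TITLE, "Utopia")
good_link = ("http://infomotions.com/etexts/id/more-utopia-221",
             OWL_SAME_AS, "http://dbpedia.org/resource/Utopia_(book)")
# Assumption for the example only: pretend this link turned out to be wrong.
bad_link = ("http://infomotions.com/etexts/id/more-reality-404",
            OWL_SAME_AS, "http://dbpedia.org/resource/Reality_(book)")

dataset: Dataset = {
    KB_GRAPH: {title_triple},
    LINKS_GRAPH: {good_link, bad_link},
}

# The fix is scoped to the linkset graph, so the main KB cannot be damaged by it.
removed = delete_from_graph(dataset, LINKS_GRAPH, bad_link)
```

The point of the partitioning shows up in the last line: whoever maintains the linkset can correct it without ever touching the graph that holds the substantive data.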
>> That's great - I think you agree with me. :-)
>>
> Yes.
>> As I think when you say Named Graph you mean a different URI scheme
>> for linkage information.
>>
> No, I mean a Named Collection of Triples :-)

An RDF graph *is* a set of triples, and a set *is* a collection, so...
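
To make that concrete, here is a minimal Python sketch, assuming plain strings for IRIs. The `NamedGraph` type and the example graph name are hypothetical, not part of any RDF library: the graph is literally a set of triples, and naming it just pairs that set with an IRI.

```python
from typing import FrozenSet, NamedTuple, Tuple

# A triple is (subject, predicate, object); plain IRI strings keep the sketch simple.
Triple = Tuple[str, str, str]


class NamedGraph(NamedTuple):
    name: str                   # IRI naming the graph (hypothetical value below)
    triples: FrozenSet[Triple]  # the graph itself: just a set of triples


OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

link = ("http://infomotions.com/etexts/authors/resource/thomas-more",
        OWL_SAME_AS,
        "http://dbpedia.org/resource/Thomas_More")

# Stating the same triple twice changes nothing: a set has no duplicates.
graph = frozenset([link, link])

# A "named collection of triples" is the same set with a name attached.
linkset = NamedGraph("http://example.org/graphs/linkset", graph)
print(len(linkset.triples))  # prints 1
```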

Pat Hayes

>
> Kingsley
>> Cheers
>> Hugh
>>
>>> Kingsley
>>>
>>>
>>>
>>>
>>>>> I think Alan is saying: put what is best described as a linkbase
>>>>> dump in a separate Named Graph. Doing this shouldn't break the
>>>>> tapestry inherent in the HTTP URIs (the data conductors). We have
>>>>> tons of data in <http://lod.openlinksw.com> partitioned across named
>>>>> graphs, and none of that breaks the "follow-your-nose" pattern.
>>>>> Remember, I am a stickler for keeping the HTTP URIs of entities in
>>>>> full scope of user agents :-)
>>>>>
>>>>> The only time you might have an issue is when performing SPARQL
>>>>> queries, where explicitly identifying the Named Graph in the FROM
>>>>> clause may aid performance (and even here this depends on the
>>>>> indexing in place re. the RDF DBMS instance; these days, re.
>>>>> Virtuoso, that doesn't even matter since the default indexing scheme
>>>>> has been changed).
>>>>>
>>>>> Kingsley
>>>>>
>>>>>> Best
>>>>>> Hugh
>>>>>>
>>>>>>
>>>>>> On 28/07/2009 11:17, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:
>>>>>>
>>>>>> Hugh Glaser wrote:
>>>>>>
>>>>>>
>>>>>>> For the record (  Alan!).
>>>>>>> I consider it bad practice to keep the knowledge about linking in the
>>>>>>> same KB as the substantive knowledge you are representing.
>>>>>>> You need two KBs: one for the knowledge you are publishing, and one
>>>>>>> for the linkage you are working on.
>>>>>>> These have very different provenance, maintenance patterns, etc.
>>>>>>> And you can include a link from URIs that you generate to the linkage KB.
>>>>>>>
>>>>>>>
>>>>>> For terminology consolidation purposes, what you call a KB is a
>>>>>> "Linked Data Space" in my parlance :-)
>>>>>>
>>>>>> Yes, the partitioning suggested above is very important. Thus, you
>>>>>> need purpose-specific Linked Data Spaces (hosting many Named Graphs)
>>>>>> if you seek to make things a little clearer to data consumers and
>>>>>> their agents.
>>>>>>
>>>>>>
>>>>>>> In fact, this would then help Alan's problem about sameAs: he could
>>>>>>> simply decide not to get your view of the linkage, whereas with
>>>>>>> sameAs in the resources he has no choice but to accept your view, and
>>>>>>> even your predicate, when he resolves a URI or queries the SPARQL
>>>>>>> endpoint.
>>>>>>>
>>>>>>> And I do agree with you about minting URIs for your local stuff,
>>>>>>> including authors; it is error-prone to try to re-use things like
>>>>>>> dbpedia for this, on any scale. And this is why you need to tackle the
>>>>>>> linkage problem as a separate engineering activity.
>>>>>>>
>>>>>>> Best
>>>>>>> Hugh
>>>>>>>
>>>>>>> (Of course I do have some software and architecture that supports
>>>>>>> separate linkage KBs (our CRS) so I would say this, but nevertheless
>>>>>>> I think it is the correct engineering approach, however it is done.
>>>>>>> Separation of Concerns.)
>>>>>>>
>>>>>>>
>>>>>> Note, we've partitioned DBpedia in such a way that you now have a
>>>>>> Graph IRI for each data set within this particular Linked Data Space.
>>>>>>
>>>>>> Kingsley
>>>>>>
>>>>>>
>>>>>>> On 28/07/2009 02:23, "Eric Lease Morgan" <eric_morgan@infomotions.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Jul 25, 2009, at 5:09 AM, Bill Roberts wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Regarding linking to external resources, what it seems you want
>>>>>>>> to do is to identify the dc:creator of the book, hence say that
>>>>>>>> the creator is the person whose name was Thomas More. You could
>>>>>>>> create your own URI and if you are managing a whole bunch of data
>>>>>>>> about books and authors, then there could be reasons to do that,
>>>>>>>> but in general if there is a satisfactory existing URI, it is
>>>>>>>> preferable to use it. Dbpedia seems to have become the de facto
>>>>>>>> standard...
>>>>>>>>
>>>>>>>>
>>>>>>> Okay, then how's this for a recipe to create rich linked data of
>>>>>>> electronic books and authors within my own site as well as to the
>>>>>>> outside world:
>>>>>>>
>>>>>>>   1. Mint URIs pointing to representations of local etexts
>>>>>>>   2. Mint URIs pointing to representations of authors of local etexts
>>>>>>>
>>>>>>>   3. In resources of etexts, include owl:sameAs links to DBpedia resources
>>>>>>>   4. In resources of etexts, point to local URIs of authors
>>>>>>>
>>>>>>>   5. In resources of authors, include owl:sameAs links to DBpedia resources
>>>>>>>   6. In resources of authors, include owl:creatorOf links to local etexts
>>>>>>>
>>>>>>>   7. For extra credit, do the same thing for subjects/keywords
>>>>>>>
>>>>>>> For example, the following resource descriptions:
>>>>>>>
>>>>>>> <!-- etext #1; points to local author and remote title -->
>>>>>>> <rdf:RDF
>>>>>>>   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>>   xmlns:dcterms="http://purl.org/dc/terms/"
>>>>>>>   xmlns:owl="http://www.w3.org/2002/07/owl#">
>>>>>>>   <rdf:Description
>>>>>>>     rdf:about="http://infomotions.com/etexts/id/more-utopia-221">
>>>>>>>     <!-- child element, not attribute: a property attribute would yield a literal -->
>>>>>>>     <owl:sameAs rdf:resource="http://dbpedia.org/resource/Utopia_(book)" />
>>>>>>>     <dcterms:title>Utopia</dcterms:title>
>>>>>>>     <dcterms:creator
>>>>>>>       rdf:resource="http://infomotions.com/etexts/authors/resource/thomas-more" />
>>>>>>>   </rdf:Description>
>>>>>>> </rdf:RDF>
>>>>>>>
>>>>>>>
>>>>>>> <!-- etext #2; points to local author and remote title -->
>>>>>>> <rdf:RDF
>>>>>>>   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>>   xmlns:dcterms="http://purl.org/dc/terms/"
>>>>>>>   xmlns:owl="http://www.w3.org/2002/07/owl#">
>>>>>>>   <rdf:Description
>>>>>>>     rdf:about="http://infomotions.com/etexts/id/more-reality-404">
>>>>>>>     <!-- child element, not attribute: a property attribute would yield a literal -->
>>>>>>>     <owl:sameAs rdf:resource="http://dbpedia.org/resource/Reality_(book)" />
>>>>>>>     <dcterms:title>Reality</dcterms:title>
>>>>>>>     <dcterms:creator
>>>>>>>       rdf:resource="http://infomotions.com/etexts/authors/resource/thomas-more" />
>>>>>>>   </rdf:Description>
>>>>>>> </rdf:RDF>
>>>>>>>
>>>>>>>
>>>>>>> <!-- author; points to local etexts and remote author -->
>>>>>>> <rdf:RDF
>>>>>>>   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>>   xmlns:owl="http://www.w3.org/2002/07/owl#">
>>>>>>>   <rdf:Description
>>>>>>>     rdf:about="http://infomotions.com/etexts/authors/resource/thomas-more">
>>>>>>>     <!-- child element, not attribute: a property attribute would yield a literal -->
>>>>>>>     <owl:sameAs rdf:resource="http://dbpedia.org/resource/Thomas_More" />
>>>>>>>     <!-- caution: the OWL vocabulary defines no creatorOf term -->
>>>>>>>     <owl:creatorOf
>>>>>>>       rdf:resource="http://infomotions.com/etexts/id/more-utopia-221" />
>>>>>>>     <owl:creatorOf
>>>>>>>       rdf:resource="http://infomotions.com/etexts/id/more-reality-404" />
>>>>>>>   </rdf:Description>
>>>>>>> </rdf:RDF>
>>>>>>>
>>>>>>> --
>>>>>>> Eric Lease Morgan
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> --
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
>>>>>> President & CEO
>>>>>> OpenLink Software     Web: http://www.openlinksw.com
>>>>>>

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Wednesday, 29 July 2009 16:29:39 UTC
