- From: Kingsley Idehen <kidehen@openlinksw.com>
- Date: Thu, 04 Jul 2013 12:45:46 -0400
- To: public-lod@w3.org
- Message-ID: <51D5A6BA.7060704@openlinksw.com>
On 7/4/13 11:49 AM, Olivier Berger wrote:
> Hi.
>
> I hope such "design pattern" questions on consuming Linked Open Data are
> OT... otherwise, please suggest an appropriate venue for questions ;)
>
> I'm trying to figure out potential patterns for designing an application
> /consuming/ Linked Data, typically using SPARQL over a local Virtuoso
> triple store which was fed with harvested Linked Open Data.
>
> I happen to find resources sometimes identified with http, sometimes
> with https, which otherwise reference the same URL. Other issues may be
> the use or not of a trailing slash for dir-like URLs.
>
> For instance, I'd like to match as "identical" two doap:Projects resources
> which have "same" doap:homepage if I can match
> http://project1/example.com/home/ and https://project1/example.com/home/
>
> It may happen that a document is rendered the same by the publishing
> service, whichever way it is accessed, so I'd like to consider that
> referencing it via URIs which contain htpp:// or https:// is equivalent.
>
> Or a service may have chosen to adopt https:// as a canonical URI for
> instance, but it may happen that users reference it via http somewhere
> else...
>
> Obviously, direct matching of the same ?h URIRef won't work
> in basic SPARQL queries like :
> PREFIX doap: <http://usefulinc.com/ns/doap#>
>
> SELECT *
> {
>   GRAPH <htpp://myapp.example.com/graphs?source=http://publisher1.example.com/> {
>     ?dp doap:homepage ?h.
>     ?dp doap:name ?dn
>   }
>   GRAPH <htpp://myapp.example.com/graphs?source=https://publisher2.example.com/> {
>     ?ap doap:homepage ?h.
>     ?ap doap:name ?an
>   }
> }
>
> I can think of a sort of Regexp matching on the string after '://' but I
> doubt to get good performance ;-)
>
> Is there a way to create indexes over some URIs, or owl:sameAs relations to
> manage such URI matching in queries ? Or am I left to "normalizing" my
> URLs in the harvested data before storing them in the triple store ?
>
> Would you think there's a reasonably standard approach... or one that
> would work with Virtuoso 6.1.3 ? ;)
>
> I imagine that this is a kinda FAQ for consuming Linked (Open)
> Data... but it seems many more people are concerned on publishing than
> on consuming in public discussions ;-)
>
> Thanks in advance.
>
> P.S.: already posted a similar question on
> http://answers.semanticweb.com/questions/23584/matching-ressources-with-variying-url-scheme-http-https

This is an example of what I mean by the *explicit* entity relationship semantics that RDF uniquely brings to the table as an enhancement to the basic EAV/CR model and Linked Data. At this juncture, you are dealing with basic structured data and (at best) *implicit* rather than *explicit* machine- and human-comprehensible entity relationship semantics.

Situation: You have the relation doap:homepage, but its semantics aren't clear to you or your user agent.

Now, let's leverage some basic RDF and Linked Data to look up the semantics of the doap:homepage relation. We find:

1. http://lod.openlinksw.com/describe/?url=http%3A%2F%2Fusefulinc.com%2Fns%2Fdoap%23homepage&graph=http%3A%2F%2Fschemapedia.com%2Fschemas%2F -- it's an owl:InverseFunctionalProperty (IFP).
2. http://lod.openlinksw.com/describe/?url=http%3A%2F%2Fwww.w3.org%2F2002%2F07%2Fowl%23InverseFunctionalProperty&graph=http%3A%2F%2Fschemapedia.com%2Fschemas%2F -- the description of owl:InverseFunctionalProperty itself (a little sparse).

Using relationship semantics ("reasoning" and "inference"), an RDF processor can determine that the subjects of doap:homepage relations with the same homepage value share a common referent, irrespective of how those subjects are denoted/named. I also posted an IFP exploitation example using SPARQL a while back [1].
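To make the IFP reasoning above concrete, here is a minimal sketch in plain Python rather than in a triple store. It is not the example from [1]: the function name `ifp_groups`, the tuple representation of triples, and the publisher and project IRIs are all hypothetical. It also assumes both sources state the identical homepage IRI, since the inference keys on equal values.

```python
from collections import defaultdict

# Property IRI taken from the thread; doap:homepage is declared inverse functional.
DOAP_HOMEPAGE = "http://usefulinc.com/ns/doap#homepage"


def ifp_groups(triples, ifp=DOAP_HOMEPAGE):
    """Return sets of subjects that share a value for the given IFP."""
    parent = {}  # union-find forest over subject IRIs

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_subject = {}  # IFP value -> first subject seen with that value
    for s, p, o in triples:
        if p != ifp:
            continue
        root = find(s)
        if o in first_subject:
            # Same IFP value, so both subjects denote the same thing.
            parent[root] = find(first_subject[o])
        else:
            first_subject[o] = s

    groups = defaultdict(set)
    for s in list(parent):
        groups[find(s)].add(s)
    # Only report subjects that were actually merged with another one.
    return [g for g in groups.values() if len(g) > 1]


# Hypothetical sample data: two harvested descriptions of one project,
# plus an unrelated project that must stay separate.
TRIPLES = [
    ("http://publisher1.example.com/projects/p1", DOAP_HOMEPAGE,
     "http://project1.example.com/home/"),
    ("https://publisher2.example.com/projects/p1", DOAP_HOMEPAGE,
     "http://project1.example.com/home/"),
    ("http://publisher1.example.com/projects/p2", DOAP_HOMEPAGE,
     "http://project2.example.com/home/"),
]

if __name__ == "__main__":
    print(ifp_groups(TRIPLES))
```

Each returned set is what a reasoner would treat as one entity with several names, the equivalent of asserting owl:sameAs between its members.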
Conclusion: just leverage RDF semantics, forget about regexing anything, and you have a first-class demonstration of what RDF actually adds to Linked Data :-)

Links:

[1] http://bit.ly/Y6TIfs -- Using SPARQL to Integrate Disparate Data via InverseFunctionalProperty (IFP) relations.

--
Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Received on Thursday, 4 July 2013 16:46:09 UTC