- From: Hugh Glaser <hg@ecs.soton.ac.uk>
- Date: Tue, 9 Mar 2010 11:37:16 +0000
- To: Bernhard Schandl <bernhard.schandl@univie.ac.at>, Peter Ansell <ansell.peter@gmail.com>
- CC: Kingsley Idehen <kidehen@openlinksw.com>, Aldo Bucchi <aldo.bucchi@gmail.com>, Linked Data community <public-lod@w3.org>
I have found this a very interesting discussion, thinking about the Linked Data World at large as well as what others think - thanks.
Sorry this moved away from the important discussion about how to identify people, both as a technical and a social issue - my fault.

On 09/03/2010 09:12, "Bernhard Schandl" <bernhard.schandl@univie.ac.at> wrote:

> Peter,
>
>> It is a good thing that the subject URI is an HTTP URI available from
>> your server but that is only the start of the story. The rest of the
>> story needs other servers to give your data more context.
>>
>>>> In your example the fact that there
>>>> is a link can only be figured out using some external service that
>>>> knows about both data sources.
>>>
>>> Sure. Before I can add a link to any data set, I have to detect it using
>>> some heuristics. Shared URN/DOI/... identifiers seem a valid approach for
>>> this -- think of ISBN numbers.
>>
>> Sharing identifiers is a good idea, but it isn't Linked Data as yet...
>
> I'm talking of the *preconditions* for linking data, based on shared
> identifiers. And once I have these identifiers, why not publish them alongside
> the dereferenceable URIs.

Being able to work out what a dereferenceable URI means is indeed a pre-condition for linking data, and in Linked Data this is achieved by dereferencing and examining the RDF returned.
And finding a URN, DOI, ISBN, mailto, etc. is a very good way of communicating that information.
However, for me in the Linked Data world, such URIs are no more an *identifier* than "Hugh Glaser", or the title of a book (or even the URL of one of my homepages), simply because the access mechanism is unclear, and even if I do try to look it up I am unlikely to get RDF (at least at present).
They are more useful in general, of course, because they are less likely to be ambiguous, but it is only a matter of degree.

>
>>>> If your server was Linked Data and not
>>>> just an HTTP URI based RDF database then it would link out using HTTP
>>>> URI's and both servers could be directly explored without some
>>>> external service.
>>>
>>> Once the link has been detected, I can of course add it to both data sets.
>>> Well, the owner of the datasets can.
>>
>> This is Linked Data, when the dataset owners discover the mutual
>> references and link out from their HTTP URI's to the other datasets
>> HTTP URI's.
>
> Why only the dataset owners? A third party that is aware of both data sets is
> enabled to discover these links, too.

I agree entirely, although the dataset owner is in a prime position to seed the activity, and also may have other implicit knowledge that is useful to help to get the links right.

>
>> It was enabled by sharing the property, and then having
>> others discover it. Just sharing the URN property isn't Linked Data as
>> people have no way of resolving the URN that is referenced to more
>> information.
>
> Again, it's a precondition to link data.
>
>> It could also have been shared in another way using Inverse Functional
>> Properties (IFP) so that the URN scheme need not have been created.
>
> The URN schema for ISBN already exists [1], and several others exist (e.g.,
> SWIFT [2]), why should we throw them away?
>
> [1] <http://www.faqs.org/rfcs/rfc3187.html>
> [2] <http://www.faqs.org/rfcs/rfc3615.html>
>
>> There is no automatic HTTP based way of knowing which datasets may
>> have relevant links in either case,
>
> One could use indices to find other occurrences of the same URN. When they are
> linked via owl:sameAs, the linking can be fully automatized.
>
>> so serving up the statements on
>> your dataset is very useful for discovery, I wasn't meaning to say
>> that was a bad thing. Just emphasising the full story for Linked Data.
>
> I got that :-)
>
> My point is simply that not *every* URI in a Linked Data context needs to be
> dereferenceable.
> When there are established URN schemes in place (like it is
> the case for ISBN numbers), why not reuse them instead of packing them in a
> literal (is there a datatype for ISBN numbers?) and publish them to simplify
> linking for others? This seems to make more sense to me than only relying on
> URN-to-HTTPURI mappings, which I can still do, as long as I publish the
> "original" identifier in its "native" URN form.

I have a feeling that the issue here may be the same as how to represent the address of someone's pure HTML home page in RDF.
It is a URL and hence a URI.
But it is not dereferenceable to RDF.
A purist might say that it is not a Linked Data URI (doesn't return RDF), and so should be a string, hopefully with a useful type on it.
But for others it is a resource, and so can comfortably be a URI in RDF.
And having it as a resource enables it to be used in a more convenient way for the sort of thing that we are discussing.

So dereferencing one of your Linked Data URIs will return some RDF that has resources (URIs) that are not dereferenceable to RDF.
And these will be very helpful to people/agents who are trying to add linking to the world.

Hopefully that is sufficiently closely related to your comments to make sense?
And I am pleased to agree, although I might lean more to the purist side :-)

By the way, in the original question, there seemed to be a suggestion, which I guess I misunderstood, that an RDF store that effectively only published non-dereferenceable URIs, and was accessed as a query service, was in some sense doing Linked Data.
I would have found that very hard to agree with.

Best
Hugh

>
> Best
> Bernhard
>
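
As a rough illustration of the idea quoted above (an index over shared URNs used to propose owl:sameAs links), here is a minimal Python sketch. It is only a sketch under stated assumptions: the property URI `http://example.org/vocab#identifier`, the two datasets, and all subject URIs and ISBN values are made up for the example, and triples are plain tuples rather than anything from a real RDF library. No URN normalisation is attempted.

```python
from collections import defaultdict
from itertools import combinations

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
# Hypothetical property used by both publishers to expose the "native" URN.
HAS_ID = "http://example.org/vocab#identifier"

# Two made-up datasets, each a list of (subject, predicate, object) triples.
dataset_a = [
    ("http://a.example.org/book/1", HAS_ID, "urn:isbn:0-395-36341-1"),
    ("http://a.example.org/book/2", HAS_ID, "urn:isbn:0-00-000000-0"),
]
dataset_b = [
    ("http://b.example.org/item/42", HAS_ID, "urn:isbn:0-395-36341-1"),
]


def build_urn_index(*datasets):
    """Map each URN to the set of HTTP subject URIs that publish it."""
    index = defaultdict(set)
    for triples in datasets:
        for subject, predicate, obj in triples:
            if predicate == HAS_ID and obj.lower().startswith("urn:"):
                index[obj].add(subject)
    return index


def propose_same_as(index):
    """Emit one candidate owl:sameAs triple per pair of subjects sharing a URN."""
    links = []
    for urn in sorted(index):
        for left, right in combinations(sorted(index[urn]), 2):
            links.append((left, OWL_SAME_AS, right))
    return links


candidate_links = propose_same_as(build_urn_index(dataset_a, dataset_b))
for triple in candidate_links:
    print(triple)
```

The output is deliberately a list of candidates. Whether a shared ISBN really justifies owl:sameAs is the kind of judgement where the dataset owner's implicit knowledge helps, and the resulting links only become Linked Data once they are published from the dereferenceable HTTP URIs.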
Received on Tuesday, 9 March 2010 11:38:17 UTC