- From: Yves Raimond <yves.raimond@gmail.com>
- Date: Sat, 7 Feb 2009 15:18:18 +0000
- To: Hugh Glaser <hg@ecs.soton.ac.uk>
- Cc: "public-lod@w3.org" <public-lod@w3.org>
Hello!

On Sat, Feb 7, 2009 at 2:31 PM, Hugh Glaser <hg@ecs.soton.ac.uk> wrote:
> Hi Yves,
> Thank you for the response.
> Yes, you are right - when we have taken over the world, there will be powerful systems to help us do this, and I can be a happy little data provider, while others provide my search and linkage.
> But when we try to tell people that we have this wonderful resource called Musicbrainz, which is part of the amazing LOD cloud, (I think I saw evidence of such a talk recently), what experience do the excited listeners get when they go away and try to join?
> After quite a lot of work they will have concluded, at best, that this is system infrastructure for gurus, and so they can do a bit of browsing a bit like wikipedia but not as pleasant, and it is not relevant to them.
> I have just failed to find Telemann on Musicbrainz, I'm afraid, (musicbrainz.org or Sindice) although I only spent a few minutes - but why so hard?

Just typing "telemann musicbrainz" in Google led me directly to:

http://musicbrainz.org/artist/8f831f50-e409-47c3-8598-71a61bc8cfb3

I don't consider that particularly hard!

> Perhaps all I wanted to do was use his URI to identify him unambiguously, using a little tool that lets me say I (dis)like his music, but it is just so hard.
> OK, maybe my sort of use case is not what the community cares about - so be it, but I think I should be able to do it, and do it now.
> These sort of links are really valuable - there might not be so many of them, but they can carry a lot of information.
> I can tell you we have over 1M links to the dblp world from rkbexplorer, but since the data is substantially the same, I don't consider them as valuable.
> On the other hand, we have 174 links from nsf to cordis and 183 the other way - now that is value. How did we create them? By a lot of work, and the ability to search.
>
> So I agree in principle with your view of separating out these things.
> But I don't think we have the time, and while we fail to deliver this, possible recruits are turning away.
> Is all this publishing work to founder because the Sindice team is not big enough to cope, or no-one seems to be building the linkage systems, all because the data providers do not want to offer a simple search facility?
>

On a side-note, there are at least three interlinkage systems I know of (Georgi's, LinkedMDB's and mine). Most datasets provide a SPARQL end-point, which makes such specific-dataset-to-specific-dataset linkage easy enough.

Having a SPARQL interface makes interlinking *much* more reliable, because you know exactly what happens. If you provide me with a simple text search, I won't have any clue how your inner searching process works (are you retrieving all resources whose label matches the search term? are you building an index on neighboring literals?), and I won't be able to draw satisfying interlinking conclusions.

Best,
y

> Best
> Hugh
>
> By the way, I am not suggesting that any identifiers such as GUIDs or PIDs should be read by humans - more the opposite. My agent should be able to find them easily and then ask me if that was what I meant, using words.
>
> On 07/02/2009 13:39, "Yves Raimond" <yves.raimond@gmail.com> wrote:
>
>
> I think this is a really dangerous idea. Most "web-scale" identifiers,
> eg Musicbrainz GUIDs and BBC PIDs are not human readable (for a lot of
> reasons, and mainly because human-readable identifiers are not unique
> enough!!), but both provide really easy-to-use lookup services.
> Such lookups, for other sites, can be provided by semantic web search
> engines. It is exactly as in the document web: web identifiers are
> mostly opaque, but search engines are here to provide the help needed.
>
> So my proposal is: let's not confuse everything. Some people's job is
> to make datasets available out there and as linked as possible to
> others. Some other people make lookup services (eg Sindice), and I
> think this separation of concerns works quite well.
>
> Best,
> y
>
>
>
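
A minimal sketch of the point above about SPARQL making the matching rule explicit, where a free-text search box hides it. It assumes resources carry an rdfs:label and that an exact string match is the rule wanted; the helper names `escape_literal` and `label_match_query` are hypothetical and not taken from any of the interlinking systems mentioned. The sketch only builds the query text and does not contact any end-point.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal.

    Only these two characters are handled; this is an illustration,
    not a complete escaping routine.
    """
    return text.replace("\\", "\\\\").replace('"', '\\"')


def label_match_query(name: str, limit: int = 10) -> str:
    """Build a query selecting resources whose rdfs:label equals `name`.

    The matching rule is visible in the FILTER clause: an exact,
    case-sensitive comparison on the label's string value.
    """
    literal = escape_literal(name)
    return (
        f"PREFIX rdfs: <{RDFS}>\n"
        "SELECT DISTINCT ?resource ?label WHERE {\n"
        "  ?resource rdfs:label ?label .\n"
        f'  FILTER (str(?label) = "{literal}")\n'
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(label_match_query("Georg Philipp Telemann"))
```

Whoever runs such a query knows exactly which resources can come back and why, which is what lets them judge whether a candidate link is trustworthy.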
Received on Saturday, 7 February 2009 15:18:52 UTC