- From: Kingsley Idehen <kidehen@openlinksw.com>
- Date: Sat, 14 Feb 2009 14:27:20 -0500
- To: Richard Cyganiak <richard@cyganiak.de>
- CC: Hugh Glaser <hg@ecs.soton.ac.uk>, "Hausenblas, Michael" <michael.hausenblas@deri.org>, Bernhard Haslhofer <bernhard.haslhofer@univie.ac.at>, Linked Data community <public-lod@w3.org>
Richard Cyganiak wrote:
>
> On 14 Feb 2009, at 15:59, Hugh Glaser wrote:
>> Now I think about it, I have checked what dbpedia does to
>> http://dbpedia.org/resource/Esperanta it does the blank doc thing.
>> (I guess we need to work out what is best practice for this and then add it
>> to the How to Publish? I think my view is that something like
>> http://dbpedia.org/data/Esperanta.rdf should 404.)
>
> FWIW, DBpedia does a bit of 404ing:
>
> http://dbpedia.org/page/Esperanta is an empty HTML document
> http://dbpedia.org/data/Esperanta is 404
> http://dbpedia.org/data/Esperanta.rdf is an empty RDF document
>
> These should all 404, and at least the first one used to on the
> previous incarnation of the DBpedia server software.

Richard,

We'll deal with it. It can 404 or smartly do something like:

http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&should-sponge=&query=select+distinct+*+where+{%3Fs+%3Fp+%3Fo.+%3Fo+bif%3Acontains+%22Esperanta%22}&format=text%2Fhtml&debug=on

Make a suggestion doc on the fly.
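Decoded, that URL is simply the Virtuoso SPARQL endpoint running a full-text (bif:contains) search for the missing label. As a minimal sketch of how a publisher might assemble such a fallback "suggestion" URL when a resource has no data, assuming nothing about how DBpedia actually builds it:

    from urllib.parse import urlencode

    def suggestion_url(label):
        # Full-text search of the DBpedia default graph for the missing label,
        # mirroring the query string in the URL quoted above.
        query = 'select distinct * where {?s ?p ?o. ?o bif:contains "%s"}' % label
        params = {
            'default-graph-uri': 'http://dbpedia.org',
            'query': query,
            'format': 'text/html',
        }
        return 'http://dbpedia.org/sparql?' + urlencode(params)

    # A 404 handler for /resource/Esperanta could point the client here:
    print(suggestion_url('Esperanta'))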
>
> Richard
>
>>
>> So either way, in LOD sites of the sort that have DBs or KBs behind them,
>> either it is not possible to get a 404 (dbpedia), or you can't distinguish
>> between a rubbish URI that might have been generated and one you want to
>> know about.
>> I find the idea that I might give people the expectation that I will create
>> triples (as your point 2) rather strange - if I knew triples I would have
>> served them in the first place. Of course if we consider a URI I don't know
>> as a request for me to go and find knowledge about it, fair enough, but I
>> would expect a more explicit service for that. In that sense it would not be
>> a "broken link".
>> Maybe the world is different for the other RDFa etc ways of publishing LD,
>> but in the DB/KB world, I don't see broken incoming links as something that
>> can be usefully dealt with, other than the maintainer checking what is
>> happening, as you do with a normal site.
>> ======================================
>>
>> Now turning to the second possible meaning.
>> We are concerned with the place that gave you the URI, which is possibly
>> more interesting. And I think this is actually the case for your TAG example.
>> If I gave you (by which I mean an agent) such a link and you discovered it
>> was broken, it would be helpful to me and the LOD world if you could tell me
>> about it, so I could fix it. In fact it would also be helpful if you had a
>> suggestion as to the fix (ie a better URI), which is not out of the
>> question. And if I trust you (when we understand what that means), I might
>> even do a replacement or some equivalent triples without further
>> intervention.
>>
>> ======================================
>> In the case of our RKB system, we actually do something like this already.
>> If we find that there is nothing about a URI in the KB that should have it,
>> we don't immediately return 404, but look it up in the associated CRS
>> (coreference service), and possibly others, to see if there is an equivalent
>> URI in the same KB that could be used (we do not return RDF from other KB,
>> although we could). So if you try to resolve
>> http://southampton.rkbexplorer.com/description/person-07113
>> You actually get the data for
>> http://southampton.rkbexplorer.com/id/person-0a36cf76d1a3e99f9267ce3d0b95e42e-06999d58799cb8a3a55d3c69efcc9ba6
>> and a message telling you to use the new one next time.
>> (I'm not sure we have got the RDF perfectly right, but that is the idea.)
>> In effect, if we are asked for a broken link, we have a quick look around to
>> see if there is anything we do know, and give that back.
>> Of course, the CRS also gives the requestor the chance to do the same fixing up.
>> The reason that there might be a URI in the KB that has no triples, but we
>> know about, is because we "deprecate" URIs to reduce the number, and then
>> use the CRS to resolve from deprecated to non-deprecated.
>> So a deprecated URI is one we used to know about, and may still be being
>> used "out there", but don't want to continue to use - sort of a broken link.
>> Hence our dynamic broken link fixing.
>>
>> Best
>> Hugh
>>
>> PS.
>> My choice of http://dbpedia.org/data/Esperanta.rdf as a misspelling of
>> http://dbpedia.org/data/Esperanto.rdf turned out to be fascinating.
>> It turns out that wikipedia tells me that there used to be a page
>> http://en.wikipedia.org/wiki/Esperanta, but it has been deleted.
>> So what is returned is different from
>> http://en.wikipedia.org/wiki/Esperanti.
>> Although http://dbpedia.org/data/Esperanta.rdf and
>> http://dbpedia.org/data/Esperanti.rdf both return empty RDF documents, I think.
>> It looks to me that this is trying to solve a similar problem to that which
>> our deprecated URIs is doing in our CRS.
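A minimal sketch of the lookup-before-404 behaviour Hugh describes: when a URI has no triples in the KB, consult the coreference service for equivalent URIs before giving up. The helper functions (kb_describe, crs_equivalents) are hypothetical placeholders for illustration, not the actual RKB Explorer API:

    def resolve(uri, kb_describe, crs_equivalents):
        # kb_describe(uri)     -> serialized RDF for uri, or None if unknown (assumed helper)
        # crs_equivalents(uri) -> URIs the coreference service deems equivalent (assumed helper)
        data = kb_describe(uri)
        if data is not None:
            return 200, data
        # Nothing known directly: see whether an equivalent (e.g. non-deprecated)
        # URI in the same KB has the data, and serve that instead of a bare 404.
        for alt in crs_equivalents(uri):
            alt_data = kb_describe(alt)
            if alt_data is not None:
                note = '# no data for <%s>; please use <%s> in future\n' % (uri, alt)
                return 200, note + alt_data
        return 404, None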
>>
>> On 14/02/2009 14:06, "Hausenblas, Michael" <michael.hausenblas@deri.org> wrote:
>>
>>> Kingsley,
>>>
>>> Grounding in 404 and 30x makes sense to me. However I am still in the
>>> conception phase ;)
>>>
>>> Sent from my iPhone
>>>
>>> On 12 Feb 2009, at 14:02, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:
>>>
>>>> Michael Hausenblas wrote:
>>>>> Bernhard, All,
>>>>>
>>>>> So, another take on how to deal with broken links: a couple of days ago I
>>>>> reported two broken links in a TAG finding [1] which was (quickly and
>>>>> pragmatically, bravo, TG!) addressed [2], recently.
>>>>>
>>>>> Let's abstract this away and apply it to data rather than documents. The
>>>>> mechanism could work as follows:
>>>>>
>>>>> 1. A *human* (e.g. through a built-in feature in a Web of Data browser
>>>>> such as Tabulator) encounters a broken link and reports it to the
>>>>> respective dataset publisher (the authoritative one who 'owns' it)
>>>>>
>>>>> OR
>>>>>
>>>>> 1. A machine encounters a broken link (should it then directly ping the
>>>>> dataset publisher or first 'ask' its master for permission?)
>>>>>
>>>>> 2. The dataset publisher acknowledges the broken link and creates the
>>>>> corresponding triples, as done in the case for documents (cf. [2])
>>>>>
>>>>> In case anyone wants to pick that up, I'm happy to contribute. The name?
>>>>> Well, a straw-man proposal could be called *re*pairing *vi*ntage link
>>>>> *val*ues (REVIVAL) - anyone? :)
>>>>>
>>>>> Cheers,
>>>>> Michael
>>>>>
>>>>> [1] http://lists.w3.org/Archives/Public/www-tag/2009Jan/0118.html
>>>>> [2] http://lists.w3.org/Archives/Public/www-tag/2009Feb/0068.html
>>>>>
>>>> Michael,
>>>>
>>>> If the publisher is truly dog-fooding and they know what data objects
>>>> they are publishing, condition 404 should be the trigger for a
>>>> self-directed query to determine:
>>>>
>>>> 1. what's happened to the entity URI
>>>> 2. lookup similar entities
>>>> 3. then self-fix if possible (e.g. a 302)
>>>>
>>>> Basically, Linked Data publishers should make 404s another Linked Data
>>>> prowess exploitation point :-)
>>>>
>>>> --
>>>>
>>>> Regards,
>>>>
>>>> Kingsley Idehen        Weblog: http://www.openlinksw.com/blog/~kidehen
>>>> President & CEO
>>>> OpenLink Software      Web: http://www.openlinksw.com
>>>>
>>>
>>
>

--

Regards,

Kingsley Idehen        Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software      Web: http://www.openlinksw.com
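A minimal sketch of the three-step 404 handling Kingsley outlines above (find out what happened to the entity URI, look up similar entities, then self-fix, e.g. with a 302). The lookup helpers are assumptions for illustration, not any actual OpenLink/Virtuoso behaviour:

    def handle_missing(uri, find_replacement, find_similar):
        # find_replacement(uri) -> current URI if the entity moved or was deprecated, else None (assumed helper)
        # find_similar(uri)     -> list of candidate URIs that look related (assumed helper)

        # 1. What's happened to the entity URI?
        replacement = find_replacement(uri)
        if replacement is not None:
            # 3. Self-fix: redirect the client to the current URI.
            return 302, {'Location': replacement}, ''

        # 2. No direct replacement: look up similar entities and offer them
        #    in the 404 body instead of returning an empty error page.
        suggestions = '\n'.join('# did you mean <%s> ?' % c for c in find_similar(uri))
        return 404, {}, suggestions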
Received on Saturday, 14 February 2009 19:28:02 UTC