
Re: Linked Data and IRI dereferencing (scale limits?)

From: Giovanni Tummarello <giovanni.tummarello@deri.org>
Date: Thu, 5 Aug 2010 10:10:24 +0200
Message-ID: <AANLkTin6_ohc6-mpRvkSbwP3XFyKMXqunOkVNNcuVxwN@mail.gmail.com>
To: Jörn Hees <j_hees@cs.uni-kl.de>
Cc: public-lod <public-lod@w3.org>
Jörn, you're right.

"Linked data" with plain dereferenceable URIs plainly doesn't work once you
move beyond the simplest examples. This is for some of the reasons you
mention as well as others (e.g. how do you really ask for the 1000 most
visited URIs (assuming this was in the DB), or the "100 biggest cities", or
"what is the URI which is sameAs geonames:united_states"?). You

Anyway, see below.

> 1. DBpedia still uses skos:subject quite often, even though it's deprecated.
> If you look the URI http://www.w3.org/2004/02/skos/core#subject I'm silently
> redirected to the current skos definition
> http://www.w3.org/TR/skos-reference/skos.html#subject,
> but there is no #subject in it anymore. This
> means: no rdfs:label for a property which is ubiquitous in DBpedia.
> Am I missing out some Header option for the content negotiation or is this
> a problem of the w3.org end?
In http://sig.ma, to get something we hope looks like a label, we often end
up having to split the URI.
Make yourself a local cache with those ontologies; as long as the URI
semantics don't change (they shouldn't), you can then give a local label.
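
A rough sketch of the two ideas above (local label cache first, URI splitting
as a fallback). This is not how sig.ma does it; the names LOCAL_LABELS and
label_for are made up for illustration, and "splitting" is assumed here to
mean taking the fragment or the last path segment:

```python
from urllib.parse import urlsplit, unquote

# Hypothetical local cache: URI -> label you assign yourself, filled once
# from the ontologies you care about.
LOCAL_LABELS = {
    "http://www.w3.org/2004/02/skos/core#subject": "subject",
}


def label_for(uri, cache=LOCAL_LABELS):
    """Return the cached label, else a guess taken from the end of the URI."""
    if uri in cache:
        return cache[uri]
    parts = urlsplit(uri)
    # Prefer the fragment (after '#'); otherwise use the last path segment.
    tail = parts.fragment or parts.path.rstrip("/").rsplit("/", 1)[-1]
    # Fall back to the full URI when nothing usable is left.
    return unquote(tail).replace("_", " ") or uri
```

The guess is only cosmetic: it gives the user something readable when no
rdfs:label can be fetched, nothing more.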

> 2. When dereferencing DBpedia URIs I repeatedly found a suspiciously equal
> number of triples per fetched IRI in the local cache: 2001 triples,
> sometimes 2002. I remembered: ah, yes...

> There is no rdfs:label, no rdf:type, etc. in it, while all these useful
> things are in the HTML version.
> I'm not pointing this out to say that there is a problem in DBpedia. I
> think this is a serious problem of scale. How do you decide what is useful for
> someone dereferencing your URIs? How do you keep unnecessary traffic low at
> the same time?
> I think maybe a few standard triples should be included in any case (e.g.,
> rdfs:label, rdf:type, ...),

The only solution for you now is to use SPARQL instead of resolving the URI.
Much less traffic and it would actually work (and less parsing on your

Or ask the HTML side: if there is RDFa, bingo. There are very good reasons why
this should indeed be the only way one should tell people to serve data [1].
With RDFa out there, I personally hope redirections become a thing of the
past (not negotiation; negotiation is transparently good. Negotiating an RDF
version and getting it should always be possible).
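
For the SPARQL route mentioned above, a minimal sketch of asking only for the
triples you want (rdfs:label and rdf:type) instead of dereferencing. The
endpoint URL http://dbpedia.org/sparql and the helper names build_query and
build_request are assumptions for illustration; the code only builds the
request and does not send it:

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://dbpedia.org/sparql"  # assumed endpoint, adjust as needed

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def build_query(uri):
    """SPARQL query returning only the label and type triples of one URI."""
    return (
        "SELECT ?p ?o WHERE { "
        f"<{uri}> ?p ?o . "
        f"FILTER (?p = <{RDFS_LABEL}> || ?p = <{RDF_TYPE}>) "
        "}"
    )


def build_request(uri, endpoint=ENDPOINT):
    """Build (but do not send) a SPARQL protocol GET request."""
    url = endpoint + "?" + urlencode({"query": build_query(uri)})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})
```

Passing the result to urllib.request.urlopen would send it; the point is that
the response then holds a handful of bindings rather than 2000 triples.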

Maybe something could be done in the future by adding special values in a
dereferencing response like "igotmoreoftheseaskifyouneedthem". This has
been proposed but not standardized or implemented, AFAIK.

cheers and show us the final result :-)

[1] http://tantek.com/log/2005/06.html#d03t2359
Received on Thursday, 5 August 2010 08:10:53 UTC
