Re: Why should we publish ordered collections or indexes as RDF? from Dan Brickley on 2010-06-03 (semantic-web@w3.org from June 2010)

From: Dan Brickley <danbri@danbri.org>
Date: Thu, 3 Jun 2010 09:01:36 +0200
To: "Haijie.Peng" <haijie.peng@gmail.com>
Cc: Linked Data community <public-lod@w3.org>, semantic-web <semantic-web@w3.org>
Message-ID: <AANLkTim8M6SWvtpht_bNqJ5dKl5-IKzvE0K3owaj_chQ@mail.gmail.com>

2010/6/3 Haijie.Peng <haijie.peng@gmail.com>:
> [Apologies for cross-posting]
>
> Why should we publish ordered collections or indexes as RDF? is it necessary?

On the Web, very little is 'necessary'. But some things can be useful.
Indexes and summaries can help software prioritise, and allow larger
files to be loaded only when needed.

It depends what you mean by 'ordered collections' and 'indexes'. But
the reason for sitemap-style summaries is usually to help external
sites monitor the content of the Web better.

At http://www.sitemaps.org/ there is an explanation of the sitemaps
format which several crawlers use. I believe the Google crawler will
use it to help schedule activity on a site, and that (for example) it
can help if you want your RDF/FOAF or XFN documents to be indexed
by Google's Social Graph API - http://code.google.com/apis/socialgraph/
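
As a rough illustration of how a crawler might use such a file, here is
a minimal Python sketch. It assumes a sitemap in the sitemaps.org 0.9
XML format; the example.org URLs and the function name are made up for
illustration, and real crawlers will certainly do more than this.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 schema.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap listing a small FOAF file and a larger dataset.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.org/big-dataset.rdf</loc>
    <lastmod>2010-01-15</lastmod>
  </url>
  <url>
    <loc>http://example.org/foaf.rdf</loc>
    <lastmod>2010-06-01</lastmod>
  </url>
</urlset>"""


def urls_newest_first(sitemap_xml):
    """Return the listed URLs, most recently modified first."""
    root = ET.fromstring(sitemap_xml)
    entries = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", namespaces=SITEMAP_NS)
        # lastmod is optional; entries without it sort last.
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS)
        entries.append((lastmod, loc))
    # Plain string sort, which orders dates correctly only if every
    # lastmod uses the same format (assumed here: YYYY-MM-DD).
    entries.sort(reverse=True)
    return [loc for _, loc in entries]


if __name__ == "__main__":
    for loc in urls_newest_first(SAMPLE):
        print(loc)
```

The point is only that the summary file lets software decide what to
fetch first without downloading every document.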

There is also a version of this format called Semantic Sitemaps, but
http://sw.deri.org/2007/07/sitemapextension/ is offline right now.

In other cases, RSS feeds (also Atom) do the same thing, and provide a
'What's new' feed for a site, letting everyone know which documents
are new or updated, so that they can be (re-)indexed.
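
A small Python sketch of that 'What's new' use, assuming an RSS 2.0
feed with pubDate elements. The feed content, URLs and function name
are hypothetical; this is one way an indexer could pick out documents
changed since its last visit, not a description of any particular one.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical RSS 2.0 feed for a site that publishes RDF documents.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example updates</title>
    <item>
      <link>http://example.org/foaf.rdf</link>
      <pubDate>Thu, 03 Jun 2010 07:00:00 +0000</pubDate>
    </item>
    <item>
      <link>http://example.org/old-data.rdf</link>
      <pubDate>Fri, 15 Jan 2010 12:00:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def updated_since(feed_xml, last_crawl):
    """Return links of items published after the last crawl time."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        link = item.findtext("link")
        pub_date = item.findtext("pubDate")
        if link is None or pub_date is None:
            continue  # skip items we cannot date
        # RSS 2.0 dates use the RFC 822 style that email.utils parses.
        if parsedate_to_datetime(pub_date) > last_crawl:
            fresh.append(link)
    return fresh


if __name__ == "__main__":
    last_crawl = datetime(2010, 6, 1, tzinfo=timezone.utc)
    print(updated_since(SAMPLE_FEED, last_crawl))
```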

For large collections of documents, it is useful sometimes to have
smaller summary documents so that the bigger files can be fetched only
when they are needed. Mobile apps that care about bandwidth are an
example scenario there.

Regarding Linked Data, what we do there is to link descriptions
together. Each partial description often links to other documents that
are about the same real-world thing. This addresses some of the same
needs as a top level index or catalogue, because you can retrieve
different levels of detail from different sites. So my small FOAF file
is in some ways a top level "entry" (index?) for me, and it might
point to larger files (e.g. Twitter or Flickr datasets) that are
maintained separately. RDF aggregator sites like sindice.com can be
used to link these together, even if the top level file does not
contain links to every other file that mentions me. So in that
scenario, it is not 100% necessary for the small file to be an index
to the large files. The data can be linked together later if common
identifiers are used in each data set.
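
To make the "linked together later" idea concrete, here is a toy Python
sketch. It models RDF statements as plain (subject, property, value)
tuples rather than using a real RDF library, and every URI and property
value in it is invented for illustration. The only assumption that
matters is that both files use the same identifier for the person.

```python
from collections import defaultdict

# Hypothetical shared identifier used by both data sets.
PERSON = "http://example.org/people#alice"

# A small FOAF-style profile: the top level "entry".
small_profile = {
    (PERSON, "foaf:name", "Alice"),
    (PERSON, "foaf:homepage", "http://example.org/"),
}

# A larger, separately maintained data set that mentions the same
# person but is not linked from the small profile.
photo_dataset = {
    ("http://example.org/photos/1", "foaf:maker", PERSON),
    (PERSON, "foaf:nick", "alice42"),
}


def merge_descriptions(*graphs):
    """Union the statement sets and group them by subject."""
    merged = set().union(*graphs)
    by_subject = defaultdict(set)
    for subject, prop, value in merged:
        by_subject[subject].add((prop, value))
    return by_subject


if __name__ == "__main__":
    combined = merge_descriptions(small_profile, photo_dataset)
    for prop, value in sorted(combined[PERSON]):
        print(prop, value)
```

Because the identifier matches, the merged description of the person
picks up the nickname from the second data set even though the small
profile never pointed at it.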

Hope this helps. Can you say more about the specific situation you have in mind?

cheers,

Dan

Received on Thursday, 3 June 2010 07:02:15 UTC