Re: updated InfoGathering, proposing a portal as a solution from Ivan Herman on 2007-02-16 (public-sweo-ig@w3.org from February 2007)

From: Ivan Herman <ivan@w3.org>
Date: Fri, 16 Feb 2007 11:01:52 +0100
To: Leo Sauermann <leo.sauermann@dfki.de>
CC: W3C SWEO IG <public-sweo-ig@w3.org>
Message-ID: <45D58110.4040107@w3.org>

Hi Leo,

I made some edits for items that are just facts. I prefer to discuss
others before I make edits:

- We also started to collect references to events (conferences,
workshops). What about general presentations on SW?

- I think the crawling should also include Turtle from the start.
Actually, by the time we get there, GRDDL will be pretty much done, so
I think it should be considered in the first round, too!

- Why RSS 0.9 and not 1.0?

For the technical aspect:

The idea of using a crawler may lead to all kinds of technical problems,
though: efficiency, machine usage, etc. I would think that, at least in
the first round, we should restrict ourselves to the collection and
display of data that are 'registered' with us using RDF. I think, in
this respect, being prepared for GRDDL may be crucial: people may then
continue using their HTML pages if they want, they could annotate
their pages directly, and we could get access to the RDF data. Caveat:
the ontology we develop will have to have a microformat version, and we
would have to have a corresponding XSLT script at our disposal, too. In
the same way, we should be prepared for RDFa in the first round, if
people prefer to use that (and RDFa becomes mature). We should not take
sides by using only one of those.
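
To make the GRDDL point concrete, here is a rough sketch of the first
step a GRDDL-aware gatherer would take: checking whether a registered
HTML page declares the GRDDL profile and which transformation(s) it
links to. The function and class names and the example.org URL are
made up for illustration; only the profile URI and the
rel="transformation" convention come from GRDDL itself. Running the
XSLT is left out, since it needs a processor outside the standard
library.

```python
from html.parser import HTMLParser

# Profile URI that marks an HTML page as a GRDDL source document.
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


class GrddlLinkFinder(HTMLParser):
    """Collects the transformation links declared in an HTML page."""

    def __init__(self):
        super().__init__()
        self.has_profile = False
        self.transformations = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head":
            # The profile attribute may list several space-separated URIs.
            profiles = (attrs.get("profile") or "").split()
            self.has_profile = GRDDL_PROFILE in profiles
        elif tag in ("link", "a"):
            rels = (attrs.get("rel") or "").split()
            if "transformation" in rels and attrs.get("href"):
                self.transformations.append(attrs["href"])


def find_transformations(html_text):
    """Return the XSLT URLs a page declares, or [] if it is not GRDDL-enabled."""
    finder = GrddlLinkFinder()
    finder.feed(html_text)
    return finder.transformations if finder.has_profile else []


# Hypothetical page registered with the portal.
page = """<html><head profile="http://www.w3.org/2003/g/data-view">
<link rel="transformation" href="http://example.org/sweo2rdf.xsl" />
</head><body></body></html>"""

print(find_transformations(page))
```

A page without the profile yields an empty list, so plain HTML pages
are simply skipped rather than guessed at.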

There is an issue whether our portal would regularly 'download' the
referenced RDF data into our own database (say, once a day), or whether
we would always go out and access those on the fly. Having a gathering
done once a day would mean that we could offer one big RDF dataset for
the whole collection right away, possibly with a SPARQL interface to it,
too.
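
A minimal sketch of the once-a-day option, just to show the shape of
it. Everything here is an assumption for illustration: the Gatherer
class, the injected fetch function (standing in for whatever RDF
parser we end up using, returning triples as plain tuples), and the
24-hour interval. It is not a proposal for the actual implementation.

```python
import time

ONE_DAY = 24 * 60 * 60  # refresh interval in seconds


class Gatherer:
    """Keeps a local copy of registered RDF sources, refreshed at most daily."""

    def __init__(self, fetch, max_age=ONE_DAY, clock=time.time):
        self.fetch = fetch      # hypothetical: url -> iterable of (s, p, o) tuples
        self.max_age = max_age
        self.clock = clock
        self.sources = []       # URLs registered with the portal
        self.store = {}         # url -> set of triples from the last good fetch
        self.last_run = None

    def register(self, url):
        if url not in self.sources:
            self.sources.append(url)

    def gather(self):
        """Download every registered source into the local store."""
        for url in self.sources:
            try:
                self.store[url] = set(self.fetch(url))
            except Exception:
                # Source unreachable or unparsable: keep the previous copy.
                continue
        self.last_run = self.clock()

    def collection(self):
        """Return the merged data, re-gathering only if the copy is stale."""
        if self.last_run is None or self.clock() - self.last_run >= self.max_age:
            self.gather()
        merged = set()
        for triples in self.store.values():
            merged |= triples
        return merged
```

The on-the-fly alternative corresponds to max_age=0: every request
goes out to all the sources again, which is where the efficiency
worries come in.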

I will ask our systems people and other team members whether and how
we could host the final system on our site. It is not always obvious...

Ivan

Leo Sauermann wrote:
> 
> Hi SWEO,
> 
> I analysed the information gathering wiki page and have rewritten it
> completely, doing much of the long-needed editing.
> I dumped many todos and read all suggestions made. I summed up
> everything, and gave it some order.
> 
> http://esw.w3.org/topic/SweoIG/TaskForces/InfoGathering
> 
> As a result, I realized that we need a portal website to achieve our
> goals. The goals were to "do something useful that prolongs SWEO, where
> important information (popular, well ranked) can be found, and all
> information can be found".
> Also, several people suggested having many people involved - and
> reusing existing sources.
> 
> I took all this and defined a "Semantic Web Information Portal" that
> gathers the Information Resources.
> 
> Ivan, Pasquale, everyone in this task-force:
> !! today/tomorrow would be the perfect moment for you to look at this
> and edit freely !!
> 
> SWEO: once the task force members are done, we present the result in the
> next telco.
> 
> best
> Leo
> 

-- 

Ivan Herman, W3C Semantic Web Activity Lead
URL: http://www.w3.org/People/Ivan/
PGP Key: http://www.cwi.nl/%7Eivan/AboutMe/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf

Received on Friday, 16 February 2007 10:16:22 UTC