Re: updated InfoGathering, proposing a portal as a solution - do you agree on a portal?

Hi Danny, SWEO

And it came to pass that Danny Ayers, at the right time, 16.02.2007 20:42,
wrote the following:
>
> A portal should be a reasonable way of presenting the material, and
> more critically sounds an achievable goal (assuming maintenance can
> somehow be taken care of).
I think that's an opportunity either to make SWEO last longer or to ask a
W3C member, or the W3C itself, to take care of it. More in my replies to
Ivan's e-mails.

>
> But I'm not sure I understand the requirements list on the Wiki -  why
> should it be MySQL/PHP? We only need one portal, no? Are we sure there
> isn't an existing system that could do the job (or at least 90% of
> it)? 
I have never set up such a website, but I know that reusing stuff is
important. If there is an existing system that does 80% of the job, we
should use it to save costs and avoid bugs.

> If there isn't an RDF-based system that fits the bill, then
> surely there's something that can at least expose RDF (Drupal
> perhaps?). 
I looked at the drupal.org website, and it looks good. I would say we can
hack on it until it does what we need.
Any alternatives?

> The primary objective is Information Gathering, not
> software development, however appealing that may be for demo purposes.
Yes!
>
> (Whatever, there's always RAP).
>
> more comments inline -
>
> On 16/02/07, Leo Sauermann <leo.sauermann@dfki.de> wrote:
>
>>  We will provide a portal integrating data and providing user
>> interfaces to edit the most important information resources - so the
>> pain to keep up to date should be forwarded to people like Dave
>> Beckett, who keeps his list of Tools anyway (he just now either uses
>> the portal to manage the list or publishes his data as RDF/XML)
>
> I see no reason not to leave the pain of finding new stuff to people
> like Dave, but I really don't think it's reasonable to expect them to
> change their current practice (unless they really want to), or check
> for stale items.
>
> For Dave's list a bit of XSLT & a little manual tweaking should be
> enough to get it in RDF (I've a feeling I started one sometime last
> year - not sure how far I got). A periodic automatic check for 404s &
> a human-reporting mechanism should be adequate for dead sites.
Yes, that's what I meant by "crawler", but from now on I will call it
"data importer" to be clearer.
I changed the relevant part of the wiki page:

"For importing data from RDF sources, we propose an RDF vocabulary (see
below) for describing lists of resources. We inform the authors of
existing lists about this format and suggest that they publish their
lists using it. Once an external author has registered the URL of their
RDF with the SWEO data importer, the SWEO Portal will read it regularly
(for example, daily) and update the SWEO db."
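
To make the "data importer" and the periodic 404 check a bit more
concrete, here is a minimal Python sketch. It is only an illustration
under assumptions, not an agreed design: the names fetch_status,
find_dead_links and import_feeds are made up, the "database" is just a
dict, and treating an unreachable host like a 404 is my own assumption.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # for example 404 for a page that is gone
    except OSError:
        return None  # DNS failure, timeout, refused connection


def find_dead_links(urls, status_of=fetch_status):
    """Return the URLs that answer 404 or do not answer at all."""
    return [url for url in urls if status_of(url) in (404, None)]


def import_feeds(feed_urls, read_feed, database):
    """Re-read every registered feed; return the URLs that failed.

    read_feed is any callable that returns the feed content for a URL and
    raises OSError on failure. A failed feed keeps its previous copy, so a
    human can look at the report before anything is dropped.
    """
    failed = []
    for url in feed_urls:
        try:
            database[url] = read_feed(url)
        except OSError:
            failed.append(url)
    return failed
```

A daily cron job could call import_feeds on the registered URLs and
find_dead_links on the imported entries, and mail both lists to whoever
maintains the portal.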


>
>>  Before the architecture, I would define the user experience.
>>  Features first, then architecture.
>
> Software agents are users too! I would hope all the data will be
> available to remote systems as RDF, and ideally via SPARQL too (plus
> Atom/RSS for newsreaders). It might be worth investigating automated
> 3rd party addition of entries, along the lines of CodeZoo's
> DOAP-over-Atom [1] and/or Pingthesemanticweb. Ok, a bit of development
> may be needed...
Not much, if we use D2RQ mapping (Chris Bizer's stuff), which can publish
any MySQL database as a SPARQL endpoint using a DB-to-schema mapping.
Although Chris' work may not do exactly what we need, I think it can be
tweaked until it does.
(And he or Richard Cyganiak would probably be interested in contributing
anyway.)
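
To illustrate what I mean by a DB-to-schema mapping, here is a small
Python sketch. It is not D2RQ and does not use its mapping language; it
only shows the idea of turning table rows into RDF triples. The table
name, the column names and the example.org URI pattern are all
hypothetical, and I use an in-memory SQLite database instead of MySQL so
that the sketch is self-contained.

```python
import sqlite3

# Hypothetical mapping: one table, a URI pattern for the rows,
# and a property URI for each mapped column.
MAPPING = {
    "table": "tools",
    "uri_pattern": "http://example.org/sweo/tool/{id}",
    "columns": {
        "name": "http://purl.org/dc/elements/1.1/title",
        "summary": "http://purl.org/dc/elements/1.1/description",
    },
}


def escape_literal(value):
    """Escape a value for use inside an N-Triples string literal."""
    text = str(value)
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def rows_to_ntriples(connection, mapping):
    """Return one N-Triples line per non-NULL mapped column of every row."""
    columns = list(mapping["columns"])
    # Table and column names come from our own trusted mapping, not from users.
    query = "SELECT id, {} FROM {}".format(", ".join(columns), mapping["table"])
    triples = []
    for row in connection.execute(query):
        subject = mapping["uri_pattern"].format(id=row[0])
        for column, value in zip(columns, row[1:]):
            if value is None:
                continue  # no triple for missing data
            predicate = mapping["columns"][column]
            triples.append(
                '<{}> <{}> "{}" .'.format(subject, predicate, escape_literal(value))
            )
    return triples
```

D2RQ does this on the fly and answers SPARQL queries against the mapped
view, so we would only have to write the mapping, not code like the above.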

>
> One point that should probably be considered early on is
> licensing/copyright. We should be aiming for maximally open data here.
> Any automated parts likely need Creative Commons awareness, otherwise
> permission needs asking...
Good point.

I added this idea to the wiki page:


    Copyright of Information Items

Who holds the copyright on the managed information descriptions? It
shouldn't be a big problem, as the texts are rather short (a summary of
a website) and the lists are not very creative or innovative work. It
may be enough to assume one copyright for all content. It is important
that the data gathered on the portal is reusable in other contexts (by
Semantic Web agents). A suitable license would be the Creative Commons
Attribution license <http://creativecommons.org/licenses/by/2.5/>
(allowing commercial use and derivatives).

Alternative suggestions to handle copyright:

    * Items aggregated to the SWEO Information Portal can be licensed
      under a Creative Commons license. For this, each author can
      choose a license. (This may be a little complicated to
      implement.)

    * All data is considered to be under one license. People who
      register URLs for automatic import have to check a box saying
      "the feed conforms to the license".

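The second alternative (one license, confirmed by a checkbox) could be
enforced when a feed is registered. A minimal Python sketch, with the
caveat that FeedRegistration, register_feed and the dict-based registry
are hypothetical names of mine, and that CC-BY 2.5 is only the license
suggested above, not a decision:

```python
from dataclasses import dataclass

# The license proposed above for all portal content (an assumption, not a decision).
PORTAL_LICENSE = "http://creativecommons.org/licenses/by/2.5/"


@dataclass(frozen=True)
class FeedRegistration:
    feed_url: str
    license_url: str


def register_feed(registry, feed_url, license_confirmed):
    """Register a feed for import only if the author ticked the license box."""
    if not license_confirmed:
        raise ValueError("the feed must conform to the portal license")
    registration = FeedRegistration(feed_url, PORTAL_LICENSE)
    registry[feed_url] = registration
    return registration
```

The data importer would then only read URLs that are in the registry, so
everything it imports is covered by the same license.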


Thanks for the input, it's good to discuss and plan these things with you
guys!
Kindest regards
Leo
>
> Cheers,
> Danny.
>
> [1] http://www.codezoo.com/about/doap_over_atom.csp
>
>


-- 
____________________________________________________
DI Leo Sauermann       http://www.dfki.de/~sauermann 
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Trippstadter Strasse 122
P.O. Box 2080          Fon:   +49 631 205-3503
D-67663 Kaiserslautern Fax:   +49 631 205-3472
Germany                Mail:  leo.sauermann@dfki.de

Received on Monday, 19 February 2007 10:48:34 UTC