Re: data format for gathered information from Leo Sauermann on 2007-03-02 (public-sweo-ig@w3.org from March 2007)

From: Leo Sauermann <leo.sauermann@dfki.de>
Date: Fri, 02 Mar 2007 11:34:56 +0100
To: Ivan Herman <ivan@w3.org>
CC: Danny Ayers <danny.ayers@gmail.com>, W3C SWEO IG <public-sweo-ig@w3.org>, Kingsley Idehen <kidehen@openlinksw.com>, Benjamin Nowack <bnowack@appmosphere.com>, Ian Davis <Ian.Davis@talis.com>
Message-ID: <45E7FDD0.3050000@dfki.de>
Hi all,

Answers below.

It came to pass that Ivan Herman, in due time on 28.02.2007 13:16,
wrote the following:
> Leo,
>
> it is a bit difficult to edit, because the page should reflect
> consensus... so I prefer to comment and discuss here.
>   
The problem is that the arguments you give here get lost over time.
Some people don't read all e-mails.

As Uldis pointed out, it is perfectly ok to edit the wiki page or to add 
comments.

[1] http://c2.com/cgi/wiki?VisualizeTheWiki


I understand it's harder to read the page and see where arguments can fit
in, supporting or negating existing arguments,
but I don't have the time to play the role of editor; I can coordinate
and review.

If it works in Wikipedia, I see no reason why it should not work in SWEO.
The page needs to show consensus *at the end* but not during the 
discussion process.

So please edit the wiki page, and delete from it where needed! The whole
process will take much longer if we depend on a single editor, and we
cannot meet our deadlines then.

Anyway, answers below:

> - Using the doap, skos, etc, is obviously the way to go. Actually, using
> skos is a great idea of yours!
>
>   
Thanks, and I think the consensus is that we use DOAP/FOAF/etc. for
"obvious" classes using rdf:type
and use SKOS for the harder tags like "rdfstore", "tutorial",
"successstory" etc.
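
To illustrate the split between rdf:type and SKOS tags, here is a minimal
sketch in plain Python. It is not the agreed SWEO format: the resource URI
and the tag namespace http://example.org/sweo/tags# are hypothetical, and
skos:subject is used only because it is the property named in this thread.

```python
# Namespaces of the vocabularies discussed in the thread.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DOAP = "http://usefulinc.com/ns/doap#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
# Hypothetical concept scheme for the "harder" tags (an assumption).
TAGS = "http://example.org/sweo/tags#"


def describe(resource, rdf_class, tags):
    """Return one rdf:type triple for the obvious class and one
    skos:subject triple per tag."""
    triples = [(resource, RDF + "type", rdf_class)]
    for tag in tags:
        triples.append((resource, SKOS + "subject", TAGS + tag))
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Hypothetical project: typed as doap:Project, tagged via SKOS concepts.
triples = describe(
    "http://example.org/projects/mystore",
    DOAP + "Project",
    ["rdfstore", "tutorial"],
)
print(to_ntriples(triples))
```

The class carries the "obvious" typing, while the tags stay in a separate
concept scheme that can grow without touching the class vocabulary.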
> - I am not sure about the usage of RSS. I have the feeling that it is a
> little bit of a misuse here. I wonder whether the full power of DC is
> not enough here; not only the core dc terms like dc:title and such that
> everybody knows but, also, the dcterm vocabulary[1]. I have the
> impression that those, combined with maybe some extra properties of our
> own may replace your choice of RSS. (to be checked)
>
>   
The usage of RSS is proper and correct here.
RSS means Rich Site Summary or RDF Site Summary or ... it's for
syndication of items normally represented
on a website. We are doing syndication (= importing data into a portal).
Additionally:
* every dumb developer in the world understands XML and RSS (and surely
  not RDF and FOAF/DOAP/...)
* people can set up XML/RSS feeds or RDF/RSS feeds on their CMS
  to export data for
  the SWEO portal.
* the data published in XML/RSS can be reused by newsreaders, giving
  more motivation
  to others to follow our track

I am focusing on widespread and low-cost adoption, for people who want to
publish data.
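
As a rough illustration of how little a publisher would need, the sketch
below reads a trimmed RSS 1.0 item carrying Dublin Core properties, using
only the Python standard library. The feed content, the item URI and the
read_items helper are invented for the example; a complete RSS 1.0 feed
would also contain a channel element, which is omitted here for brevity.

```python
import xml.etree.ElementTree as ET

# Namespace URIs of RDF, RSS 1.0 and Dublin Core elements.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical feed content (an assumption, not real SWEO data).
FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item rdf:about="http://example.org/tutorials/rdf-intro">
    <title>Introduction to RDF</title>
    <link>http://example.org/tutorials/rdf-intro</link>
    <dc:creator>Jane Example</dc:creator>
    <dc:subject>tutorial</dc:subject>
  </item>
</rdf:RDF>"""


def read_items(xml_text):
    """Extract URI, title, creator and subjects from each RSS 1.0 item."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.findall("rss:item", NS):
        items.append({
            "uri": item.get("{%s}about" % NS["rdf"]),
            "title": item.findtext("rss:title", default="", namespaces=NS),
            "creator": item.findtext("dc:creator", default="", namespaces=NS),
            "subjects": [s.text for s in item.findall("dc:subject", NS)],
        })
    return items


items = read_items(FEED)
for entry in items:
    print(entry["uri"], entry["title"], entry["subjects"])
```

The same document is plain XML for a newsreader and RDF for the portal,
which is the reuse argument made in the list above.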

> - For books and articles, I think we need something more structured, like
> BibTeX, in order to allow for, say, more scholarly usage. The problem is
> that it is not 100% obvious how to represent bibtex in RDF, look at my
> recent blog and the comments[2]. We may have to bite the bullet and
> choose one (or modify one).
>   
I don't want to bite the bullet. DC should be enough for now.
Think of it the other way round:
When we say "we will import whatever data you publish" and give the
simplest possible example of what we expect
using DC and RSS/SKOS, we can still import more data once people start
publishing it.

If our importer finds a bibtex item, we can then decide whether to import it.

But we should start now with a simple core; we can always make it
more complicated later. I want the
importer to be online within a month.
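
The "simple core first" idea can be sketched as an importer that keeps the
properties it knows and merely reports the rest instead of rejecting the
record. This is only an illustration under assumptions: the KNOWN set, the
import_record function and the bibtex:journal property name are all
hypothetical and not part of any existing importer.

```python
# Core properties the first importer version understands (an assumption).
KNOWN = {"dc:title", "dc:creator", "dc:subject", "rss:link"}


def import_record(record):
    """Split a published record into imported and skipped properties.

    Unknown properties are not an error: they are listed so that support
    for them can be added later, once people actually publish them.
    """
    imported = {key: value for key, value in record.items() if key in KNOWN}
    skipped = sorted(key for key in record if key not in KNOWN)
    return imported, skipped


# Hypothetical record that mixes core properties with a BibTeX-style one.
record = {
    "dc:title": "A Semantic Web Primer",
    "dc:creator": "Jane Example",
    "bibtex:journal": "Example Journal",
}
imported, skipped = import_record(record)
print("imported:", imported)
print("skipped for now:", skipped)
```

Widening the core later then means adding entries to KNOWN, with no change
required from people who already publish data.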
> [As an aside, it was one of you guys, I think, who drew my attention to
> BibSonomy[3], which uses nice features to store bibliographical data as
> well; it is a pity that the bibtex they use is broken[2], otherwise we
> could have used it.]
>   
Yes, I am a bibsonomy user and I know the main developer personally.
Still, I think this is too experimental.


> - I was looking at DOAP; its description on [4] refers to "DOAP is a
> project to create an XML/RDF vocabulary to describe open source
> projects." I was wondering whether it would also be suitable to describe
> non-commercial projects, ie, where the 'open sourceness' is in DOAP.
> Sure, there are references to repositories and copyrights, but I presume
> it is all right to ignore those when we talk about commercial projects.
> To be checked, nevertheless...
>   
Good question, and it is good to read and check DOAP.
Many mojo++ to Bengee for checking.
> - Whether the core 'glue', binding all that together, should be SIOC, as
> Kingsley proposes, or something else, I am not sure. I must admit I am
> not familiar with all the details of SIOC in this sense. I am a little
> bit afraid (just like for RSS) to reuse something just because some of
> the properties and classes are around that are close to what we want,
> but it is not *really* meant for that. I know there is a fuzzy line
> there, and may not apply to SIOC (as I said, I am not sure about that
> one), but we should be careful about that.
>   
The point is: what do we want?
I will start a new thread with a new mail about that; please read on there.


> I am sure other issues will pop up...
>   

> Ivan
>
>
> [1] http://dublincore.org/documents/dcmi-type-vocabulary/
> [2] http://ivanherman.wordpress.com/2007/01/13/bibtex-in-rdf/
> [3] http://www.bibsonomy.org
> [4] http://usefulinc.com/doap/
>
>
> Leo Sauermann wrote:
>   
>> Hi Guys,
>>
>> perhaps read the wiki-page in parallel to this email thread.
>> DOAP, FOAF, etc are all mentioned there already,
>> http://esw.w3.org/topic/SweoIG/TaskForces/InfoGathering/DataVocabulary
>>
>> Benjamin, Ivan, you are free to edit the wiki page,
>> just change/adapt it so that it reflects your approach, please start
>> editing.
>> (no edits so far,
>> this is a wiki, free speech, last change wins, anything goes, like
>> wikipedia)
>>
>>
>> It came to pass that Benjamin Nowack, in due time on 26.02.2007 11:24,
>> wrote the following:
>>
>>     
>>> On 22.02.2007 19:55:52, Leo Sauermann wrote:
>>> [...]
>>>  
>>>
>>>       
>>>> I see two things to face, first:
>>>> Describing Information items as such, such as tools, websites, 
>>>> presentations, tutorials. This should be done using RSS 1.0, and in some 
>>>> cases when needed extended using DOAP, foaf, etc. This is pretty 
>>>> straightforward, please review and update this site until you agree:
>>>> http://esw.w3.org/topic/SweoIG/TaskForces/InfoGathering/DataVocabulary
>>>>    
>>>>
>>>>         
>>> Not sure about the RSS design decision, it pretty much restricts
>>> the resource types to documents, so we can't really use it as an
>>> "umbrella" spec. My 2 highly redundant cents:
>>> - I found DOAP to work fine for most things software, DCMI provides a 
>>>  number of handy resource type URIs[1] which could be used to augment
>>>  doap:Version resources (e.g. dctype:Collection, dctype:Dataset,
>>>  dctype:InteractiveResource, dctype:Service), or owl:Ontology for
>>>  projects that produce vocabularies (e.g. the FOAF project)
>>>  
>>>
>>>       
>> That was partly already there,
>> please edit the wiki page so that it reflects your exact ideas, but I
>> think the current version already is like you say here.
>>
>>
>>     
>>> - tags (skos:subject, or dc:subject) for more specific stuff (personal
>>>  preference: the more fine-grained skos options)
>>>  
>>>
>>>       
>> ok, one more for SKOS
>>
>>     
>>> - Danny's review vocab[2] for ratings/reviews
>>>  
>>>
>>>       
>> please add this to the wiki page!
>>
>>     
>>> - a combination of the two rdf/iCal specs[3][4] (with and without
>>>  timezone-datatyped timestamps) for events
>>>  
>>>
>>>       
>> They are rather buggy and it is not clear which one to use, but I would
>> go for the simpler one (the one without timezone-as-datatype).
>>
>>
>>
>> It came to pass that Danny Ayers, in due time on 22.02.2007 20:25,
>> wrote the following:
>>
>>     
>>> Quick thoughts: I see the motivation re. reuse, but rather than trying
>>> to use solely RSS 1.0 for the items, it might be better to use more
>>> precise terms where they exist, as_well_as the RSS terms, e.g.
>>>
>>> <http://example.org/doc> a rss:item; a foaf:Document .
>>>       
>> I also thought about this, but if you require all participants to
>> do that, it sucks.
>> Why should anyone annotate two types if one is enough? This is the
>> format we expect external data to be in;
>> inference should add the additional triples.
>>
>>     
>>> For the taxo stuff, SKOS sounds a very good idea generally, though I
>>> wouldn't be surprised if there were existing vocabs that could be used
>>> for things like "tutorial" etc.
>>> I'll cc Ian, he hangs around libraries...
>>>
>>> It might also be worth considering (perhaps redundantly again) the Tag
>>> Ontology at
>>> http://www.holygoat.co.uk/projects/tags/
>>>       
>> SKOS covers this and more, so I would rather use SKOS.
>>
>>     
>>> Cheers,
>>> Danny.
>>>
>>>
>>>
>>>       
>> -- 
>> ____________________________________________________
>> DI Leo Sauermann       http://www.dfki.de/~sauermann 
>>
>> Deutsches Forschungszentrum fuer 
>> Kuenstliche Intelligenz DFKI GmbH
>> Trippstadter Strasse 122
>> P.O. Box 2080           Fon:   +49 631 20575-116
>> D-67663 Kaiserslautern  Fax:   +49 631 20575-102
>> Germany                 Mail:  leo.sauermann@dfki.de
>>
>> Geschaeftsfuehrung:
>> Prof.Dr.Dr.h.c.mult. Wolfgang Wahlster (Vorsitzender)
>> Dr. Walter Olthoff
>> Vorsitzender des Aufsichtsrats:
>> Prof. Dr. h.c. Hans A. Aukes
>> Amtsgericht Kaiserslautern, HRB 2313
>> ____________________________________________________
>>
>>     
>
>   


Received on Friday, 2 March 2007 10:36:12 UTC