Re: data format for gathered information from Ivan Herman on 2007-03-01 (public-sweo-ig@w3.org from March 2007)

From: Ivan Herman <ivan@w3.org>
Date: Thu, 01 Mar 2007 09:30:55 +0100
To: Uldis Bojars <uldis.bojars@deri.org>
Cc: 'Leo Sauermann' <leo.sauermann@dfki.de>, 'Danny Ayers' <danny.ayers@gmail.com>, 'W3C SWEO IG' <public-sweo-ig@w3.org>, 'Kingsley Idehen' <kidehen@openlinksw.com>, 'Benjamin Nowack' <bnowack@appmosphere.com>, 'Ian Davis' <Ian.Davis@talis.com>
Message-ID: <45E68F3F.3090708@w3.org>
Hi Uldis,

as I said in my previous mail, I do not know the details of SIOC and it
also seems that it is an evolving spec. That is all good. If *you* feel
that it can play the role of a 'glue' (and even let the technology
evolve in this direction if needed), then I have absolutely no problem
with it!

Thanks!

Cheers

I.

Uldis Bojars wrote:
> Ivan,
> 
> SIOC as a framework can act as the 'glue'.
> I agree that if deciding to reuse an ontology we should use it for what it
> is meant for. Let me clarify some details about SIOC.
> 
> 1) It already uses FOAF and SKOS
> 
> SIOC re-uses FOAF to express information about persons and lets you use SKOS
> to describe categories and tags. The largest part of the data generated by a
> community site is about posts (as there are more posts than there are people
> and categories). These posts are expressed in SIOC, which thus already acts
> as a 'glue' between FOAF and SKOS.
> 
> Figure by John Breslin illustrating these relations:
> http://sioc-project.org/node/158
> 
> 2) Describing everything in RDF
> 
> People want to provide information and comments about real-world objects
> (Events, Videos, Books, Presentations, Wiki pages, CVs, ...) not just about
> forum/blog posts. People also want to be able to say that their posts
> contain or are about these real-world objects. This question was recently
> discussed by the SIOC community and a decision on how to do this within the
> SIOC framework will be made within the next 2 weeks.
> 
> SIOC was made to be generic and some of the objects (Blog posts, Mailing
> lists, Wiki pages) can be naturally expressed as a sioc:Post.
> 
> For other objects a sioc:Post itself is not a natural choice and there's no
> need to "stretch" it. That's why we are thinking about a generic class for
> these objects that will act as an "umbrella" for all kinds of things. It does
> not need to contain actual properties to describe these things - there are
> already ontologies out there to describe Projects, Books, etc. What we need
> is a way to talk about all these things [within sioc:Posts and in
> general] and a "crystallisation point" from which to point to the different
> ontologies to use. 
> 
> Some types of relations that we want to express:
>  - a Post contains an Object (e.g., a review)
>  - a Post is about an Object (e.g., a project)
>  - a Post is categorised as category/tag/topic X  (currently expressed with
> a sioc:topic and a URI which can [optionally] be a skos:Concept)
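
A minimal sketch of the three relation types listed above, written as plain (subject, predicate, object) triples and serialised as N-Triples. sioc:Post, sioc:topic and skos:Concept are existing terms; the ex:contains and ex:about properties, the ex: namespace and all example.org URIs are hypothetical placeholders, since the thread leaves the actual property names undecided.

```python
# Existing vocabularies.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SIOC = "http://rdfs.org/sioc/ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
# Hypothetical namespace, used only for illustration.
EX = "http://example.org/vocab#"

post = "http://example.org/posts/1"
review = "http://example.org/reviews/1"
project = "http://example.org/projects/foo"
tag = "http://example.org/tags/semweb"

triples = [
    (post, RDF + "type", SIOC + "Post"),
    (post, EX + "contains", review),      # hypothetical: a Post contains an Object
    (post, EX + "about", project),        # hypothetical: a Post is about an Object
    (post, SIOC + "topic", tag),          # existing property: categorisation
    (tag, RDF + "type", SKOS + "Concept"),  # optional typing of the topic URI
]


def to_ntriples(triples):
    """Serialise IRI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


print(to_ntriples(triples))
```

Only the two ex: properties would need to be replaced once a decision on the actual terms is made; the rest of the description stays the same.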
> 
> We have similar questions to solve, would probably come to similar
> conclusions and can benefit from learning from each other. In fact, the
> Semantic Web community is like any other community that wants to publish
> information and discussions about things.
> 
> If you have suggestions on how to model this information then please send them
> to the SIOC-Dev list [1]. Note that when talking about a generic "umbrella"
> class it does not really matter what namespace it is in as long as there is
> one. If there is an existing vocabulary we can reuse it.
> 
> 3) Community aspects of SIOC
> 
> Besides expressing information about things in general there are some
> community site related SIOC usage patterns that can be useful:
> 
> Discussions / comments about the information gathered can be expressed as a
> sioc:Post + its properties.
> The sioc:has_reply property is used to link a post to its replies and comments.
> That's where SIOC fits in naturally.
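
A small sketch of how sioc:has_reply links could be followed to reconstruct a discussion thread. The post URIs and the thread() helper are made up for illustration; only the sioc:has_reply property itself comes from SIOC.

```python
SIOC_HAS_REPLY = "http://rdfs.org/sioc/ns#has_reply"

# Hypothetical example data: post 1 has two replies, post 2 has one.
triples = [
    ("http://example.org/posts/1", SIOC_HAS_REPLY, "http://example.org/posts/2"),
    ("http://example.org/posts/1", SIOC_HAS_REPLY, "http://example.org/posts/3"),
    ("http://example.org/posts/2", SIOC_HAS_REPLY, "http://example.org/posts/4"),
]


def thread(root, triples):
    """Return root followed by every post reachable via sioc:has_reply (depth-first)."""
    replies = {}
    for s, p, o in triples:
        if p == SIOC_HAS_REPLY:
            replies.setdefault(s, []).append(o)

    ordered, stack, seen = [], [root], set()
    while stack:
        node = stack.pop()
        if node in seen:  # guard against cycles in the data
            continue
        seen.add(node)
        ordered.append(node)
        # Push in reverse so replies are visited in their original order.
        stack.extend(reversed(replies.get(node, [])))
    return ordered


for uri in thread("http://example.org/posts/1", triples):
    print(uri)
```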
> 
> sioc:Community is a recent addition to the ontology, introduced to describe a
> collection of different things belonging to a community. Basically, anything
> (website, mailing list, people) can be a part of it. It may be used to describe
> information about communities (a part of the gathered information) in cases
> where a community means more than a group of people.
> 
> This concludes the introduction, hope it helps to clarify some questions.
> SIOC is a live project and lessons learned from describing gathered
> information can also feed back into its development. Please feel free to
> send comments and ask any questions.
> 
> [1] http://groups.google.com/group/sioc-dev
> 
> Best,
> Uldis
> 
> [ http://captsolo.net/info/ ]
> 
> 
> -----Original Message-----
> From: public-sweo-ig-request@w3.org [mailto:public-sweo-ig-request@w3.org]
> On Behalf Of Ivan Herman
> Sent: Wednesday, February 28, 2007 12:17 PM
> To: Leo Sauermann
> Cc: Danny Ayers; W3C SWEO IG; Kingsley Idehen; Benjamin Nowack; Ian Davis
> Subject: Re: data format for gathered information
> 
> Leo,
> 
> it is a bit difficult to edit, because the page should reflect consensus...
> so I prefer to comment and discuss here.
> 
> - Using DOAP, SKOS, etc. is obviously the way to go. Actually, using
> SKOS is a great idea of yours!
> 
> - I am not sure about the usage of RSS. I have the feeling that it is a
> little bit of a misuse here. I wonder whether the full power of DC is not
> enough here; not only the core dc terms like dc:title and such that
> everybody knows, but also the dcterm vocabulary[1]. I have the impression
> that those, combined with maybe some extra properties of our own, may replace
> your choice of RSS. (To be checked.)
> 
> - For books and articles, I think we need something more structured, like
> BibTeX, in order to allow for, say, more scholarly usage. The problem is
> that it is not 100% obvious how to represent BibTeX in RDF; look at my
> recent blog and the comments[2]. We may have to bite the bullet and choose
> one (or modify one).
> 
> [As an aside, it was one of you guys, I think, who drew my attention to
> BibSonomy[3], which uses nice features to store bibliographical data as well;
> it is a pity that the BibTeX they use is broken[2], otherwise we could have
> used it.]
> 
> - I was looking at DOAP; its description at [4] says "DOAP is a project
> to create an XML/RDF vocabulary to describe open source projects." I was
> wondering whether it would also be suitable to describe commercial
> projects, i.e., where the 'open sourceness' is in DOAP.
> Sure, there are references to repositories and copyrights, but I presume it
> is all right to ignore those when we talk about commercial projects.
> To be checked, nevertheless...
> 
> - Whether the core 'glue', binding all that together, should be SIOC, as
> Kingsley proposes, or something else, I am not sure. I must admit I am not
> familiar with all the details of SIOC in this sense. I am a little bit
> afraid (just like for RSS) to reuse something just because some of the
> properties and classes are around that are close to what we want, but it is
> not *really* meant for that. I know there is a fuzzy line there, and this may
> not apply to SIOC (as I said, I am not sure about that one), but we should be
> careful about that.
> 
> I am sure other issues will pop up...
> 
> Ivan
> 
> 
> [1] http://dublincore.org/documents/dcmi-type-vocabulary/
> [2] http://ivanherman.wordpress.com/2007/01/13/bibtex-in-rdf/
> [3] http://www.bibsonomy.org
> [4] http://usefulinc.com/doap/
> 
> 
> Leo Sauermann wrote:
> 
>>Hi Guys,
>>
>>perhaps read the wiki-page in parallel to this email thread.
>>DOAP, FOAF, etc are all mentioned there already, 
>>http://esw.w3.org/topic/SweoIG/TaskForces/InfoGathering/DataVocabulary
>>
>>Benjamin, Ivan, you are free to edit the wiki page, just change/adapt 
>>it so that it reflects your approach, please start editing.
>>(no edits so far,
>>this is a wiki, free speech, last change wins, anything goes, like
>>wikipedia)
>>
>>
>>And it came to pass that Benjamin Nowack, at the appointed time 26.02.2007
>>11:24, wrote the following:
>>
>>
>>>On 22.02.2007 19:55:52, Leo Sauermann wrote:
>>>[...]
>>> 
>>>
>>>
>>>>I see two things to face, first:
>>>>Describing Information items as such, such as tools, websites, 
>>>>presentations, tutorials. This should be done using RSS 1.0, and in 
>>>>some cases when needed extended using DOAP, foaf, etc. This is pretty 
>>>>straightforward, please review and update this site until you agree:
>>>>http://esw.w3.org/topic/SweoIG/TaskForces/InfoGathering/DataVocabulary
>>>>   
>>>>
>>>
>>>Not sure about the RSS design decision, it pretty much restricts the 
>>>resource types to documents, so we can't really use it as an 
>>>"umbrella" spec. My 2 highly redundant cents:
>>>- I found DOAP to work fine for most things software, DCMI provides a
>>> number of handy resource type URIs[1] which could be used to augment
>>> doap:Version resources (e.g. dctype:Collection, dctype:Dataset,
>>> dctype:InteractiveResource, dctype:Service), or owl:Ontology for
>>> projects that produce vocabularies (e.g. the FOAF project)
>>> 
>>>
>>
>>That was partly already there,
>>please edit the wiki page so that it reflects your exact ideas, but I 
>>think the current version already is like you say here.
>>
>>
>>
>>>- tags (skos:subject, or dc:subject) for more specific stuff (personal
>>> preference: the more fine-grained skos options)
>>> 
>>>
>>
>>ok, one more for SKOS
>>
>>
>>>- Danny's review vocab[2] for ratings/reviews
>>> 
>>>
>>
>>please add this to the wiki page!
>>
>>
>>>- a combination of the two rdf/iCal specs[3][4] (with and without
>>> timezone-datatyped timestamps) for events
>>> 
>>>
>>
>>they are rather buggy and it is not clear which one to use, but I would go 
>>for the simpler one (the one without timezone-as-datatype).
>>
>>
>>
>>And it came to pass that Danny Ayers, at the appointed time 22.02.2007 20:25,
>>wrote the following:
>>
>>
>>>Quick thoughts: I see the motivation re. reuse, but rather than 
>>>trying to use solely RSS 1.0 for the items, it might be better to use 
>>>more precise terms where they exist, as_well_as the RSS terms, e.g.
>>>
>>><http://example.org/doc> a rss:item; a foaf:Document .
>>
>>I also thought about this, but if you require all participants to 
>>do that, it sucks.
>>Why should anyone annotate two types if one is enough? This is the 
>>format we expect external data to be in, inference should add the 
>>additional triples.
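
A rough sketch of the "inference should add the additional triples" idea: data is supplied with a single type (rss:item), and an RDFS-style subclass rule derives the second one (foaf:Document). The axiom that rss:item is a subclass of foaf:Document is an assumption made for this example and is not stated by either vocabulary; the infer_types() helper is likewise illustrative.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
RSS_ITEM = "http://purl.org/rss/1.0/item"
FOAF_DOCUMENT = "http://xmlns.com/foaf/0.1/Document"


def infer_types(data, schema):
    """Apply the RDFS subclass rule until no new rdf:type triples appear.

    Rule: (x rdf:type C) and (C rdfs:subClassOf D)  =>  (x rdf:type D)
    """
    inferred = set(data)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for c, q, d in schema:
                if q == SUBCLASS_OF and c == o and (s, RDF_TYPE, d) not in inferred:
                    inferred.add((s, RDF_TYPE, d))
                    changed = True
    return inferred


# What a participant publishes: one type only.
data = {("http://example.org/doc", RDF_TYPE, RSS_ITEM)}
# Assumed axiom, maintained centrally rather than by each participant.
schema = {(RSS_ITEM, SUBCLASS_OF, FOAF_DOCUMENT)}

for triple in sorted(infer_types(data, schema)):
    print(triple)
```

With this split, participants only state the simplest type, and whoever aggregates the data decides which extra types to derive.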
>>
>>
>>>For the taxo stuff, SKOS sounds a very good idea generally, though I 
>>>wouldn't be surprised if there were existing vocabs that could be 
>>>used for things like "tutorial" etc.
>>>I'll cc Ian, he hangs around libraries...
>>>
>>>It might also be worth considering (perhaps redundantly again) the 
>>>Tag Ontology at http://www.holygoat.co.uk/projects/tags/
>>
>>SKOS covers this and more, so would rather use skos.
>>
>>
>>>Cheers,
>>>Danny.
>>>
>>>
>>>
>>
>>
>>--
>>____________________________________________________
>>DI Leo Sauermann       http://www.dfki.de/~sauermann 
>>
>>Deutsches Forschungszentrum fuer
>>Kuenstliche Intelligenz DFKI GmbH
>>Trippstadter Strasse 122
>>P.O. Box 2080           Fon:   +49 631 20575-116
>>D-67663 Kaiserslautern  Fax:   +49 631 20575-102
>>Germany                 Mail:  leo.sauermann@dfki.de
>>
>>Management:
>>Prof.Dr.Dr.h.c.mult. Wolfgang Wahlster (Chairman), Dr. Walter Olthoff
>>Chairman of the Supervisory Board:
>>Prof. Dr. h.c. Hans A. Aukes
>>Amtsgericht Kaiserslautern, HRB 2313
>>____________________________________________________
>>
> 
> 

-- 

Ivan Herman, W3C Semantic Web Activity Lead
URL: http://www.w3.org/People/Ivan/
PGP Key: http://www.cwi.nl/%7Eivan/AboutMe/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Thursday, 1 March 2007 08:30:47 UTC