
Re: tweet2rdf vocabulary convergence

From: Benjamin Nowack <bnowack@semsol.com>
Date: Mon, 28 Sep 2009 13:46:32 +0200
To: Michael Hausenblas <michael.hausenblas@deri.org>
Cc: Semantic Web community <semantic-web@w3.org>
Message-ID: <PM-GA.20090928134632.9D69C.2.1D@semsol.com>

On 28.09.2009 11:41:06, Michael Hausenblas wrote:
[...]
>I'd be very interested to engage in this. Please let me know if you have
>concrete steps planned. Maybe first review existing vocabs to see what is
>covered already, today?
I think a lot is available already, so what we mainly need is probably
a "suggested vocabulary blend":
 * type
      * rss:item
      * sioc:MicroblogPost
 * all typical rss properties
      * rss:title
      * rss:description
      * content:encoded
      * dc:subject (hashtags)
      * dc:creator for the author name
 * author stuff
      * dct:creator for author resource (profile vs person issue?)
      * sioc:has_creator for author profile/account URL
      * sioc:avatar / atom:link
 * links
      * dct:references (short or expanded? probably the latter)
 * Semweb URIs
      * dct:subject
 * threads
      * sioc:reply_of
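
To make the blend above a bit more concrete, here is a rough Python sketch
that maps one tweet to (subject, predicate, object) tuples. The dict layout,
the example URLs, and the tweet_to_triples helper are made up for
illustration; the terms are kept as prefixed names exactly as listed above
rather than expanded to full URIs.

```python
# Sketch only: the input dict layout is hypothetical, not any real API format.

def tweet_to_triples(tweet):
    """Return (subject, predicate, object) tuples for one tweet."""
    s = tweet["url"]
    triples = [
        # type
        (s, "rdf:type", "rss:item"),
        (s, "rdf:type", "sioc:MicroblogPost"),
        # typical rss properties
        (s, "rss:title", tweet["text"]),
        (s, "rss:description", tweet["text"]),
        (s, "dc:creator", tweet["author_name"]),
        # author profile/account URL
        (s, "sioc:has_creator", tweet["author_account"]),
    ]
    # hashtags as plain-literal subjects
    for tag in tweet.get("hashtags", []):
        triples.append((s, "dc:subject", tag))
    # links (expanded URLs assumed here)
    for link in tweet.get("links", []):
        triples.append((s, "dct:references", link))
    # thread structure
    if tweet.get("in_reply_to"):
        triples.append((s, "sioc:reply_of", tweet["in_reply_to"]))
    return triples


example = {
    "url": "http://example.org/status/2",
    "text": "Converting tweets to RDF #semweb http://example.org/post",
    "author_name": "Example User",
    "author_account": "http://example.org/user/example",
    "hashtags": ["semweb"],
    "links": ["http://example.org/post"],
    "in_reply_to": "http://example.org/status/1",
}

for triple in tweet_to_triples(example):
    print(triple)
```

Keeping the output as plain tuples avoids tying the sketch to a particular
RDF library; a real converter would serialize these with full namespace URIs.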

What's missing (not sure which are really useful or possible):
 * type
      * ex:DirectMessage
      * ex:ReTweet
 * targetUser (@user)
 * mentioned user (... @user)
 * referred by ( ... via @user)
 * machine/triple tags?
 * derived stats?
      * number of re-tweets
      * number of url posts
 * client app (needed? there's twitter:source, but it contains HTML)
 * ratings (if so, which notation?)
 * group (identi.ca uses !group, but doesn't have rdf, I think)
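
Several of the missing bits above can at least be pulled out of the tweet
text with simple patterns. A minimal Python sketch, assuming the common
conventions (leading "@user" for the target, "via @user" for referrers,
"RT" prefix for re-tweets, "!group" for identi.ca groups); the function name
and the result keys are hypothetical, and the patterns are deliberately
naive:

```python
import re

# Sketch only: naive patterns for in-tweet structures, not a full parser.

def extract_structures(text):
    """Pull target user, mentions, vias, hashtags and groups out of a tweet."""
    target = None
    m = re.match(r"\s*@(\w+)", text)
    if m:
        # a tweet starting with @user is addressed to that user
        target = m.group(1)

    via = re.findall(r"\bvia\s+@(\w+)", text, flags=re.IGNORECASE)
    # (?<!\w) skips things like email addresses (name@host)
    all_users = re.findall(r"(?<!\w)@(\w+)", text)

    return {
        "retweet": bool(re.match(r"\s*RT\b", text)),
        "target_user": target,
        "via": via,
        # everything that is neither the target nor a "via" credit
        "mentioned": [u for u in all_users if u != target and u not in via],
        "hashtags": re.findall(r"(?<!\w)#(\w+)", text),
        "groups": re.findall(r"(?<!\w)!(\w+)", text),
    }


print(extract_structures("@alice have you seen #rdf tweets by @bob? (via @carol)"))
print(extract_structures("RT @dave: join !semweb #foaf"))
```

Each key would still need a shared term (the ex: placeholders above) before
the result could be emitted as RDF.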

Benji

--
Benjamin Nowack
http://bnode.org/
http://semsol.com/


>
>Cheers,
>      Michael
>
>-- 
>Dr. Michael Hausenblas
>LiDRC - Linked Data Research Centre
>DERI - Digital Enterprise Research Institute
>NUIG - National University of Ireland, Galway
>Ireland, Europe
>Tel. +353 91 495730
>http://linkeddata.deri.ie/
>http://sw-app.org/about.html
>
>
>
>> From: Benjamin Nowack <bnowack@semsol.com>
>> Organization: semsol.com
>> Reply-To: Benjamin Nowack <bnowack@semsol.com>
>> Date: Mon, 28 Sep 2009 11:35:53 +0200
>> To: Semantic Web community <semantic-web@w3.org>
>> Subject: tweet2rdf vocabulary convergence
>> Resent-From: Semantic Web community <semantic-web@w3.org>
>> Resent-Date: Mon, 28 Sep 2009 09:36:31 +0000
>> 
>> 
>> Hi,
>> 
>> Morton Swimmer suggested that there might be broader interest to talk
>> a bit about RDF extracted from tweets, so here we go:
>> 
>> There are multiple tools and services that convert twitter profiles
>> and contacts to RDF (e.g. semantictweet[1] or knowee), I think they all
>> mostly re-use stuff from FOAF and don't really need new terms.
>> 
>> But there are also tools that convert individual tweets to RDF
>> (I think Tom Morris had code. smesher is another example), or the
>> other way round (e.g. SMOB). Streams can nicely be grounded in RSS,
>> possibly with an additional sioc:MicroblogPost type, but what about
>> the semi-structured data? Should we try to create a shared vocab for
>> such in-tweet data (recipient, mentioned people, author-avatar/profile,
>> tags, machine tags, short urls, expanded urls, re-tweets, vias,
>> embedded Linked Data URIs, groups, DM, ...)?
>> 
>> I played a bit with in-tweet structures[2] a while ago, but
>> so far mainly made up app-specific terms. For a new project, I'm
>> extracting ratings and moods (via evolving patterns similar to
>> nanoformats [3], twitterdata[4], or simple word lists). I'm again
>> making up one-off terms here, too, and could surely benefit from a
>> more stable vocab.
>> 
>> Anyone interested in exploring this a little further? VoCamp near
>> Düsseldorf or Amsterdam, maybe? ;)
>> 
>> Cheers,
>> Benji
>> 
>> 
>> [1] http://semantictweet.com/
>> [2] http://www.smesher.org/media/2009/02/13/SMR_RDFExtractor.phps
>> [3] http://microformats.org/wiki/microblogging-nanoformats
>> [4] http://twitterdata.org/
>> 
>> --
>> Benjamin Nowack
>> http://bnode.org/
>> http://semsol.com/
>> 
>> 
>
Received on Monday, 28 September 2009 11:47:08 UTC
