RE: Fwd: [backstage] Muddy Boots + BBC Music Beta from Chris Sizemore on 2008-08-06 (public-lod@w3.org from August 2008)

From: Chris Sizemore <Chris.Sizemore@bbc.co.uk>
Date: Wed, 6 Aug 2008 19:39:56 +0100
To: "Kingsley Idehen" <kidehen@openlinksw.com>, "Yves Raimond" <yves.raimond@gmail.com>
Cc: <public-lod@w3.org>
Message-ID: <22E75701DF55CB459F5EC560C366846704BDE37A@bbcxue219.national.core.bbc.co.uk>
as i say, the Muddy Boots folks are very LOD-centric and dbpedia-sympathetic, so we should just mention these constructive criticisms to them and i'm sure they'll be taken seriously.

also to note is that their work is the result of a BBC commission that i was involved with, so i expect the BBC to make the links between News stories and dbpedia/musicbrainz available itself, eventually. also, the Muddyboots entity extraction source code will be available as open source, i do believe.

as rob mentions in his original email, this is proof-of-concept stuff, and linked data wasn't the main use case (though linked data should drop out of it naturally...)


best--

--cs


-----Original Message-----
From: public-lod-request@w3.org on behalf of Kingsley Idehen
Sent: Wed 8/6/2008 5:19 PM
To: Yves Raimond
Cc: public-lod@w3.org
Subject: Re: Fwd: [backstage] Muddy Boots + BBC Music Beta
 

Yves Raimond wrote:
> Hello!
>
> I thought this would be interesting for this list, as an example of a
> service using several LOD sources.
> However, it is a bit sad they don't expose the data they produce as
> linked data themselves - perhaps we should have a GPL-like license for
> LOD datasets "if you derive data from this linked data, it must be
> available as linked data" :-D (just joking).
>
> Cheers!
> y
>
>
> ---------- Forwarded message ----------
> From: robl <robl@monkeyhelper.com>
> Date: Wed, Aug 6, 2008 at 4:15 PM
> Subject: [backstage] Muddy Boots + BBC Music Beta
> To: backstage@lists.bbc.co.uk
>
>
> Hi,
>
> Over at Rattle [1] we got quite excited by the release of the new
> music beta site, in fact we created our own prototype that links our
> latest iteration of the Muddy Boots system [2] with the beta music
> site.
>
> We haven't announced the latest version of Muddy Boots yet, but in
> essence its main aim is to 'unambiguously identify the main actors in
> a BBC news story'. In doing this it uses DBpedia URIs to identify the
> entities involved. At the moment the system knows about 'people' and
> 'companies'; however, we've just added experimental support for 'bands'
> (!).
>
> As DBpedia knows about Musicbrainz GUIDs and Muddy Boots knows which
> BBC news stories relate to which DBpedia entry for a person|band we
> can add metadata about the related news stories about an artist ('Seen
> in these news stories' at the bottom of the page) :
>
> Coldplay : http://muddy.rattleresearch.com/muddy2/musicentities/cc197bad-dc9c-440d-a5b5-d52ba2e14234
>
> Kaiser Chiefs :
> http://muddy.rattleresearch.com/muddy2/musicentities/90218af4-4d58-4821-8d41-2ee295ebbe21
>
> and of course not forgetting the famous spoken word artist (!) :
>
> George W Bush :
> http://muddy.rattleresearch.com/muddy2/musicentities/06564917-bdd2-4fb6-bcdc-be9e0c04f7ac
>
> We'll be mentioning more about Muddy Boots in the future, but you can
> see we're starting to add semantic markup to BBC News stories and
> creating links between open data sources and BBC News (try clicking a
> news story and you'll see it has the 'actors' in the news story marked
> up with Microformats). It's still at the prototype stage at the moment
> and we're about to enter a formal validation and testing phase to
> measure the system's accuracy - but we thought we'd produce this
> prototype to demonstrate the kinds of things we can start to achieve
> when open data (or web-scale) identifiers are used to identify
> content.
>
> You can see all the music entities the system knows about by viewing :
> http://muddy.rattleresearch.com/muddy2/musicentities/
>
> The support for identifying bands is definitely considered
> 'experimental' at the moment, so you might see the occasional 'blip'
> with related stories as we look at how to classify 'bands' in stories
> more accurately. We've just started indexing BBC news stories in
> anger, so expect to see more related data appear over the next few
> days and weeks.
>
> We'd love to hear any comments you have about it :)
>
> Thanks,
>
> Rob
>
> [1] http://www.rattleresearch.com
> [2] http://muddyboots.rattleresearch.com/semantic-web-project/
>
>
>   
Yves,

I've been working on a set of best practices for Attribution by URI for 
LOD. The talks have been private (with a number of folks) for the last 
few weeks, but this post is poignant enough for me to change modes i.e., 
go public.

Yes, it is a shame that consumption of LOD data is already in full swing 
with original source URIs dislocated from the Linked Data Web value 
chain :-(

Bottom line, at the very least, we should collectively seek "Attribution 
by URI" for our data sets. Thus, if you consume 
<http://dbpedia.org/resource/Berlin> don't attribute in any of the 
following lossy forms:

1. "Thank you DBpedia......"
2.  http://dbpedia.org


What we want is:

1. http://dbpedia.org/resource/Berlin (the URI of the entity Berlin, the 
conduit to the data transmission that you consumed)

This can be done as follows:

1. RSS 1.0, RSS 2.0, Atom feed listing the data source names (URIs)
2. RDFa (expressing RSS 1.0)
3. If producing RDF, within the graph using terms from an Attribution 
ontology (we have an example that will be published)
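
As a rough illustration of option 3, here is a minimal Python sketch that emits attribution statements as N-Triples. It is only a sketch under stated assumptions: the Attribution ontology mentioned above is not shown in this message, so Dublin Core's dcterms:source stands in for it, and the function name and the example.org document URI are hypothetical.

```python
# Sketch: attribute consumed Linked Data by entity URI, not by site name.
# ASSUMPTION: dcterms:source is a stand-in predicate; the Attribution
# ontology referred to in the message is not shown there.
DCTERMS_SOURCE = "http://purl.org/dc/terms/source"


def attribution_triples(document_uri, source_uris):
    """Return N-Triples lines saying document_uri drew on each source URI."""
    lines = []
    for source in source_uris:
        lines.append("<%s> <%s> <%s> ." % (document_uri, DCTERMS_SOURCE, source))
    return lines


# Hypothetical consuming page; the Berlin URI is the one used in the message.
triples = attribution_triples(
    "http://example.org/articles/berlin-guide",
    ["http://dbpedia.org/resource/Berlin"],
)
print("\n".join(triples))
```

The point of the sketch is that the object of each statement is the full entity URI that was consumed, so the attribution stays dereferenceable.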


What these consumers need to be aware of is that HTTP is very powerful, 
meaning, it's not that difficult to dereference all the web resources 
that have consumed a URI but not attributed as requested by the 
publisher :-)

We know who's consuming our URIs (e.g. DBpedia) because they leave their 
trails!
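
A publisher-side check along these lines could be sketched roughly as follows in Python. It assumes the consuming page has already been fetched as text, and it only separates the lossy forms listed earlier (site name or site root) from attribution by the full entity URI. The function name and the three labels are hypothetical, not part of any existing tool.

```python
def classify_attribution(page_text, entity_uri, site_root, site_name):
    """Classify how a page attributes a consumed entity.

    Returns "by-uri" if the full entity URI appears, "lossy" if only the
    site root or site name appears, and "none" otherwise.
    Plain substring matching: a sketch, not a robust parser.
    """
    # Check the full URI first, because the site root is a prefix of it.
    if entity_uri in page_text:
        return "by-uri"
    lowered = page_text.lower()
    if site_root.lower() in lowered or site_name.lower() in lowered:
        return "lossy"
    return "none"


berlin = "http://dbpedia.org/resource/Berlin"
for text in (
    "Data from http://dbpedia.org/resource/Berlin",
    "Thank you DBpedia",
    "See http://dbpedia.org",
    "No credit given",
):
    print(classify_attribution(text, berlin, "http://dbpedia.org", "DBpedia"))
```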

The one missing piece of the Linked Data Web bootstrap has been an 
effective license scheme for publicly available Linked Data. I think this 
is about to change in a very big way.



-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com


Received on Wednesday, 6 August 2008 18:40:36 UTC