Re: imdb as linked open data? from Kingsley Idehen on 2008-04-03 (public-lod@w3.org from April 2008)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Thu, 03 Apr 2008 09:49:38 -0400
To: Richard Cyganiak <richard@cyganiak.de>
CC: Chris Sizemore <Chris.Sizemore@bbc.co.uk>, public-lod@w3.org, Silver Oliver <Silver.Oliver@bbc.co.uk>, Michael Smethurst <Michael.Smethurst@bbc.co.uk>
Message-ID: <47F4E072.1050209@openlinksw.com>
Richard Cyganiak wrote:
>
>
> On 2 Apr 2008, at 23:18, Chris Sizemore wrote:
>> 1) the licensing seems too restrictive for the purposes of this 
>> community, but has anyone taken the downloadable imdb data and tried 
>> to RDF-ize it? thoughts?
>>
>> http://www.imdb.com/interfaces
>> http://uk.imdb.com/help/show_leaf?usedatasoftware
>>
>> http://glinden.blogspot.com/2008/03/using-imdb-data-for-netflix-prize.html 
>>
>>
>> http://radar.oreilly.com/archives/2006/05/imdb-api.html
>>
>
> I did a partial RDFization of the IMDB data together with Bastian 
> Quilitz for a distributed querying demo (we meshed IMDB data with 
> local movie showtimes). Might dig out the code if someone needs it.
>
>> 2) switching focus a bit, could we/should we be using imdb URIs as 
>> identifiers for Movies, TV Programmes, and TV Programme Episodes, and 
>> (certain) people? i think we should, so, from the best LOD practice 
>> (given that imdb haven't yet pulled a dbpedia and provided 
>> concept/data URIs in addition to their document URLs), shouldn't i use:
>>
>> http://www.imdb.com/title/tt0088846/#thing (to represent the gilliam 
>> film Brazil in BBC RDF...)
>>
>
> From a SemWeb POV this is pretty useless since the URI doesn't resolve 
> to RDF data. Identifiers on the Web are only as good as the data they 
> point to. IMDB URIs point to high-quality web pages, but not to data.
>
> Also, don't squat other people's URI space. IMDB hasn't endorsed the 
> #thing URIs, and simply creating new URIs inside someone else's URI 
> space is considered a violation of the Web's social contract.
>
>> 3) what if i published a site that publicly made available RDF such as:
>>
>> http://www.imdb.com/name/nm0000187/#thing  owl:sameAs  
>> http://musicbrainz.org/artist/79239441-bfd5-4981-a70c-55c3f15c1287.html#thing 
>>
>>
>> or
>>
>> http://www.imdb.com/name/nm0000187/#thing  owl:sameAs  
>> http://zitgist.org/79239441-bfd5-4981-a70c-55c3f15c1287 (or whatever 
>> it is)
>>
>
> Better make your own identifiers. Implementation-wise it might make 
> sense (or not) to re-use their internal IDs (0088846) in your own 
> URIs, so you could have http://yourdomain/movies/0088846#thing .
>
> It's still a good idea to include a link to the IMDB page in your 
> data, e.g. using the foaf:page or foaf:isPrimaryTopicOf property, 
> which can be used to link together things (e.g. movies, people) and 
> web pages about them.
>
> So you could have (in N3 syntax):
>
> @prefix foaf: <http://xmlns.com/foaf/0.1/> .
> @prefix owl:  <http://www.w3.org/2002/07/owl#> .
>
> <http://yourdomain/people/0000187#thing>
>     a foaf:Person;
>     owl:sameAs <http://zitgist.org/79239441-bfd5-4981-a70c-55c3f15c1287>;
>     owl:sameAs <http://dbpedia.org/resource/Madonna_%28entertainer%29>;
>     foaf:isPrimaryTopicOf <http://www.imdb.com/name/nm0000187/>;
>     .
>
>> in other words, a set of RDF making equivalency statements about 
>> people from imdb across to other datasets like musicbrainz?
>>
>
> The problem is that people in IMDB don't have URIs, and only IMDB is 
> in a position to create them. IMDB only has URIs for web pages, so the 
> best you can do is say something about the IMDB pages, e.g. what their 
> topic is.
>
>> would this community find that useful?
>>
> I think it would be useful.
>
>> in other words, given the imdb licensing realities, are imdb URIs 
>> useful as identifiers even if we can't use the related data?
>>
>
> IMDB URIs are useful because they resolve to high-quality 
> human-readable web pages. This is valuable, because it's a very good 
> way of making clear what our own LOD URIs identify.
>
> The way I describe it above, I wouldn't call it "using IMDB URIs as 
> identifiers", but rather "annotating IMDB pages with links into the 
> Semantic Web".
>
>> are URIs useful in LOD on their own?
>>
>
> As I said, a URI is only as good as what it resolves to. IMDB URIs are 
> part of the old document Web, and only IMDB themselves can upgrade 
> them to the Semantic Web, because they control what comes back when 
> you request the URI.
Richard,

This is a classic example of a post that needs to morph from a mail 
message into a HOWTO for a specific problem scenario: minting URIs for 
3rd party Data Sources exposed via my Data Space. Of course, this mail 
will be picked up by crawlers and the like, but none will do it justice.

We should look towards an efficient mechanism for converting a mail 
thread into a Wiki article, and from there into a published item. 
I've thought about this process for a while and I will try to 
expose this capability via http://community.linkeddata.org/ods using 
MediaWiki. Each WikiWord will be associated with a URI, so the 
emerging Linked Data Graph will be visible to all Linked Data aware user 
agents.

Others: As we all have varying availability slots for writing and 
publishing guides, I would encourage others to take the commentary in 
this mailing list as fodder for tutorials and how-to guides in the Wiki 
instance I am about to initialize. All you have to do is start the 
process; others will chime in as they rotate through open slots in 
their working schedules. These actions alone will immensely enrich the 
emerging Linked Data Web :-)


Kingsley

>
> Best,
> Richard
>
>
>>
>>
>>
>> sorry for the ramble, but had a lot of imdb on my mind...
>>
>>
>>
>> all the best--
>>
>> --chris sizemore
>>
>>
>>
>>
>
>


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com
Received on Thursday, 3 April 2008 13:50:40 UTC
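
The pattern Richard describes in the thread above (mint identifiers in your own URI space, optionally reusing the IMDB numeric ID, and link back to the IMDB page with foaf:isPrimaryTopicOf) can be sketched roughly as follows. This is only an illustration, not code from the thread: the function names, the "movies"/"people" path segments, and the http://yourdomain/ base are assumptions taken from or modelled on the example URIs in the message, and the regular expression is a guess at the shape of IMDB page URLs rather than anything IMDB documents.

```python
import re

# Assumed shape of IMDB page URLs, e.g. http://www.imdb.com/title/tt0088846/
IMDB_PAGE = re.compile(
    r"^https?://(?:[a-z]+\.)?imdb\.com/(title/tt|name/nm)(\d+)/?$"
)

# Hypothetical layout of our own URI space (mirrors the thread's examples).
PATHS = {"title": "movies", "name": "people"}
# Only people get a type here; the thread names no class for films.
TYPES = {"name": "foaf:Person"}


def parse_imdb_page(url):
    """Return (kind, numeric_id) for an IMDB title or name page URL."""
    match = IMDB_PAGE.match(url)
    if match is None:
        raise ValueError(f"not a recognised IMDB page URL: {url}")
    kind = match.group(1).split("/")[0]  # "title" or "name"
    return kind, match.group(2)


def mint_uri(imdb_page_url, base="http://yourdomain/"):
    """Mint a URI in our own space that reuses the IMDB numeric ID."""
    kind, imdb_id = parse_imdb_page(imdb_page_url)
    return f"{base}{PATHS[kind]}/{imdb_id}#thing"


def describe(imdb_page_url, same_as=(), base="http://yourdomain/"):
    """Build a small Turtle description linking our URI to the IMDB page."""
    kind, _ = parse_imdb_page(imdb_page_url)
    statements = []
    if kind in TYPES:
        statements.append(f"    a {TYPES[kind]}")
    statements += [f"    owl:sameAs <{uri}>" for uri in same_as]
    # The IMDB URL identifies a web page, so we only say the page is about
    # our resource; we never coin new URIs inside imdb.com.
    statements.append(f"    foaf:isPrimaryTopicOf <{imdb_page_url}>")
    return "\n".join([
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "",
        f"<{mint_uri(imdb_page_url, base)}>",
        " ;\n".join(statements) + " .",
    ])


if __name__ == "__main__":
    print(describe(
        "http://www.imdb.com/name/nm0000187/",
        same_as=["http://dbpedia.org/resource/Madonna_%28entertainer%29"],
    ))
```

The design point is that the IMDB URL only ever appears as the object of foaf:isPrimaryTopicOf, so nothing is asserted about URIs that IMDB itself has not endorsed.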