RE: imdb as linked open data?

From: Chris Sizemore <Chris.Sizemore@bbc.co.uk>
Date: Fri, 4 Apr 2008 13:38:05 +0100
Message-ID: <22E75701DF55CB459F5EC560C366846704C15960@bbcxue219.national.core.bbc.co.uk>
To: <public-lod@w3.org>
Cc: "Michael Smethurst" <Michael.Smethurst@bbc.co.uk>, "Silver Oliver" <Silver.Oliver@bbc.co.uk>, <pepper@ontopia.net>
so, i was correct in thinking that imdb is interesting to the LOD
i agree that offering "what's a/the Sem Web business model?" is pretty
important in order to get buy in... does anyone have any contacts in and
around imdb?
***************** forgive the following if it's controversial -- i'm
honestly just trying to understand better ***********
however, on a more philosophical note, i DON'T think imdb necessarily
needs to explicitly opt into the Web of Data in order for the world at
large to find Sem Web value in that data... i suppose it would be very
desirable for imdb to officially provide Open Data/rdf of their content,
but i don't think that's the only way for the Sem Web to gain value from
basically, my premise is this: imdb is on the Web of Docs, and that's
good enough for the purpose of answering the question to be posed here
-- http://www.okkam.org/IRSW2008/ (the problem of identity and reference
on the Semantic Web is perhaps the single most important issue for
reaching a global scale. Initiatives like LinkedData, OntoWorld and the
large number of proposals aiming at using popular URLs (e.g.
Wikipedia's) as "canonical" URIs (especially for non informational
resources) show that a solution to this issue is very urgent and very
at this point in my indoctrination to LOD (i'm a long time semweb
fanboy, tho), i guess i disagree with: "From a SemWeb POV this
[ <http://www.imdb.com/title/tt0088846/#thing> ] is pretty useless since
the URI doesn't resolve to RDF data. Identifiers on the Web are only as
good as the data they point to. IMDB URIs point to high-quality web
pages, but not to data." -- clearly i understand the difference between
"data" and "web page" here, but i don't agree that it's so black and
white. i'd suggest: "Identifiers on the Web are only as good as the
clarity of what they point to..." i don't think there has to be RDF at
the other end to make a URI useful, in many cases...
at this point, for example at the BBC, my view is that identifiers and
equivalency relationships are more important than RDF... just barely
more important, granted... having a common set of identifiers, like
navigable stars in the sky over an ocean, is what we need most now, in
order to help us aggregate content across the org, and also link it up
to useful stuff outside our walled garden.
so, i'm one of those who feel that websites like imdb, wikipedia, and
musicbrainz provide great identifiers for non-information resources even
in their Web of Docs form. i know that most of you here will feel that
this is lazy, too informal, and naive of me. but my argument is that,
for sites like those i mention (not all websites, by any means) we may
as well, for the purposes of our day to day use cases, use their URLs as
if they were Sem Web URIs. on these sites, the distinction between
resource and representation (concept and doc about concept) is not
what's pertinent.
i'm aware that most on this list will make a religious distinction
but i think that, by convention, and in the contexts they'd actually be
used, we should treat them both as identifiers for the same concept, and
that they are essentially sameAs's *in common practice*...
in other words, as much as i love dbPedia and think it's a brilliant
step forward, i personally was fine with Wikipedia URLs as identifiers.
the incredible thing about dbpedia is the data mining to extract RDF,
not the URIs or content negotiation.
i KNOW that, technically, what i'm saying breaks all our rules -- and i
closely -- but philosophically i think there's something to what i'm
saying... if the Web is easy and the Sem Web hard, must we insist on
perfection? must we insist that imdb agree with us and explicitly opt
practically, tho, in an "official" LOD grammar sense, this works just
fine for me: 

<http://dbpedia.org/resource/Madonna_%28entertainer%29>
foaf:isPrimaryTopicOf <http://www.imdb.com/name/nm0000187/>

<http://dbpedia.org/resource/Madonna_%28entertainer%29>
foaf:isPrimaryTopicOf <http://en.wikipedia.org/wiki/Madonna_(entertainer)>

that seems useful and easy. to me, that's allowing a "sameAs"-like
relationship between Web of Docs URLs and SemWeb URIs... i could really
really run with that approach...
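
to make that concrete, here is a minimal sketch in plain Python (no RDF library) of how a consumer might use the two statements above: group Web of Docs URLs by the SemWeb URI they are the primary-topic page for, then ask whether two pages are about the same thing. the function names `pages_by_topic` and `same_topic` are hypothetical, made up for illustration, and the tuples are just a stand-in for a real triple store.

```python
from collections import defaultdict

# Full URI of the FOAF property used in the statements above.
FOAF_IS_PRIMARY_TOPIC_OF = "http://xmlns.com/foaf/0.1/isPrimaryTopicOf"

DBPEDIA_MADONNA = "http://dbpedia.org/resource/Madonna_%28entertainer%29"
IMDB_MADONNA = "http://www.imdb.com/name/nm0000187/"
WIKIPEDIA_MADONNA = "http://en.wikipedia.org/wiki/Madonna_(entertainer)"

# (subject, predicate, object) tuples: an assumed, simplified stand-in
# for a triple store holding the two statements.
TRIPLES = [
    (DBPEDIA_MADONNA, FOAF_IS_PRIMARY_TOPIC_OF, IMDB_MADONNA),
    (DBPEDIA_MADONNA, FOAF_IS_PRIMARY_TOPIC_OF, WIKIPEDIA_MADONNA),
]


def pages_by_topic(triples):
    """Map each thing URI to the set of document URLs primarily about it."""
    groups = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == FOAF_IS_PRIMARY_TOPIC_OF:
            groups[subject].add(obj)
    return dict(groups)


def same_topic(triples, page_a, page_b):
    """True if some thing has both pages as primary-topic documents."""
    return any(
        page_a in pages and page_b in pages
        for pages in pages_by_topic(triples).values()
    )


if __name__ == "__main__":
    print(same_topic(TRIPLES, IMDB_MADONNA, WIKIPEDIA_MADONNA))
```

note that this only says the two pages share a primary topic; it deliberately stops short of asserting that the pages themselves are identical.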


but now, to stir things up a bit...

given the above, thus:

<http://en.wikipedia.org/wiki/Madonna_(entertainer)> owl:sameAs
<http://www.imdb.com/name/nm0000187/>

right? right?  ;-)


From: public-lod-request@w3.org [mailto:public-lod-request@w3.org] On
Behalf Of Sergey Chernyshev
Sent: 03 April 2008 17:47
To: public-lod@w3.org
Subject: Re: imdb as linked open data?

Yes, it's exactly the thing I was thinking about - what is the business
model (or at least approach that can bring money) for content providers

1.	create data 
2.	release it under open (or not so open) license so other parties
can freely use it
3.	and spend money on RDFizing it

I think, until this is resolved, Semantic Web is not going to blossom
and go far beyond open data.

Publishers are fighting for attention because the current business model is
based on advertising (other models like micropayments, payment
propagation from ISPs to content providers and so on didn't work out).
That's why they are happy to give money and optimize their content to
Google standards for SEO purposes, but what will make them RDFize their

But in reality it's not all that bad - RSS showed that people are
interested in opening their content and adding structure to it if users
come back to their site to enjoy full experience. It's just a question
of what level of open data will those big (or not so big) publishers
open to public and at which point will users need to go back to their
site to see the ads. Or maybe see the ads within the consuming

In any case, I think it's a big question worth discussing, unfortunately
I didn't see any business-related sessions on LinkedData Planet.


On Thu, Apr 3, 2008 at 10:48 AM, Hugh Glaser <hg@ecs.soton.ac.uk> wrote:

	On 03/04/2008 12:41, "Kingsley Idehen" <kidehen@openlinksw.com>
	> Hugh Glaser wrote:
	>> Hugh
	> Hugh,
	> This is an example of many to come, where LOD needs to pitch the value
	> of Linked Data to Information Publishers :-) I think they will
	> ultimately publish and host their own RDF Linked Data once the
	> value is clear to them.
	And when there is also actual extrinsic value? :-)
	But yes, and making it easy for them, possibly by actually doing it for
	them, is part of the bootstrap process.
	The thing I am trying to work out is exactly how to make the pitch that
	fits with their business model, and where their profit line might come
	from.
	This requires a serious understanding of the detailed business model for
	the company in question (which is not necessarily a skill that an
	academic SW researcher has!).
	We also have similar LOD installations for CORDIS (the EU funding
	agencies' DB), NSF (a US funding agency), EPSRC (a UK funding agency),
	and ACM, among others. We have now engineered them so that they can be
	moved to Information Publisher if desired. Such organisations sometimes
	have it as part of their remit to publicise the results, so they should
	be easier to deal with, in theory.
	If anyone has a ready conduit to the appropriate place in such
	organisations, we would be delighted to talk with them, showing them
	what might be done.
	> --
	> Regards,
	> Kingsley Idehen       Weblog:
	> President & CEO
	> OpenLink Software     Web: http://www.openlinksw.com

Sergey Chernyshev

