Re: LDOW2009 Workshop now publishing Linked Data (was Re: Linked Data on the Web (LDOW2009) workshop papers online.)

From: Ted Thibodeau Jr <tthibodeau@openlinksw.com>
Date: Fri, 20 Mar 2009 16:38:42 -0400
To: Kingsley Idehen <kidehen@openlinksw.com>
Message-Id: <333609A0-7183-48CC-8159-AC2AFA752719@openlinksw.com>
Cc: Knud Hinnerk Möller <knud.moeller@deri.org>, giovanni.tummarello@deri.org, Tom Heath <Tom.Heath@talis.com>, public-lod@w3.org

** Knud Hinnerk Möller wrote:
 >> - not having to worry about serving the data after you have
 >> produced it

* Kingsley Idehen wrote:
 > Well I may be worried about "sweat and brow" amongst other things
 > and would like to be attributed via some kind of digital emblem
 > that serves such purposes i.e. URIs that are bound to a domain
 > I control.

A minor language tweak...

"Sweat and brow" should read "sweat of brow" here -- shorthand for
"by the sweat of one's brow," meaning "by one's effort, by one's
hard work" -- as Kingsley's point is about attribution.

If some entity invests time, effort, money, and other resources in
building a data-set, which has many URIs coined therein, and another
entity then simply absorbs that data and replaces the original URIs
with their own -- not with owl:sameAs, but with forward-chaining and
potentially then backward-deleting -- then the original entity loses
all credit for the "sweat of brow" that went into the original data
set's creation.

The original creator may not want monetary compensation -- perhaps all
they are concerned with is reputation -- but that is equally lost if
the URIs they coined are not preserved.
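
To make the difference concrete, here is a minimal sketch in plain Python (no RDF library), with triples modelled as tuples of URI strings. Everything under ted.example, vocab.example, and aggregator.example is a hypothetical placeholder, as are the function names; this only illustrates the two ways of absorbing a data-set, not any particular server's behaviour.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical data set: the subject URI is coined in the creator's own
# namespace (ted.example is a placeholder, not a real domain).
original = {
    ("http://ted.example/name/Bill_Clinton", RDF_TYPE,
     "http://vocab.example/role/President"),
}

# Hypothetical mapping from the creator's URIs to the absorbing party's URIs.
mapping = {
    "http://ted.example/name/Bill_Clinton": "http://aggregator.example/id/42",
}


def absorb_by_rewriting(triples, mapping):
    """Replace every mapped URI; the creator's URIs vanish from the result."""
    return {tuple(mapping.get(term, term) for term in triple)
            for triple in triples}


def absorb_with_same_as(triples, mapping):
    """Keep the original triples and add owl:sameAs links alongside them."""
    links = {(old, OWL_SAME_AS, new) for old, new in mapping.items()}
    return set(triples) | links


def mentions_namespace(triples, prefix):
    """True if any term in any triple starts with the given namespace prefix."""
    return any(term.startswith(prefix) for triple in triples for term in triple)


rewritten = absorb_by_rewriting(original, mapping)
linked = absorb_with_same_as(original, mapping)
```

After rewriting, nothing in the result points back at the creator's namespace; with owl:sameAs, the coined URI survives next to the new one, and so does the credit.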

Now, the data-set creators do need a namespace in which to coin their
URIs.  That might be through a service like purlz.net.  It might be
with a data-set base-URL, like dbpedia.org or bio2rdf.org.  It might
be through a vanity domain, ted.me (note -- this doesn't exist, but
it should if I use it for URI coinage).

The namespace *should* be in a real domain which the data-set creator
controls, or at least which they realize is going to get "credit" (or
blame) for the statements (triples) being made -- and the dog food
server *might* be one such, in the cases immediately being discussed,
where the URIs are being created explicitly so that data may be
hosted there.

But if I create a data-set that provides info about people with URIs
in the dog food server's namespace, and I want credit for that data,
I should mint my own URIs in my own namespace and sameAs-relate them
back to the dog food server's namespace.  And hopefully, when I then
release my data-set under CC:Attribution or similar, my URIs will be
preserved even when my data is integrated into the dog food or any
other server.

(And ... in best practice, I should release *2* data sets.  One should
be my statements about the people -- e.g., {<name:Bill%20Clinton> a
<role:President> .}.  The second should be my assertions of data-set
inter-linkage -- e.g., {<name:Bill%20Clinton> <owl:sameAs>
<http://dbpedia.org/resource/Bill_Clinton> .}.  Blending these two
into one data set is poor practice, as my descriptions of a person
may be accurate (Senator, NY, Democrat), but my sameAs may be wrong
(Hillary, not Bill).  It's easier to correct the interlinks -- or to
use someone else's set of interlink assertions -- when they're in
their own data-set.)
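
A rough sketch of why the split helps, again in plain Python with triples as tuples. The ted.example URIs, the variable names, and the combine() helper are assumptions made up for illustration, and the DBpedia URI used for Hillary is likewise assumed rather than checked.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Data set 1: my statements about the person (hypothetical URIs).
descriptions = frozenset({
    ("http://ted.example/name/Clinton", RDF_TYPE,
     "http://ted.example/role/Senator"),
})

# Data set 2: my interlink assertions, kept separate. This one is wrong:
# the description is of a Senator, but the link points at Bill.
my_links = frozenset({
    ("http://ted.example/name/Clinton", OWL_SAME_AS,
     "http://dbpedia.org/resource/Bill_Clinton"),
})

# A corrected (or third-party) interlink set; the target URI is assumed.
corrected_links = frozenset({
    ("http://ted.example/name/Clinton", OWL_SAME_AS,
     "http://dbpedia.org/resource/Hillary_Clinton"),
})


def combine(description_triples, link_triples):
    """Merge descriptions with whichever interlink set the consumer trusts."""
    return set(description_triples) | set(link_triples)


with_my_links = combine(descriptions, my_links)
with_fixed_links = combine(descriptions, corrected_links)
```

Swapping the interlink set fixes the bad sameAs without touching a single descriptive statement, which is the whole point of publishing the two separately.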

Be seeing you,


