- From: Eric Hellman <openurl@gmail.com>
- Date: Wed, 24 Jun 2009 14:41:35 -0400
- To: Leigh Dodds <leigh.dodds@talis.com>
- Cc: Kingsley Idehen <kidehen@openlinksw.com>, Ian Davis <lists@iandavis.com>, public-lod@w3.org
- Message-Id: <56C991B4-23B2-4899-BDBB-81F52182B3BD@gmail.com>
I'd like to step in here and add my 2 cents. My background in this is that of having started a company that produced a knowledgebase aggregation. We were a bit early to take advantage of most of the semantic web technologies, but we definitely made use of a lot of the early intellectual foundations.

I must agree with Kingsley that it's extremely important for the success of the Linked Data meme that we enable an "economy of attribution". I would argue, however, that the accidental weak attribution provided by URIs is a sad excuse for a real provenance infrastructure. After all, a site may have to assert a triple to be able to say it's false. Or it may need different types of attribution for different parts of its data space. We can do better, and I don't mean by reifying everything.

It's interesting that Freebase was brought up, because it saves an identifier and an origin with every tuple. Tying attribution (and licensing, for that matter) to globally identified tuples makes a lot more sense than attaching it to the entities themselves; after all, we want to make the entities reusable and not be producing superfluous sameAs's just to make your attribution economy work.

I must agree with Leigh that you really need concordance of intent with the legal facts of licenses; you can't use copyright to protect facts. However, it gets more complicated than that. In the US, it's also not possible to copyright collections of facts, which can in fact be copyrighted in Europe under the "Sweat of the Brow" doctrine.
http://en.wikipedia.org/wiki/Sweat_of_the_brow
So copyright protection (and thus licenses including GPL and CC) can be asserted on entire dataspaces, but that protection is invalid in the United States.

A statement of Ian Davis' caught my attention:
"Having datasets require attribution will negate one of the linked data web's greatest strengths: the simplicity of remixing and reusing data."
I would argue that when data loses attribution, it becomes impossible to judge the reliability of that data, and thus it loses most of its worth. The lack of strong attribution is the linked data web's greatest weakness. And while what he says is true, it doesn't HAVE TO be true, as there are multiple approaches to addressing attribution that don't impact remix or reuse simplicity.

I've blogged on some of these issues at http://go-to-hellman.blogspot.com/

Eric

On Jun 24, 2009, at 12:04 PM, Leigh Dodds wrote:

> 2009/6/24 Kingsley Idehen <kidehen@openlinksw.com>:
>> My comments are still fundamentally about my preference for CC-BY-SA. Hence
>> the transcopyright reference :-)
>
> Unfortunately your preference doesn't actually make it legally
> applicable to data and databases. The problem, as I see it, at the
> moment is that this is what the majority of people are doing: using a
> CC license to capture their desire or intent with respect to
> licensing, rights waivers, attribution, intended uses, etc. The
> disconnect is between what people want to do with the license, and
> what's actually supported in law.
>
>> I want Linked Data to have its GPL equivalent; a license scheme that:
>>
>> 1. protects the rights of data contributors;
>> 2. is easy to express;
>> 3. is easy to adhere to;
>> 4. is easy to enforce.
>
> Then the best way to do this is to engage with the communities that
> are attempting to do exactly that: the open data commons and creative
> commons. We shouldn't be encouraging people to do the wrong thing and
> use licenses and waivers that don't actually do what they want them to
> do. The science commons protocol is a good example of best practices
> w.r.t. data licensing that are being agreed to within a specific
> community; one that has a long-standing culture of citation and
> attribution.
>
> IMHO much of the advice and reasoning that has gone into the
> definition and publishing of the science commons protocol is
> applicable to the web of data as a whole. Convergence on a commons
> -- which can still support and encourage attribution through community
> norms -- is a Good Thing.
>
>> As I stated during one of the Semtech 2009 sessions, HTTP URIs provide a
>> closed loop re. the above. When you visit my data space you leave your
>> fingerprints in my HTTP logs. I can follow the log back to your resources to
>> see if you are conforming with my terms. I can compare the data in your
>> resource against mine and sniff out if you are attributing your data sources
>> (what you got from me) correctly.
>>
>> If all the major media companies grok the above, there will be far less
>> resistance to publishing linked data since they will actually have better
>> comprehension of its inherent virtues and positive impact on their bottom
>> line.
>
> I'm not sure that understanding the value of a unique URI for every
> resource, and the benefits of a larger surface area of their website,
> is the primary barrier to entry for those companies. One might build
> similar arguments around SEO and APIs. IMO, the understanding has to
> come through the network effects created by opening up the data for
> widest possible reuse. Clear and liberal licensing is a part of that.
>
> Cheers,
>
> L.
>
> --
> Leigh Dodds
> Programme Manager, Talis Platform
> Talis
> leigh.dodds@talis.com
> http://www.talis.com
>

Eric Hellman
President, Gluejar, Inc.
41 Watchung Plaza, #132
Montclair, NJ 07042
USA
eric@hellman.net
http://go-to-hellman.blogspot.com/
Received on Thursday, 25 June 2009 06:25:53 UTC