Re: [whatwg] Creative Commons Rights Expression Language from Ben Adida on 2008-08-26 (www-archive@w3.org from August 2008)

From: Ben Adida <ben@adida.net>
Date: Tue, 26 Aug 2008 08:54:51 -0700
To: "Bonner, Matt" <matt.bonner@hp.com>
CC: Kristof Zelechovski <giecrilj@stegny.2a.pl>, 'Julian Reschke' <julian.reschke@gmx.de>, 'Ian Hickson' <ian@hixie.ch>, 'Dan Brickley' <danbri@danbri.org>, "'Tab Atkins Jr.'" <jackalmage@gmail.com>, 'Henri Sivonen' <hsivonen@iki.fi>, "www-archive@w3.org" <www-archive@w3.org>
Message-ID: <48B4274B.1050106@adida.net>

Bonner, Matt wrote:
> Doesn't having the license info in multiple places contradict DRY?

Yes, it does. I agree that the in-media situation is sub-optimal,
though adding RDFa in the HTML doesn't hurt the situation too badly:
users are likely to provide licensing information in plain English in
the enclosing HTML anyway, and RDFa just makes that same information
machine-readable.

More importantly, the in-HTML effort -- RDFa -- can be dissociated from
this particular use case, because the data is already rendered in HTML.
Consider:

- craigslist listings
- contact information
- CC license information on blogs
- the music example from Manu

All of these have significant structured data that could be exploited in
many useful ways, if only tools could get at the structure.

Btw, I don't think I've linked to the RDFa Primer (although Ian should
be happy to hear that it's the first result when you google "rdfa" or
"RDFa"):

  http://www.w3.org/TR/xhtml-rdfa-primer/

I recommend the new Editors' Draft, which is only a slight tweak, but is
clearer on the issue of HTML vs. XHTML:

  http://www.w3.org/2006/07/SWD/RDFa/primer/20080813/

> It seems like (again, as I think Chris was saying) that each document 
> should be solely responsible for its own license information. Why repeat
> those data in a new page rather than simply have link to the original
> page like we've all been doing in HTML since the beginning?

So, take for example a Flickr photo listing page, with 20 photos on the
page. It would be quite useful if that page gave the license for each
photo, so you can very quickly tell which are usable for commercial
purposes, for example. (And you need to associate the licensing info
with the region on the page you care about, because the program can't
tell you which photos are good; only your eye can.)
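
As a rough illustration of the per-photo case, the sketch below pulls
one license per region out of a listing page. It is not a conformant
RDFa processor: it only tracks the nearest enclosing about="..."
attribute and rel="license" links. The sample markup, the /photos/
paths, and the URL-based commercial-use check are all assumptions made
for the example; a real tool would read the license description itself
rather than guess from the URL.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class LicenseCollector(HTMLParser):
    """Collect {subject: license URL} from rel="license" links."""

    def __init__(self, base_subject=""):
        super().__init__()
        self.subjects = [base_subject]  # stack of enclosing subjects
        self.licenses = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # An "about" attribute starts a new subject; otherwise inherit.
        subject = attrs.get("about") or self.subjects[-1]
        rel = (attrs.get("rel") or "").split()
        if "license" in rel and attrs.get("href"):
            self.licenses[subject] = attrs["href"]
        if tag not in VOID_TAGS:
            self.subjects.append(subject)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and len(self.subjects) > 1:
            self.subjects.pop()


def licenses_by_subject(html):
    collector = LicenseCollector()
    collector.feed(html)
    collector.close()
    return collector.licenses


def allows_commercial_use(license_url):
    # Assumption: URLs look like .../licenses/<code>/<version>/ and a
    # "nc" component in <code> marks a NonCommercial license.
    parts = license_url.rstrip("/").split("/")
    if "licenses" not in parts[:-1]:
        return False
    code = parts[parts.index("licenses") + 1]
    return "nc" not in code.split("-")


# Hypothetical two-photo listing page.
PAGE = """
<div about="/photos/1"><img src="/photos/1.jpg">
  <a rel="license"
     href="http://creativecommons.org/licenses/by/3.0/">CC BY</a></div>
<div about="/photos/2"><img src="/photos/2.jpg">
  <a rel="license"
     href="http://creativecommons.org/licenses/by-nc/3.0/">CC BY-NC</a></div>
"""

found = licenses_by_subject(PAGE)
commercial_ok = [s for s, url in found.items() if allows_commercial_use(url)]
```

With the licenses keyed by region, a browser extension or search tool
could highlight just the photos whose license fits the intended use.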

> But they need to understand IP law and the remixing rules to avoid 
> "copy/paste rot", no?

Ideally, no. A big goal of the metadata effort at Creative Commons is to
eventually automate this copy/paste stuff, by pointing out potential
license conflicts, etc. We're not there yet, but having the metadata
expressible in machine-readable form on the web is a clear first step.

> For example, what if I took 3 poems w/ 3 permissive CC licenses, and
> made a "new" poem that combined them by interleaving the stanzas?  I
> need to understand how to put or remix correct ccREL RDFa data around
> that.

And in a world with RDFa, a lot of that can be automated. We have a
"wizard" for creating RDFa for a single work, but it's not too hard to
build one for "remixed content" where you cite your sources, and voilà,
here's a chunk of HTML+RDFa that says the right thing.

Eventually, it can be created entirely automatically, if web authoring
tools support RDFa. But it doesn't have to be fully automated; partial
automation is very useful already.
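
To make the "remixed content" wizard a bit more concrete, here is a
minimal sketch of what its output step might look like. It is not the
existing Creative Commons wizard; the function name, the example.org
URLs, and the wording of the generated sentence are all hypothetical.
It uses rel="license" for the remix's own license and dc:source, the
ccREL property for the work something is derived from, for each cited
source. Choosing a license that is actually compatible with the sources
is left to the author here.

```python
from html import escape


def remix_markup(title, license_url, sources):
    """Return an HTML+RDFa fragment describing a remixed work.

    sources is a list of (url, title, author) tuples, one per work
    that was remixed.
    """
    lines = [
        '<div xmlns:cc="http://creativecommons.org/ns#"',
        '     xmlns:dc="http://purl.org/dc/elements/1.1/" about="">',
        f'  <span property="dc:title">{escape(title)}</span>',
        f'  is licensed under <a rel="license" href="{escape(license_url)}">',
        '  this license</a>. It is based on:',
        '  <ul>',
    ]
    for url, src_title, author in sources:
        lines.append(
            f'    <li><a rel="dc:source" href="{escape(url)}">'
            f'{escape(src_title)}</a> by {escape(author)}</li>'
        )
    lines += ['  </ul>', '</div>']
    return "\n".join(lines)


# Hypothetical sources for the interleaved-poem example.
SOURCES = [
    ("http://example.org/poems/a", "Poem A", "Author One"),
    ("http://example.org/poems/b", "Poem B", "Author Two"),
    ("http://example.org/poems/c", "Poem C", "Author Three"),
]

fragment = remix_markup(
    "Three & One",
    "http://creativecommons.org/licenses/by-sa/3.0/",
    SOURCES,
)
```

The author pastes the resulting fragment into the page; the visible
text and the machine-readable statements come from the same markup.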

> If another author takes that poem and interleaves a 4th, original
> poem in stanza by stanza, she needs to figure out whether to include 
> attribution for my interleaving, plus how to add/remix her ccREL data...
> Similar problem if I create a collage of CC photos, remix of 3 CC songs,
> etc. Copy/paste gets you nothing but trouble here.

You're absolutely right that copy-paste will always be complicated,
thanks to complicated copyright law. We're trying to build licenses that
make it easier, and tools that make it easier to understand and use the
licenses.

> Likewise, what if I mixed CC content w/ small amounts of non-CC content
> in a way I considered fair use?

Fair use is unlikely to be machine-decidable. But an automated tool
could easily prompt you: "some parts of this are not known to be
licensed, you should check that this is fair use."

We're not expecting full-on magic here, just helpful programs driven by
structured data.
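
A sketch of that kind of prompt, assuming the authoring tool already
knows, for each part of a work, whether a license was found in its
metadata. The (label, license) pairs and function names below are
hypothetical; nothing here tries to decide fair use.

```python
def unlicensed_parts(parts):
    """Return labels of parts with no known license.

    parts is a list of (label, license_url_or_None) pairs.
    """
    return [label for label, license_url in parts if not license_url]


def review_prompt(parts):
    """Return a warning for the author, or None if every part is licensed."""
    missing = unlicensed_parts(parts)
    if not missing:
        return None
    return (
        "Some parts of this are not known to be licensed ("
        + ", ".join(missing)
        + "); you should check that this is fair use."
    )


# Hypothetical collage: two CC-licensed parts and one of unknown status.
PARTS = [
    ("photo 1", "http://creativecommons.org/licenses/by/3.0/"),
    ("photo 2", "http://creativecommons.org/licenses/by-sa/3.0/"),
    ("news clipping", None),
]

message = review_prompt(PARTS)
```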

> Rather than answer here, where it only helps me understand, I would 
> propose clarification in the ccREL Submission on proper remixing of 
> ccREL blocks for mixed content.

We purposefully stayed away from this at this point, because of its
complexity and the fact that most folks are still thinking about what I
call "first generation CC content." We will certainly write something up
on "second+ generation CC content," and your prompting certainly upped
the priority on that task.

Thanks very much for this feedback! I hope my answers were helpful in
describing what we're hoping to accomplish.

-Ben

Received on Tuesday, 26 August 2008 15:55:30 UTC