[whatwg] Creative Commons Rights Expression Language from Ben Adida on 2008-08-28 (public-whatwg-archive@w3.org from August 2008)

From: Ben Adida <ben@adida.net>
Date: Thu, 28 Aug 2008 09:39:43 -0700
Message-ID: <48B6D4CF.2010403@adida.net>

Ian Hickson wrote:
>> Did you listen to the video? It clearly states that they wrote a 
>> specific hack for Craigslist, but that they expect this to work more 
>> generically.
> 
> Sure, I'm just debating "needs". It is possible to do it without 
> structured data, indeed the flagship example here doesn't have any.

The video clearly states that they have a site-specific hack for now,
and how it would be better if they could instead parse something like
microformats.

It sounds like you're saying "it's not already deployed everywhere, so
we don't need to deploy it."

We're trying to put together the pieces to make it more easily deployable!

> To scale to the whole Web, the only thing I can see working is the 
> computers understanding human language. I just don't see the whole Web 
> marking up their data using fine grained semantic markup. We have enough 
> trouble getting them to use <h1> and <p>.

As Paul said well, I don't think the feature needs to be used by
everyone, not even close. How many publishers will really know how to use
the browser SQL? I'd say in the end, the potential # of publishers is
lower for browser SQL, because you need serious tech chops to make that
work, whereas RDFa is as easy as copying and pasting a chunk of HTML
that someone (like CC) gives you into your web page.

(Total number of *end-users* will surely be higher for SQL, given the
reach of gmail and Google in general, but you keep referring to
difficulty for the *publisher*, so it's important to point out how
difficult it's going to be to get offline+browser-SQL working for the
average publisher, especially compared to markup like RDFa which
typically requires just modifying a JSP/ASP/etc... template.)
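
To make the "just modify a template" point concrete, here is a minimal
sketch of what such a template change might look like. Everything in it
is illustrative rather than taken from any real site: the function name,
the parameter names, and the choice of the dc:title and
cc:attributionName properties are assumptions, and the namespace
declarations a real page would need for those prefixes are omitted. The
only markup it relies on is rel="license" on a link, plus property
attributes on ordinary elements.

```python
from html import escape
from string import Template

# Hypothetical template fragment a publisher might paste into a page
# template. The property names are illustrative assumptions.
LICENSE_SNIPPET = Template(
    '<span property="dc:title">$title</span> by '
    '<span property="cc:attributionName">$author</span> is licensed under a '
    '<a rel="license" href="$license_url">$license_name</a>.'
)


def render_license_notice(title, author, license_url, license_name):
    """Fill the template, escaping values so they are safe inside HTML."""
    return LICENSE_SNIPPET.substitute(
        title=escape(title),
        author=escape(author),
        license_url=escape(license_url, quote=True),
        license_name=escape(license_name),
    )


if __name__ == "__main__":
    print(render_license_notice(
        "Tom & Jerry",
        "Alice",
        "http://creativecommons.org/licenses/by/3.0/",
        "CC BY 3.0",
    ))
```

The publisher-side change is confined to the template string; the rest
of the page pipeline stays as it was.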

> Examine the markup of this page (which I originally stumbled across a few 
> months ago, but which was updated just yesterday):
> 
>    http://puysl.com/view.htm

And by that reasoning, I think there are a lot of other HTML5 features
you need to kill, starting with browser SQL.

> Not everyone, no. Some, many even, will get the religion and mark up their 
> data in useful ways. But I don't see any evidence to suggest that a 
> critical mass will do so.

As I mentioned above, if you're talking about *publishers*, I think many
more will find RDFa useful before they find SQL-in-the-browser useful,
especially with client-side tools like Ubiquity.

> I absolutely see the value.

Okay, I think that's major progress: we agree that there's value :)

> I would absolutely love for the Semantic Web 
> vision to be the future. However, just because I want it to come true 
> doesn't mean it will come true.

How about letting it happen with a well-thought-out plan that tries to
grow semantics out of the existing Web, and seeing if it does succeed?
The cost is minimal, a number of publishers are interested, and the
tools are easy to build (9 implementations of RDFa parsers already, full
test suite, attribute-focused implementation, etc...)

> It fundamentally relies on humans acting 
> in a way that we _know_ they don't.

That's a false comparison. You're going back to the argument that there
is no incentive or feedback for users to produce structured data.
But I just gave you two very high-profile examples: Ubiquity and
SearchMonkey. Both of those provide strong user incentive to play in the
structured data space, as long as that space is generic enough for small
publishers to hook in. Same tool, many publishers.

> We can't just ignore 18 years

18 years where we didn't have well-thought-out metadata schemes for the
web, nor the client-side programmability of Firefox to stitch things
together. This is not the same old thing.

> I think (some hip) sites will totally plug in, just as they already have, 
> using site-specific scripts that can be downloaded by the users of those 
> sites. I think a few will use simple domain-specific fine grained markup 
> conventions (like Microformats); I think fewer still, possibly many but 
> likely not a critical mass, will use RDF and RDFa.

So you continue to confuse *publishers* and *end-users*. If you're
arguing that a small number of publishers means the feature shouldn't be
used, then you've got a number of features in HTML5 that need killing (SQL).

> This mirrors what happens today (e.g. GMail and other big sites have 
> contacts APIs, a small number of sites have hCard, a very few have FOAF).

What happens today is limited by what's allowed in HTML. Your argument
is circular. We'd like RDFa to validate so people can feel more
comfortable adding it to their production sites.

> I don't see that tools like Ubiquity give any incentive to use RDF. The 
> immediate reward from a hard-coded site-specific script is more effective 
> than the compound reward of writing a generic script (typically a harder 
> task), convincing at least one site to rewrite its markup to use a 
> suitable convention, and then debugging the script to work around the bugs 
> that that site has, even if one eventually convinces multiple sites to 
> support the same conventions.

I couldn't disagree more. You're expressing a "tightly coupled" view of
the web, where tools know exactly where to get the data they need, and
there is no opportunistic connection of data from disparate sources. I
think you're missing a big part of the potential of loosely coupled Web
applications.

Imagine Ubiquity in 2 years. If it's tightly coupled to individual
sites, meaning all Ubiquity scripts are built as site-specific screen
scrapers, then a small publisher doesn't get to play, because they have
to somehow convince users to install their site-specific script. DOA.

Now imagine Ubiquity with RDFa support. Ubiquity scripts look for RDFa
and extract it generically. Now, a small publisher gets to play simply
by producing the same structured data as the big guys who did get their
scripts installed. That's how RDFa enables Web-scale. The big players
will define the vocabs, the small players can hook in and extend the
vocabs if they need without breaking existing applications.
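
The "same tool, many publishers" idea can be sketched in a few lines.
What follows is a deliberately simplified illustration, not a conforming
RDFa parser and not how Ubiquity is implemented: the class and function
names are hypothetical, the subject of every statement is assumed to be
the page URL, and features such as about, prefix resolution, and nested
markup inside a property element are ignored. The point is only that the
extractor contains no site-specific knowledge.

```python
from html.parser import HTMLParser


class SimpleRDFaExtractor(HTMLParser):
    """Collects (subject, predicate, object) triples from rel/property
    attributes. Simplification: the subject is always the page URL."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.triples = []
        self._pending = None  # predicate waiting for its text content
        self._text = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        # A link with rel + href states a relation to another resource.
        if "rel" in a and "href" in a:
            self.triples.append((self.page_url, a["rel"], a["href"]))
        if "property" in a:
            if "content" in a:
                self.triples.append((self.page_url, a["property"], a["content"]))
            else:
                # The value is the element's text; collect it until the
                # next end tag (nested elements are not handled).
                self._pending = a["property"]
                self._text = []

    def handle_data(self, data):
        if self._pending is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._pending is not None:
            value = "".join(self._text).strip()
            self.triples.append((self.page_url, self._pending, value))
            self._pending = None


def extract(html, page_url):
    """Run the same generic extractor over any page."""
    parser = SimpleRDFaExtractor(page_url)
    parser.feed(html)
    parser.close()
    return parser.triples


def find_license(triples):
    """Return the first license URL found, or None."""
    for _subject, predicate, obj in triples:
        if predicate == "license":
            return obj
    return None


if __name__ == "__main__":
    # Hypothetical small-publisher page; no script was written for it.
    small_site = (
        '<div><h1 property="dc:title">My Blog Post</h1>'
        '<a rel="license" '
        'href="http://creativecommons.org/licenses/by-sa/3.0/">license</a></div>'
    )
    triples = extract(small_site, "http://small.example/post")
    print(triples)
    print(find_license(triples))
```

A site-specific scraper would instead hard-code where each site keeps
its title and license link, so every new publisher would need a new
script.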

So, is HTML5 built only for the big guys, or will it allow loosely
coupled metadata so the small guys can play, too?

> (Also, note that as much as things like Ubiquity are great for people like 
> us, they, like Quicksilver before it, and the Unix command line before 
> that, would totally confuse "regular" users. The concept of using a site 
> for a single task, and copying the output of that site into another site, 
> resonates with users in a way that "just trust us, if you tell the 
> computer what you want it'll do it" somehow doesn't. If power like 
> Ubiquity is the goal, we haven't yet found the UI for it.)

Maybe you're right. But if we don't enable these types of tools with
Web-scale structured data, we won't get to build the new interesting UIs.

HTML is a platform on top of which innovative apps can be built. With
RDFa, a new category of innovative apps can be built. Are they built
yet? Only as prototypes with site-specific scrapers. We're trying to
enable these tools with generic parsers so that new sites can drive new
apps which can drive new apps, ... loosely coupled data is hugely powerful.

-Ben
Received on Thursday, 28 August 2008 09:39:43 UTC