Re: Discussion with Ian and Henri about HTML5+RDFa (part 2/2)

+cc: Sam Ruby

On 26/1/09 02:55, Ben Adida wrote:


> Manu, Henri,
>
> I appreciate the effort of this discussion, though I disagree with much
> of Henri's points (as I have in the past). More importantly, I think a
> number of these suggestions would do significant damage to the effort of
> embedding semantics in HTML, and to at least one important web design
> principle.

Which one? (A link into the webarch spec would help here.)

> And *most* importantly: the time for finding compromise on
> issues of personal taste has come and gone.

HTML5 is still a moving target, so there is inevitably some wiggle-room
there as we define how (if at all) the RDFa work makes sense in that
context. For example, how best to supply metadata for
<audio src="spacemusic.mp3" autoplay="autoplay" loop="20000" ...> or
<video>, using the results from the new Media Annotations WG
(http://www.w3.org/2008/01/media-annotations-wg.html); how do we deal
with multiple occurrences of the <source> element (i.e.
http://www.w3.org/TR/html5/video.html#source) or its type attribute, or
do we just look at @src as in the XHTML binding?


BTW what you dismiss as 'personal taste' is what some in the 
WHATWG/HTML5 scene consider to be 'editorial judgement'. Can you find 
language for continuing this conversation that sets that distinction 
aside for now?

> I'm bothered by this desire to redesign based on little evidence. The
> idea of specifically *not* allowing follow-your-nose flies in the face
> of much of W3C's work and the recent TAG publication on the
> self-describing web. High load on a W3C web server (due to poor
> implementations) is not evidence enough to undo a major design principle
> of web architecture.

Henri clarified this point in response to Manu:

Manu: "The example of the W3C serving up many, many gigabytes of the 
same HTML4.01 DTD every day was cited as an example of what happens when 
your "vocabulary" becomes popular."

Henri: "(To be precise, this is more of an issue with the XML DTDs than 
the HTML ones, because there are actual DTD-loading XML parsers out 
there.)  It isn't only about the ability of the vocabulary server to 
serve a lot of data. It's a problem when communication between two 
parties on the network is reliant on a third party keeping a service 
running."

On the follow-your-nose aspect, the proposal/profile I was exploring 
with Henri, and which his http://validator.nu/ tool now checks, doesn't 
abandon this idea. It just avoids shortcutting of URIs into short part / 
long part pairs. Partly (from my p.o.v.) because this verbose form is 
more robust under copy/paste, partly as a consensus-engineering hack to
see if there is *any* profile of RDFa that both parties can stomach, but 
mostly to see if it can be made to work at all.
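
To make the copy/paste point concrete, here is a minimal Python sketch of how an abbreviated (short part / long part) name depends on a prefix declaration that may not travel with the pasted markup, whereas a long-form URI carries everything it needs. The expand() helper, the prefix table and the choice of the Creative Commons namespace are illustrative assumptions, not the RDFa processing rules.

```python
def expand(name, prefixes):
    """Expand 'prefix:local' using the declared prefixes (hypothetical helper)."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    # No declaration in scope: the abbreviation cannot be resolved.
    return None

# Prefix declared on the original page (illustrative binding).
page_prefixes = {"cc": "http://creativecommons.org/ns#"}

# The abbreviated form resolves only where the declaration is present.
in_page = expand("cc:license", page_prefixes)
pasted_elsewhere = expand("cc:license", {})

# The long form is the same string wherever it is pasted.
long_form = "http://creativecommons.org/ns#license"
```

Here in_page equals long_form, while pasted_elsewhere is None because the declaration was left behind.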

> (I can certainly agree with issuing some implementation guidelines that
> say "don't de-reference unless you need to.")

(Yup. Also, digitally signing namespace documents provides some modest
insurance against domain name loss / compromise. This signing theme is
getting some attention via the widgets/webapps effort; it's worth
keeping an eye on that.)

> On the issue of cut-and-paste: Creative Commons is, to my knowledge, the
> biggest publisher of RDFa, and we haven't had much trouble getting users
> to copy and paste proper RDFa. It's also been no problem getting folks
> to add more complex ideas, like attribution name and URL (in fact, many
> are pressing us to add more to our vocabulary, and we're being very
> careful to do that only after serious consideration.)

That's great. When I talked with Ian he was asking how many RDFa use 
cases were in the 'copy and paste something I don't understand' area 
(akin to .js widgets), versus copy/paste but edit and tweak, vs hand 
author etc. It is very good to have implementor feedback. Have you done 
any statistics to see what proportion of CC RDFa is still a sensible RDF 
graph, how often it is customised/tweaked and so on?

> The use case where someone copies and pastes partial HTML+RDFa from
> someone's existing web site and gets upset doesn't ring true: the same
> "problem" occurs in a much worse way with CSS, and no one seems to be
> too upset about that. In addition, in a lot of cases where it *would*
> make sense to copy and paste a chunk of HTML from one site to another
> (Creative Commons, widgets, etc...), the prefixes are declared in the
> same block of markup anyways.

(CSS has built-in graceful degradation. JavaScript might be a better
example; if you copy someone's .js without the link to libraries, you
lose all functionality.)

> It appears it mostly comes down to:
>
>> Henri was certainly sympathetic to embedding semantics in HTML for
>> everyone that needed the functionality (not just the 80% that
>> Microformats addresses) in HTML. He believes that removing CURIEs would
>> go a long way towards addressing his concern with the way RDFa is
>> currently implemented.
>
> Removing CURIEs is not an option at this point, given the existing
> standard, the existing deployment (by folks including Yahoo),
> backwards-incompatibility, and the lack of evidence for needing such a
> change at this point.

Well at the moment, neither CURIEs nor RDFa work in HTML5. A 
no-namespaces / no-CURIEs profile of RDFa *does* work in XHTML+RDFa 
today, and this imho is reasonable: nobody can force me to use 
abbreviations for URIs when I want to use URIs directly. However I need
to use the xmlns:http="http" hack right now. I'd like to have a better 
sense for why this can't be made to go away.
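
A small Python sketch of why the hack is needed, assuming a processor that reads every such attribute value as prefix:reference. The expand_curie() function is hypothetical, and binding the prefix "http" to the string "http:" (with the trailing colon) is an assumption about what makes the round trip work; it is not quoted from any spec.

```python
def expand_curie(value, prefixes):
    """Split at the first colon and prepend the bound namespace (hypothetical)."""
    prefix, sep, reference = value.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"undeclared prefix in {value!r}")
    return prefixes[prefix] + reference

full_uri = "http://xmlns.com/foaf/0.1/name"

# Assumed binding: the prefix 'http' maps to the scheme plus its colon.
hack = {"http": "http:"}

# 'http' is read as a prefix, '//xmlns.com/foaf/0.1/name' as the reference,
# and the expansion reproduces the URI exactly as it was written.
round_tripped = expand_curie(full_uri, hack)

# Without the binding, the same processor rejects the full URI.
try:
    expand_curie(full_uri, {})
    rejected = False
except KeyError:
    rejected = True
```

Under these assumptions the declaration does no real work: it exists only so that a full URI survives being parsed as an abbreviation.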

> If we were only a few months into designing RDFa with no implementations
> or deployments, this discussion would make sense, as would some attempt
> at finding a compromise based on personal taste.

I share your frustration, but I think it is worth exploring a compromise 
subset, at least as an awkward starting point.

> But at this point, one has to present significant evidence of harm to
> undo what otherwise seems to be working just fine.

XHTML+RDFa can continue to work fine.

We're talking about HTML5+RDFa here, which right now is not working fine 
as a carrier for the kind of rich, structured, URI-disambiguated 
metadata we care about in RDFa circles.

IMHO the best way of fixing this situation is to explore common ground, 
and once we've identified some (my candidate: RDFa with long-form URIs 
instead of abbreviations), explore the costs and benefits of variations 
on that theme in terms of evidence. Perhaps stats from CC RDFa 
deployment, but also in terms of relevant stakeholders / use cases (I 
suggest we use the Semantic Web Lifesciences community, perhaps 
eGovernment too). And of course, Creative Commons are a major end-user
too. Your observations are those of a group serving real needs; RDFa at 
CC is not a hammer looking for a nail. I hope to hear some 
acknowledgement of that from a few more HTML5 enthusiasts sometime.

This needn't be a gigantic project, but I suspect that having publishers 
& consumers of healthcare / lifesci HTML say "actually, we do need URI 
abbreviations in HTML metadata" is something that the HTML5 groups 
(WHATWG & W3C) ought to take seriously. The W3C HTML group has a new 
co-chair, Sam Ruby (cc:'d). I trust Sam to make sure such proposals get 
a fair hearing in the W3C part of the HTML5 world.

> -Ben
>
> PS: Note that I do agree on DOM consistency, and I suspect @prefix will
> fix that issue, as Manu mentioned. I've mentioned this to Henri in prior
> conversations, I believe.

There we go, collaboration and consensus :) Can you point HTML5 people 
at the relevant test cases around @prefix? Earlier you were sounding 
like everything is finished and frozen, but as I understand it the 
@prefix design is still being worked out. You just don't want to be 
starting over again with a complete redesign, which is understandable. 
I'm sure the HTML5 folk feel the same about their work.

Can you outline the plan for @prefix (dates, charters, goals etc.)?
What's the roadmap for speccing it?

cheers,

Dan

--
http://danbri.org/

Received on Monday, 26 January 2009 09:36:33 UTC