- From: Ben Adida <ben@adida.net>
- Date: Fri, 22 Aug 2008 12:53:49 -0700
[Response to Ian and Henri in one email... but then I saw the other responses and am breaking out the remaining responses into separate emails.]

Ian Hickson wrote:
> I've whitelisted your e-mail address so that you can post to the WHATWG
> list without subscribing.

Thanks Ian, I think I unsubscribed a while back when I was busy with other things, but really I should subscribe at this point, there's no reason for me to have special status.

> However, if the e-mails on this thread were
> intended to be a request that the RDFa attributes be considered for HTML5,
> I must admit to having misunderstood the request.

Though I do think you should consider RDFa attributes in HTML5, I didn't mean to start this thread just yet (we're in the middle of our transition to Proposed Rec at W3C for RDFa in XHTML 1.1). I believe it started when Matt mentioned ccREL and wondered what it would take to support it in HTML5. So, heck, not ideal timing on my end, but since the discussion has begun, let's go for it :)

> I've addressed RDFa only in this message (as opposed to creative commons
> markup). It would be helpful if you could send a separate message that is
> specifically asking for the changes you desire

Perfectly reasonable: we'll put together a precise proposal regarding: (1) what would need to validate, (2) what would browsers be expected to do, and (3) why we think this is useful.

> That's weird. I wonder why these people are asking Creative Commons for
> these tools and not asking other communities (e.g. the WHATWG community).

My guess is that people prefer to ask their own, smaller, community for "best practices" and solutions, rather than try to find the underlying standard that is the limiting factor. That's one of the reasons we believe in RDFa: the underlying standard implements a small handful of attributes, and individual communities get to manage their own extensions by building, reusing, and extending vocabularies.

> Usually I find that when there is a need, the people with that need
> approach multiple different groups trying to get their need met.

I'm not sure that applies for folks who are a bit less technical: they wouldn't approach WHATWG when all they're thinking about is CC + genomics, for example.

> I'm also curious as to why, if this is so commonly requested, similar
> features such as hCard and hCalendar have seen limited uptake.

I suspect one of the reasons is insufficient tools to make use of hCard and hCalendar in ways that are not already served by publishers adding a vcard or an ical link. And I would venture to say that this is because (1) tools have to be custom-built for every microformat, because the syntax varies, and (2) there's no reusability of data fields across microformats, which means having each one as a separate XML/CSV download is just fine.

Also, there's no doubt that the data web will remain *much* smaller than the human web for a while. That doesn't mean it's irrelevant. Even 0.1% of the web is very big and potentially very useful.

> Indeed, we have design principles that make addressing the needs of
> small communities an explicit non-goal.

How about adding one feature that will help make many small communities happy, each in their own way? That's the power of RDF, and the idea behind RDFa is to enable that distributed innovation within HTML.

> But I haven't seen the level of interest that, say, video or
> offline Web applications have had. I haven't even seen the level of
> interest that random HTML elements like <abbr> have received.

Not really comparable. We're trying to enable lots of applications in the long run, while video and offline are obviously very immediate short-term features. Also, <abbr> is an existing element, so sure if you try to kill existing stuff, you're going to get vocal protests.

> The interest in technologies like RDF seems to be almost exclusively from people in the
> metadata processing space.

Until we have the syntax and then the tools that build on that syntax to make this more useful to end-users, that statement will remain true, indeed. But we have to have some foresight into what cool applications could be built if we just enable a few things, especially given the interest we're already seeing. Tapping into the power of RDF from within HTML is, in my opinion, one of those enabling approaches.

> Use a unique name, e.g. include a domain name in the name, as in
> "license.creativecommons.org" or "home.foaf.w3.org", or use a name you
> know isn't used because it's an unusual name, e.g. "cc:license".

That doesn't scale (unless you expect people to actually use GUIDs with timestamps), and it's extremely web-unfriendly, since you can't look up a concept to figure out what it might mean. The RDF folks figured out how to do this a while ago. Why not tap into that expertise a bit?

> I honestly don't see significant interest in computer-readable metadata.

But a lot of folks do, and it would cost HTML5 very little to let us all co-exist happily :)

> But in any case HTML5 already has extension mechanisms, so the discussion
> should not be over whether RDFa is worth it or not, the discussion should
> be over what extension mechanisms RDF needs that HTML5 doesn't provide.

Some problems with existing extension mechanisms:
- no way to make statements about another document (a PDF), etc... in a way that is *consistent* across different data types.
- no way to relate two chunks of data within a page, e.g. my friend Alice is the second cousin of my friend Bob.
- no way to build reusable vocabularies.

> The failures of the past have had little to do with the syntax or
> expression mechanisms. They have to do with users simply not caring.

They don't care because there are no useful tools for them to care about, because the tools are too difficult to write when you don't have a standard syntax that's generic enough.

>>> With things like licensing metadata, where the person who benefits the
>>> most isn't the person who writes the data, users simply aren't going
>>> to bother doing a good job.
>> That's an incorrect assumption.
>
> It's a verifiable fact! Just look at metadata like lang="", character
> encoding information, Content-Type headers, etc. It's so unreliable that
> any serious system that processes large amounts of data from multiple Web
> authors always ends up ignoring the metadata (or at best using it as a
> hint) and using heuristics to determine the real information.

Your assumption is untrue when you get to the Creative Commons community, where lots of organizations and folks care about stating how to give them attribution. And certainly we're not the only area where people want ways to express this data (I'll mention the UK National Archives again, and folks like Manu Sporny working on audio markup.)

HTML5 should be able to serve smaller communities than "the whole web." We're asking for a solution that is relevant to *lots* of small communities, each in their own way.

> But as soon as this kind of thing is applied to people outside the
> tight-knit community, the metadata becomes an utter mess, misused, wrong,
> missing, syntactically incorrect, semantically incorrect, unusable. We
> have shown time and time again that when metadata mechanisms face the
> wider Web community, they fail. Ignoring this doesn't make it go away.

You're looking at this in a fundamentally broken way. We don't need it to be perfect, and we don't need everyone to do the right thing. We just need to *enable* people to do the right thing. And we don't need the whole web to do it, either.

In other words, maybe this will never show up on Google's radar for the web as a whole. But it would be a mistake to conclude that it's not useful to a large number of folks.

Do you think that everyone will use the Progress Bar the way you intend them to? No, of course not.
But the ones who do use it in the proper way will get the benefits. Same goes for RDFa.

> Note: I did read the ccREL paper before I wrote the previous message.

Thanks for taking the time to do that, I sincerely appreciate it. I'm confused as to why you simplified our goal to "making a license statement", but I'm glad you read the paper :)

Henri Sivonen writes:
> It really isn't HTML5-friendly, since it depends on the namespace mapping context at a node.

Well, we can discuss that part. But that's 10% of the syntax. The rest is all simple attributes with clear meaning. No change to the elements, no change to the structure of the HTML document. (And HTML already ignores extra attributes.) That's pretty close to HTML5-friendly, I think.

Regarding the long discussion of "XML Namespaces": we don't use XML namespaces. We use CURIEs = Compact URIs. We've chosen to bind them to xmlns for now, but they are *not* XML namespaces.

I disagree with you strongly on indirection. I don't think this sub-discussion is all that productive, though.

> If Hixie made a proposal about HTML syntax citing Google's needs, but
> there was something else going on at Google making the syntax moot, I
> think it would be relevant. (I guess metadata aiding
> translate.google.com is the recent example.)

You're claiming that because one of our videos doesn't contain the URL in its actual content (though it does in its surrounding HTML, which is all that's needed), then we're contradicting ourselves? That's silly.

I speak for CC in terms of metadata. Let me know where we are inconsistent, and I'll be sure to fix it. So far, though, you're making some incorrect assumptions.

> This doesn't allow you to say things about *another* resource, but
> that's OK, because out-of-band metadata and data often travel their
> separate ways.

It's not okay for us. There are no good ways to embed metadata in media files that the average user can understand. So we need it in the enclosing HTML.
With our approach, someone can take a chunk of HTML we give them, and paste it right in their page. We need that chunk of HTML to carry metadata with it.

> For example, in PDF, do people *really* need all this cruft:

People don't need it, machines do. You keep confusing the two, as if we're asking people to manually write out the chunk of RDF/XML to which you refer. That's highly misleading.

Machines need this so the PDF can be traced to an origin URL that reverse-references it for consistency and some trust in where it's coming from. Then, the license statement can be automatically parsed by a tool that can then tell you "hey, make sure to give credit to 'Henri'."

> Copyright is hard. Sprinkling URIs and angle brackets doesn't make
> people grok copyright. RDF adds even more hardness that normal people
> don't grok.

Misleading and irrelevant. We're not trying to simplify copyright with RDF. We're trying to simplify copyright with programs that can help users through the process of licensing their content, reusing content, etc... Our programs use RDF to do so. People don't need to grok RDF, however.

> I think trying to break complex licenses [...]

Appreciate the feedback, but that's irrelevant to this conversation. We've got a whole bunch of lawyers and techies who've chosen a direction for how to help people with copyright, and we think overall we're doing a reasonable job. We're happy to get your feedback in a more appropriate forum, of course. However, I don't think a technical standards group should be discussing business model / marketing issues as part of its evaluation process.

> If RDFa is considered immutable at this point, I guess HTML5 is put
> in a "take it or leave it" situation. :-/ I'd choose leaving it if
> taking it comes with the qnames-in-content and Namespaces in XML
> baggage.

RDFa in XHTML 1.1 is immutable.
RDFa in HTML5 is not immutable, though it would make little sense to change @property to, e.g., @data-property (and it would make implementation that much harder across versions of HTML). It might make sense to consider an additional alternative to @xmlns, which is something we're considering for non-XML HTML. If we had an attribute-value-only way of defining prefixes, would that make you happier?

-Ben
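
To make the CURIE point concrete: a CURIE such as cc:license is just a prefix looked up in a table and concatenated with the rest of the name, which gives a URI someone can dereference. The following is a minimal sketch in Python, not taken from any RDFa implementation. The function names, the PREFIXES table, and the "cc: http://..." attribute-value syntax are assumptions made for illustration; the message only asks whether such a syntax would be acceptable and does not define one.

```python
# Hypothetical prefix table. In RDFa for XHTML these bindings are declared
# with xmlns:* attributes; here they are hard-coded for illustration.
PREFIXES = {
    "cc": "http://creativecommons.org/ns#",
}


def expand_curie(curie, prefixes):
    """Expand a compact URI like 'cc:license' into a full URI."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"no prefix binding for {curie!r}")
    return prefixes[prefix] + reference


def parse_prefix_attribute(value):
    """Parse an assumed attribute-value syntax: 'name: uri name: uri ...'.

    This is one possible shape for an attribute-value-only way of
    defining prefixes; the syntax itself is an assumption.
    """
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating 'name:' and URI tokens")
    bindings = {}
    for name, uri in zip(tokens[0::2], tokens[1::2]):
        if not name.endswith(":"):
            raise ValueError(f"prefix name must end with ':': {name!r}")
        bindings[name[:-1]] = uri
    return bindings


if __name__ == "__main__":
    print(expand_curie("cc:license", PREFIXES))
    print(parse_prefix_attribute("cc: http://creativecommons.org/ns#"))
```

Either way of declaring the bindings feeds the same lookup, so the choice between @xmlns and an attribute value affects only where the table comes from, not how names are resolved.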
Received on Friday, 22 August 2008 12:53:49 UTC