- From: Mark Birbeck <mark.birbeck@x-port.net>
- Date: Mon, 7 May 2007 14:22:07 +0100
- To: public-html@w3.org
- Cc: www-html@w3.org
Henri,

> The point is exploring *if* they can be interpreted using existing
> practice as a guide. This is subtly different from just using a
> dictionary: if research showed that a non-word string was
> consistently used to denote something useful, a dictionary would not
> have to be involved. Since it is improbable that the string
> "copyright" would appear accidentally without someone thinking of the
> concept of copyright while writing the string, it is reasonable to
> assume that the string is motivated by the concept.

But the logical error that is being made by the proposal is to conclude that you are able to _infer_ one author's intent from that of others. Since there is nothing in the HTML spec that says that @class="copyright" means _anything_ at a global level, then even if 100 authors use it in the way that is being suggested, you *cannot* infer from that anything about author 101. (And that is putting aside Bjoern's perfectly correct points that the spec doesn't actually say what @class="copyright" means anyway.)

USE CASES AND REQUIREMENTS

Don't get me wrong though--the fact that 'copyright' is an oft-used class name is still a very useful piece of information. It tells us that copyright information is something that authors (or at least their publishing software) are quite prepared to add to their documents, and so it gives us a clue that if we could come up with a standard way of allowing this, we'd end up with some very useful metadata to make use of.

But I seriously worry about what kind of standard will emerge when this very useful use-case is being used not to tell us that there is a requirement, but to tell us how to actually provide a solution to that requirement--i.e., to justify defining that @class="copyright" now has _global_ semantics when before it didn't.

If we were talking about @rel="copyright" that would be somewhat different, since @rel indicates a relationship between two documents, and authors would be consciously using a globally valid technique to add metadata. But the definition of @class tells us only that it can be used for semantics, and tells us nothing about the values.

INFERRING MEANING

Of course, in some specific environment, it might be useful to infer that @class="copyright" has meaning. Google, for example, could make use of this, and show us the text in the search results, from any element that has this class value. But in this situation, if they get it wrong, it's not the end of the world. And more than that, they could use other rules on their servers to process the document and the element's content, and work out whether some element really is a copyright message.

But 'inferring' meaning from documents in this way to aid processing is a far cry from inferring the syntax. And this is because, as others have said, @class cannot *by definition* be deemed to be unambiguous.

XHTML 2

It's frustrating to see the very discussions that we've had over the years in the XHTML 2 work now happening all over again, but I guess that is almost inevitable when politics is such a key part of standards-making--so there's no point in complaining. :)

In this particular area we also considered tweaking things so that existing @class values had universal meaning, but in the end we concluded that all we could say about _existing_ @class values was that they were 'locally defined', i.e., they were private to the author. Since there was no mechanism for an author to indicate that they had chosen some 'global' value, then it would be incorrect of us to assume that they had.

However, that is not the case with @rel and @rev, since there are some predefined values. And it is also not the case with @role, since that says that non-prefixed values are reserved (to answer Geoffrey's question).
And finally, we felt it was also not the case with values of @class that look like XML QNames, i.e., are 'qualified'; we felt that an author using 'dc:creator' as a class value almost certainly knew what they were writing (since it is currently not a very common practice), and so it was legitimate to 'infer' something more from this than some CSS styling rules.

CONCLUSIONS

So firstly, I would say that there is not necessarily anything wrong with deriving 'universal' meaning from @class values, but unfortunately we can't say anything about pre-existing values. Going forward, any values that want to be part of some 'global' dictionary need to be 'qualified' in some way. This has been discussed in this thread, with one option being to find a unique prefix. That's not a bad solution, but as every language designer knows, no matter what you are doing, you very quickly come up against the problem of namespacing. Our approach was therefore to use an XML namespace style approach, with the CURIE syntax being proposed to support this:

  <http://www.w3.org/TR/curie/>

But secondly, I would say that using @class rather than the role attribute to carry values that are about the structure of a document could appear to be an example of the 'not invented here' mindset. I'm sure it's not, but I would urge people to consider @role for this task, since:

* @role was created specifically to allow authors to say what an
  element's purpose is;

* it was further motivated by trying to provide an 'unpolluted' value
  space so that there would be no ambiguities;

* it is available as a standalone module that can be used in different
  mark-up languages;

* it has been added to Firefox already.

Regards,

Mark

--
Mark Birbeck, formsPlayer

mark.birbeck@x-port.net | +44 (0) 20 7689 9232
http://www.formsPlayer.com | http://internet-apps.blogspot.com

standards. innovation.
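
As a rough illustration of the 'qualified' approach described under CONCLUSIONS, the following Python sketch separates class tokens that carry a declared prefix from those that must be treated as locally defined. It is only a sketch under stated assumptions: the function name classify_class_tokens and the hard-coded PREFIXES table are hypothetical, a real processor would presumably take its prefix mappings from declarations in the document, and nothing here is taken from the CURIE draft itself.

```python
# Hypothetical prefix table, hard-coded for illustration only. A real
# processor would read prefix mappings from the document instead.
PREFIXES = {"dc": "http://purl.org/dc/elements/1.1/"}


def classify_class_tokens(class_attr, prefixes=PREFIXES):
    """Split a class attribute value into qualified and local tokens.

    Returns (qualified, local): `qualified` maps each prefixed token whose
    prefix is declared to an expanded URI; `local` lists every other token,
    about which nothing global can be inferred.
    """
    qualified = {}
    local = []
    for token in class_attr.split():
        prefix, sep, reference = token.partition(":")
        if sep and reference and prefix in prefixes:
            # Declared prefix: the author has opted in to a global meaning.
            qualified[token] = prefixes[prefix] + reference
        else:
            # Unprefixed or undeclared prefix: private to the author.
            local.append(token)
    return qualified, local


qualified, local = classify_class_tokens("copyright dc:creator footer")
print(qualified)
print(local)
```

Under these assumptions 'dc:creator' is expanded to a URI, while 'copyright' and 'footer' stay in the local list, however many other authors happen to use the same strings.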
Received on Monday, 7 May 2007 13:22:27 UTC