- From: Mike Schinkel <mikeschinkel@gmail.com>
- Date: Wed, 27 Dec 2006 22:58:04 -0500
James Graham wrote:

> Actually, IMHO mpt's point is far broader and consequentially
> more important than the confines of the original thread. The
> point, as I understand it, is that machine analysis of
> "semantic" markup fails if the markup construct is (ab)used
> in so many different ways that the interpretation of any
> particular fragment is no longer unambiguous. This is a sort
> of "heat[1] death" of the original semantics...

It's ironic that you use the term "entropy" here.[1] Anyway, although in general I agree with you, you speak in generalities, so it is hard to either concur with or disprove your assertions.

> as the use of
> an element becomes increasingly disordered (i.e. higher
> entropy), it becomes impossible to extract any useful
> information from the use of that element.

So I'd like to see some specific examples of how you would see things evolve to the "inevitable" impossibility.

That said, one of my biggest qualms about "microformats" per se is how they have defined their community process. I believe their process is likely to generate more "entropy death" rather than less. I proposed alternatives, but they claimed those alternatives were counter to their vision. Thus I plan to use "microformat-like semantic markup" even though it wouldn't be microformats proper. But that's an entirely different discussion that I'm almost but not quite prepared to have.

So I think the real question is this: is it possible or impossible to define a process for "microformat-like semantic markup" that can minimize the chance of "entropy death"? To answer the question one should understand that 1.) even prior to the emergence of "microformat-like semantic markup" we've had lots and lots of disorder anyway, and 2.) seeing the train speeding to the end of its tracks doesn't mean we can stop the train even if we want to.

On point #2, I still assert it's more pragmatic and hence better to work to minimize the damage than to scold the train for "stupidly" speeding up when approaching the end of its tracks.

> * Have enough elements. If there are obvious holes that
> people can't fill with existing elements used properly, they
> will reuse existing elements in new ways so increasing their entropy.

Agreed. That's what we get for pursuing pie-in-the-sky semantic web exclusively while ignoring the evolution of HTML, for how long? Also it's what we get now for trying to put everything into HTML5 instead of planning to rapidly release 5, 6, 7, etc.

> * Don't have too many elements: If there are too many
> elements people won't understand them all and will reuse
> existing elements in the "wrong" way, so increasing their entropy.

<Elements> or @attributes? Anyway, I doubt there will be misuse if the <Elements>/@attributes have clear semantics, other than possibly people not using them when they could have. Of course elements with names like <div> and <span> (what were they thinking when they named those?!?) are the type I believe you are referring to.

> * Make the semantics of elements well defined: Start the
> elements in a "low entropy" i.e. highly ordered state. Make
> it obvious how the element is intended to be used (and
> restrict the valid uses to ones that can be discriminated by
> machine) so that fewer people accidentally abuse it.

Interestingly, Dion Hinchcliffe had a great article[2] that argued the best way to get a good outcome is to minimize structure at the beginning until the patterns emerge, then layer structure on top of those patterns. Think of the wiki. At the beginning, it was "the simplest thing that would work." Had someone architected it in advance of use, they would have ended up with Lotus Notes!
:-) And although Notes was sold to lots of corporations, MediaWiki is far more usable for average people than Notes; the latter takes a salesman to convince IT and then an IT staff to deliver edicts that "thou shalt use."

While his article focused on enterprise intranets, one could argue that microformats simply might be the way of letting the world do the design for the needs of future HTML, assuming the next version of HTML empowers people enough to do so, and that we don't have to wait another decade before HTML6.

> * Have some "high entropy" elements. This is the
> counterintuitive one.
> The goal, remember, is to extract as much information as
> possible from the semantically well-defined elements.
> However, in many situations there will not be a relevant
> element to use, the publishing setup will not be optimized
> for selecting the correct semantic element (think WYSIWYG
> editors), or the author will not be sufficiently familiar
> with the language semantics to make a well-informed choice
> about the right element to use. In this case providing (and
> encouraging the use of!) a set of high entropy "bit-bucket"
> elements that are semantically meaningless is very
> beneficial because they prevent the entropy increase
> associated with the abuse of the semantic elements. The
> increasing misuse of <em> as a "more semantic" <i> is an
> example of what happens when this policy is not followed.

Hmm. I think in this last paragraph you made the point I had just typed before reading it!

> * Allow easy extensions. Having an extension mechanism for
> those who need more functionality is one way to stop the
> abuse of existing elements. This has to be sufficiently easy
> to use that it can be widely adopted but powerful enough
> that it can replicate all the semantic features of the host language.

YES!!! (can you tell I agree?
:-)

I would actually love to be involved in designing those, as I've done some preliminary work on them, but only if there was a very good chance we'd get to see extension elements added. I can't afford to spin my wheels that much just to pontificate.

--
-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org/

[1] I came to believe I had realized some aspects of software and development several years ago, and I registered "softwareentropy.com" with plans to blog about it until I had enough content for a book. That project is still on the back burner, unfortunately. :-(

[2] http://blogs.zdnet.com/Hinchcliffe/?p=57

Received on Wednesday, 27 December 2006 19:58:04 UTC