[whatwg] Semantic styling languages in the guise of HTML attributes.

From: Matthew Paul Thomas <mpt@myrealbox.com>
Date: Tue, 26 Dec 2006 01:50:31 +1300
Message-ID: <5b43481c298433d038bf4879ebff7ba7@myrealbox.com>
On Dec 22, 2006, at 3:23 AM, Benjamin Hawkes-Lewis wrote:
>
> Henri Sivonen wrote:
> ...
>> Also, it seems to me that the usefulness of non-heuristic machine 
>> consumption of semantic roles of things like dialogs, names of 
>> vessels, biological taxonomical names, quotations, etc. has been 
>> vastly exaggerated.
>
> I'm not entirely sure what "non-heuristic machine consumption" is,

An example of non-heuristic machine consumption is where Google 
Glossary thinks: "In an HTML 3.2 or earlier document containing the 
code '<dl><dt>foo</dt> <dd>bar</dd></dl>', 'bar' is a definition of 
'foo'". (It probably thinks the same about HTML 4 documents, too, which 
is applying a small "ignore that nonsense about dialogues" heuristic.)

An example of heuristic machine consumption is where Google Glossary 
thinks: "In an HTML document containing the code '<p><b>foo:</b> 
bar</p>', 'bar' is probably a definition of 'foo', especially if the 
page has several consecutive paragraphs with that structure and 
different bold text."

Non-heuristic machine consumption fails when semantic elements are 
abused, and becomes impractical when elements have multiple popular 
meanings (examples of the latter include <dl> in HTML 4, and <p> in 
HTML 5). Heuristic machine consumption fails occasionally by the very 
nature of heuristics (examples currently include
<http://www.google.com/search?q=define:author> and
<http://www.google.com/search?q=define:editor>).
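
To make the distinction concrete, here is a minimal Python sketch of both 
approaches. It is only an illustration, not a description of how Google 
Glossary works. The names DefinitionListParser, definitions_from_dl, 
BOLD_LABEL and guess_definitions are hypothetical, as is the min_run 
threshold of two paragraphs.

```python
import re
from html.parser import HTMLParser


class DefinitionListParser(HTMLParser):
    """Non-heuristic: trust <dt>/<dd> markup to mean term/definition."""

    def __init__(self):
        super().__init__()
        self.pairs = []      # collected (term, definition) tuples
        self._field = None   # "dt", "dd", or None while outside both
        self._term = ""
        self._text = []

    def _flush(self):
        # Close whichever of <dt>/<dd> is open and record its text.
        text = "".join(self._text).strip()
        if self._field == "dt":
            self._term = text
        elif self._field == "dd" and self._term:
            self.pairs.append((self._term, text))
        self._field, self._text = None, []

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._flush()  # HTML lets </dt> and </dd> be omitted
            self._field = tag

    def handle_endtag(self, tag):
        if tag in ("dt", "dd", "dl"):
            self._flush()

    def handle_data(self, data):
        if self._field:
            self._text.append(data)


def definitions_from_dl(html):
    parser = DefinitionListParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Hypothetical pattern for '<p><b>term:</b> text</p>' with no nested markup.
BOLD_LABEL = re.compile(
    r"<p>\s*<b>([^<:]+):</b>\s*([^<]+?)\s*</p>", re.IGNORECASE
)


def guess_definitions(html, min_run=2):
    """Heuristic: a bold label is *probably* a term, but only when several
    such paragraphs sit back to back with different bold text."""
    runs, last_end = [], None
    for m in BOLD_LABEL.finditer(html):
        adjacent = last_end is not None and not html[last_end:m.start()].strip()
        if not adjacent:
            runs.append([])  # start a new run of consecutive paragraphs
        runs[-1].append((m.group(1).strip(), m.group(2)))
        last_end = m.end()
    # Keep only runs with enough distinct labels to look like a glossary.
    return [pair for run in runs
            if len({term for term, _ in run}) >= min_run
            for pair in run]
```

Both failure modes show up readily in this sketch. Feeding 
definitions_from_dl a <dl> that marks up a dialogue, such as 
'<dl><dt>Alice</dt><dd>Hello</dd></dl>', reports "Hello" as a definition 
of "Alice". Feeding guess_definitions two adjacent paragraphs labelled 
"Note:" and "Warning:" reports both as definitions, while a lone 
'<p><b>foo:</b> bar</p>' is ignored even when it really is one.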

-- 
Matthew Paul Thomas
http://mpt.net.nz/
Received on Monday, 25 December 2006 04:50:31 UTC
