Re: extracting semantics Re: Namespace

Karl Dubost wrote:
> Indeed a semantics extractor looking at the version of HTML
> 
>     <p>Life is <small>tough</small>.</p>
> 
> * HTML 4.01
>   SMALL: Renders text in a "small" font.
> * HTML 5.01
>   The small element represents small print (part of a document often 
> describing legal restrictions, such as copyrights or other 
> disadvantages), or other side comments.

That's a problem with HTML4.  It left the semantics almost completely 
undefined and so any use case for small text was effectively allowed. 
HTML5 somewhat restricts the use cases to a smaller subset, but the 
question is whether that restriction is useful in practice.

> Then I'm an implementer of a semantics extractor. What are my 
> implementation strategies?

That isn't a real use case, since it depends entirely on the purpose of 
extracting that information, which you have not specified.  Extracting 
arbitrary information for no known purpose is not useful.

What are you trying to achieve by extracting the content of small 
elements?  What information are you looking for?  Why is that 
information useful?  Are there any existing tools that attempt to 
extract that type of information?

As I said before, whether or not <small> is currently used for legal or 
copyright info in practice is questionable, but it's also questionable 
whether having specific markup for legal information is at all useful.

So I don't believe the question of whether to retain small has 
anything to do with its semantic compatibility; what matters is whether 
it fulfils a useful purpose, both in theory and in practice.

-- 
Lachlan Hunt
http://lachy.id.au/

Received on Wednesday, 18 July 2007 04:29:11 UTC