Re: XHTML: Suggestion to add a attribute for multi language documents from Laurens Holst on 2006-04-15 (www-html@w3.org from April 2006)

From: Laurens Holst <lholst@students.cs.uu.nl>
Date: Sun, 16 Apr 2006 00:27:38 +0200
To: "Jukka K. Korpela" <jkorpela@cs.tut.fi>
Cc: www-html@w3.org
Message-ID: <4441735A.1020403@students.cs.uu.nl>

Jukka K. Korpela wrote:
> I'm afraid language markup is a lost cause.

From that point of view, hasn’t the Internet shown you by now that 
*any* kind of markup is a lost cause? Seriously, if there’s anything 
the Google statistics showed, it’s that people just mess around: any 
mistake that can be made is made, and grossly so.

> It is in reality much more reliable to deduce the language from the 
> actual content, heuristically. 

Seems that nowadays some think heuristics are the answer to all 
problems. Heuristically determine language. Heuristically determine 
content type. I’m afraid that in practice it will turn out that that 
doesn’t work either, and certainly isn’t interoperable.

Let’s not mark up our text documents either, why shouldn’t the browser 
heuristically determine the semantics, and the styling, too! Oh, and 
let’s use OCR on images instead of depending on the alt text. Authors 
will get it wrong anyway.

> It works with a handful of specialized browsers. The problem is that 
> the vast majority of pages don't do language markup, or do it _wrong_, 
> so even the small number of people using those browsers don't benefit 
> much. This in turn means that there's little motivation to authors to 
> use language markup.

The main problem is that page authors don’t see what they’re marking up. 
For something to be used in a correct manner, authors need to be able to 
notice that something is wrong.

Let me give an example. At work, yesterday, while updating the content 
the CEO came to me and said that the ‘tooltip text’ of an image was 
wrong. Of course, this was the alt text that was showing up in Internet 
Explorer. Point being, because the normally ‘hidden’ markup is made 
visible, the quality of the document improved.

Now, I’m not directly advocating showing the alt text (the problem is 
much broader anyway), but instead of doing everything heuristically, I 
think that to improve the markup we need to make tools which properly 
visualise *all* aspects of the document, not just the usual visible 
parts, and which also have integrated audible tools that help with 
quickly detecting errors in language markup, alt text and other kinds 
of markup (think tables).
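
As a rough sketch of the simplest form such a tool could take, the 
following Python script (standard library only) lists the markup a 
browser normally keeps hidden: language declarations and image alt 
texts. The names HiddenMarkupLister and list_hidden_markup are made up 
for illustration, and a real tool would present this inside the page 
view rather than as a list.

```python
from html.parser import HTMLParser


class HiddenMarkupLister(HTMLParser):
    """Collects markup that browsers normally do not display."""

    def __init__(self):
        super().__init__()
        self.findings = []  # (tag, attribute, value) tuples, in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Language markup: report every lang / xml:lang declaration.
        for name in ("lang", "xml:lang"):
            if name in attrs:
                self.findings.append((tag, name, attrs[name]))
        # Images: report the alt text, or flag that it is missing.
        if tag == "img":
            if "alt" in attrs:
                self.findings.append((tag, "alt", attrs["alt"]))
            else:
                self.findings.append((tag, "alt", "(missing)"))


def list_hidden_markup(html):
    """Return the hidden-markup findings for an HTML string."""
    parser = HiddenMarkupLister()
    parser.feed(html)
    parser.close()
    return parser.findings


if __name__ == "__main__":
    sample = '<p lang="nl">Hallo <img src="a.png" alt="logo"> <img src="b.png"></p>'
    for tag, attribute, value in list_hidden_markup(sample):
        print(f"<{tag}> {attribute}: {value}")
```

Even a plain listing like this puts a wrong language code or a 
nonsensical alt text in front of the author, which is the point: 
errors only get fixed once somebody can see them.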

There is also a role for browser vendors to play here, not just for 
authoring tools. I’d say *especially* for browser vendors, as their 
products are usually the main tool documents are authored with. The 
speech feature in Opera is a nice example of this; it may seem a bit 
silly, but such tools could be a great help to authors in improving the 
quality of their documents.

But it’s not integrated enough with people’s browsing process yet, and 
it is only available in one minority browser (and I actually don’t know 
how well it handles other languages). To popularize this, it needs a 
‘killer application’. If the masses were to pick up, e.g., 
voice-controlled browsing and letting a browser read web pages aloud, 
the markup would improve instantly. But that is of course the key to 
getting any markup used: make it useful to the majority of people.


~Grauw

-- 
Ushiko-san! Kimi wa doushite, Ushiko-san!!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Laurens Holst, student, university of Utrecht, the Netherlands.
Website: www.grauw.nl. Backbase employee; www.backbase.com.

Received on Saturday, 15 April 2006 22:28:20 UTC