[whatwg] Spellchecking mark III

Aryeh Gregor wrote:
> On Wed, Jan 21, 2009 at 4:15 AM, Mikko Rantalainen
> <mikko.rantalainen at peda.net> wrote:
>   
>> If the browser does not know the language of the content, how on earth
>> is it supposed to *correctly* spellcheck it? I'm daily hitting a
>> situation where browser is trying to spellcheck content with incorrect
>> language. I've toggled such automatic spellchecker off and those will
>> stay off until correct language is detected.
>>     
>
> In practice, I think the only way to avoid this problem is for
> browsers to implement content-sniffing techniques of some kind to
> figure out the language, at least per field but ideally on a
> word-by-word basis.  If the browser is set to spellcheck in English
> but you start putting in lots of non-Latin characters and every word
> is therefore misspelled, the browser should be clever enough to try
> switching the spellcheck language, or at least disabling spellcheck
> for words that can't possibly be from the language it's checking
> against.  More refined heuristics could detect even subtle
> differences, like between British and American English, and remember
> for next time which one the user usually types in.
>
>   

Why not let the user choose the language, as word processors do? A UA
can't reliably tell whether, for instance, "color" is correct American
English, misspelled British English, or even a correct (truncated)
Italian word, while a human can. So a UA could provide an interface to
change the spellchecking language for a selection, or even for each
misspelled word, starting from a hint language. That hint could be the
value of an element's "lang" attribute, alongside a default value and a
user-preference "forced" one, the latter bypassing any authored value.
Also, using the "lang" attribute value as the starting language (when it
doesn't conflict with a user preference) would allow an interactive
interface in which a script changes that value according to the user's
choice (UAs could also expose a list of supported languages).
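
To make the precedence concrete, here is a minimal sketch of how a UA
might pick the starting language. It is only an illustration: the
function name, its parameters, and the idea of a lower-cased set of
supported language tags are all assumptions of mine, not anything a
browser or the spec defines.

```python
from typing import Optional, Set


def pick_spellcheck_language(
    forced_user_lang: Optional[str],
    authored_lang: Optional[str],
    default_lang: str,
    supported: Set[str],
) -> Optional[str]:
    """Return the language to start spellchecking with, or None.

    All names here are hypothetical. `supported` is assumed to hold
    lower-cased language tags for the dictionaries the UA has installed.
    """
    if forced_user_lang:
        # A "forced" user preference bypasses any authored value.
        candidates = [forced_user_lang]
    else:
        # Otherwise the authored "lang" value is the hint, then the default.
        candidates = [authored_lang, default_lang]

    for lang in candidates:
        if lang and lang.lower() in supported:
            return lang.lower()
    return None  # nothing usable: the UA would have to ask the user


supported = {"en-us", "en-gb", "it"}
print(pick_spellcheck_language(None, "en-GB", "en-us", supported))
print(pick_spellcheck_language("it", "en-GB", "en-us", supported))
```

The result is only a starting point: the user could still switch the
language for a selection or a single word afterwards.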

A declaration such as "lang='und'" sounds like telling the user agent to
do whatever it computes to be a good choice, which is different from
saying "don't even try to work out what the language is here, because
I know you can't guess it". Declaring a value known to be unsupported
(such as an invented one) to turn off spellchecking sounds like a hack
that is needed only because a more appropriate feature is missing.
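
The distinction can be sketched as follows. This is a hypothetical
model, not real UA behaviour: the function name and the three mode
strings are made up, and it assumes a UA that simply skips checking
when it has no dictionary for the declared language.

```python
from typing import Set


def spellcheck_mode(lang: str, supported: Set[str]) -> str:
    """Classify what a (hypothetical) UA would do for a declared language."""
    tag = lang.strip().lower()
    if tag == "und":
        # "Undetermined": the UA is free to guess the language itself.
        return "guess"
    if tag in supported:
        # A known language: check against that dictionary.
        return "check"
    # Unknown or invented value: checking is off only as a side effect,
    # not because the author was able to say "don't spellcheck this".
    return "off"


print(spellcheck_mode("und", {"en", "it"}))
print(spellcheck_mode("x-invented", {"en", "it"}))
```

Nothing in the "off" branch records the author's intent, which is why
it reads as a hack rather than a feature.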

All of this is just IMHO.

WBR, Alex
 
 

Received on Wednesday, 21 January 2009 19:38:37 UTC