Re: Misidentified language of an html document

Ian Fieggen (ian@fieggen.com) wrote:

> Nu Html Checker has misidentified the language of the following page:

As a practical workaround, you can simply tell the checker to stop
issuing such messages by using the “Message filtering” button in the
user interface.

As regards this feature of the checker, I think it should simply be
removed. It does more harm than good. Occasionally, it
might help a person notice that he has used a wrong language code. But
so what? Is there any software that makes actual use of lang=...
attributes? Well, if you open an HTML document in Microsoft Word, it
might use such attributes to set language information, which in turn
may help in spell checking, language-dependent editing features,
etc., but that’s rather special. For example, indexing robots seem to
ignore lang=... attributes, probably partly because they are so often
wrong (e.g. lang=en coming from an authoring tool or page template
default, no matter what the actual language is).

The causes of some particular misidentifications of language could
probably be tracked down and fixed, but I doubt whether that would be
worthwhile. Whatever the checker uses as its language guesser seems to
be inferior to generally available tools such as the language
detection in Google Translate.

Jukka, https://jkorpela.fi/

Received on Wednesday, 12 June 2024 20:22:48 UTC