- From: Michael[tm] Smith <mike@w3.org>
- Date: Mon, 26 Sep 2016 21:30:58 +0900
- To: Marcus Beyer <contact@take-a-screenshot.org>
- Cc: www-validator@w3.org
- Message-ID: <20160926123058.o7pz6emailygqcvo@sideshowbarker.net>
Marcus Beyer <contact@take-a-screenshot.org>, 2016-09-22 22:43 +0200:
> Archived-At: <http://www.w3.org/mid/3D58E0B8-3886-4CD4-A35B-CDD2B2DFC227@take-a-screenshot.org>
>
> Nu Html Checker thinks my Chinese page is in English:
>
> https://validator.w3.org/nu/?showoutline=yes&showimagereport=yes&doc=http%3A%2F%2Fwww.take-a-screenshot.org%2Fzh%2F

Thanks for taking time to report this. I’ve pushed a change that should
cause the checker to no longer misidentify the language of that document.

> I’m sorry, but this is not correct.

Yeah, for a small number of cases (mostly documents with a relatively
small amount of text), the checker sometimes misidentifies the language.

I’ve dealt with it for now by raising the minimum number of
(non-whitespace) characters it needs to see before it will attempt to do
language detection. I previously had that number set to 256 characters
but have now raised it to 512.

But it may be that I still need to raise it further. So if you run into
other cases where it is still misidentifying the language of any
document, please do report it on this mailing list or at
https://github.com/validator/validator/issues/new

  --Mike

-- 
Michael[tm] Smith https://people.w3.org/mike
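
The threshold described in the message can be illustrated with a minimal Python sketch. This is not the checker's actual code: the names MIN_CHARS, count_non_whitespace and should_attempt_detection are hypothetical, and treating the limit as inclusive (512 or more characters) is an assumption. Only the values 256 and 512 come from the message.

```python
# Hypothetical sketch of a minimum-text threshold for language detection.
# Only the numbers 256 (old) and 512 (new) are taken from the message.
OLD_MIN_CHARS = 256
MIN_CHARS = 512


def count_non_whitespace(text: str) -> int:
    """Count the characters in text that are not whitespace."""
    return sum(1 for ch in text if not ch.isspace())


def should_attempt_detection(text: str, min_chars: int = MIN_CHARS) -> bool:
    """Return True if text has enough non-whitespace characters.

    Assumption: the comparison is inclusive (>=); the real checker
    may compare differently.
    """
    return count_non_whitespace(text) >= min_chars


if __name__ == "__main__":
    short_page = "word " * 75  # 300 non-whitespace characters
    # Under the old limit this page would be analyzed; under the new one
    # language detection is skipped for it.
    print(should_attempt_detection(short_page, OLD_MIN_CHARS))
    print(should_attempt_detection(short_page))
```

With a higher limit, short documents are simply skipped rather than risk a wrong guess, which is the trade-off the message describes.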
Received on Monday, 26 September 2016 12:31:33 UTC