
Re: peport`s error

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Tue, 27 Dec 2016 16:37:21 +0200
To: Алена Гордиенко <Alena.Gordienko@vm.ua>, "'www-validator@w3.org'" <www-validator@w3.org>
Message-ID: <069c86ef-8765-5c87-30dd-190fd5fbe8d6@cs.tut.fi>
21.12.2016, 11:53, Алена Гордиенко wrote:

> This Is link at report`s result of my site.
>
> https://validator.w3.org/nu/?showsource=yes&doc=http%3A%2F%2Fpatronservice.ua#l1345c165

This is rather mysterious. But first let me point out a different 
mystery: the message was sent on December 21st and received by a w3.org 
server the same day, yet it was distributed to subscribers of the list 
only on December 27th. I have no idea what causes such delays (they have 
happened in the past, but not this long).

> Warning: This document appears to be written in Estonian but
> the html start tag has lang="ru". Consider
> using lang="et" (or variant) instead.

(The page has been changed so that it now contains lang="ru-RU", but 
this hardly affects the issue.)

Wrong language guesses by the validator are not uncommon, but usually 
there is a simple explanation, like hidden textual content at the start 
of the document, in a language different from the main language of the 
page. Here, however, we have a mystery. The page content (even as seen 
by a validator) is almost exclusively in Russian, with just a few short 
strings in Latin letters here and there. So how can a language analysis 
guess that it is in Estonian, which is written in Latin letters?
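
The claim about the script mix can be checked mechanically. The following 
is only a minimal sketch, assuming the visible text has already been 
extracted from the page; the function name script_counts and the sample 
string are made up for illustration, and this says nothing about how the 
validator's language guesser actually works:

```python
import unicodedata
from collections import Counter

def script_counts(text):
    """Count letters by script, using the first word of each
    character's Unicode name (e.g. 'CYRILLIC', 'LATIN')."""
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            counts[name.split(" ")[0] if name else "UNKNOWN"] += 1
    return counts

# Illustrative sample only; a real check would use the whole page text.
sample = "техническая поддержка Patron и Barva"
print(script_counts(sample))
```

If the Cyrillic count dominates, as it should for this page, a guess of 
Estonian cannot be explained by the overall letter statistics.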

I tried a divide-and-conquer approach to isolate the cause of the 
problem, removing parts of the content until the spurious warning 
disappeared. I finally noticed that if I remove just the short <div> 
element with class="header_contact" from the document, the warning is no 
longer issued. But the cause is something more complicated, since when I 
test that <div> element on its own (wrapped as an HTML document), it 
validates without warnings:

<!doctype html>
<title>Test</title>
         <div class="header_contact">
           <div class="header_icon" style="background: url(/user/img/icon_mobile.png); height: 40px; width: 40px;"></div>

           <div class="header_phone_new">
             0 800 509 278<br/><span>техническая поддержка<br/>ТМ Patron и Barva<br/>9:00-20:00, Сб 10:00-18:00</span>
           </div>


           <div class="header_small" style="margin: 2px 0 0;">
             <a onclick="_gaq.push(['_trackEvent', 'Home Page', 'ReCall', 'Патрон Сервис|']);" id="recall" class="recall_shop" style='' href="#">
             Вам перезвонить?</a>
           </div>
        </div>

I wonder what content there might confuse a language guesser so badly 
when it is present in the context of a page in Russian, but not when it 
is tested in isolation. And there is no Estonian word there.
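
The manual removal of parts described above could be automated roughly as 
follows. This is a sketch under stated assumptions: it is a simple 
one-chunk-at-a-time reduction rather than a true bisection, the name 
reduce_chunks is invented here, and still_warns is a hypothetical 
predicate that would have to build a document from the chunks, submit it 
to the validator and report whether the warning is still issued (that 
part is not shown):

```python
def reduce_chunks(chunks, still_warns):
    """Greedily drop chunks while the warning persists.

    chunks: list of document fragments (e.g. top-level elements).
    still_warns: hypothetical predicate; True if a document built
    from the given chunks still triggers the warning.
    """
    kept = list(chunks)
    i = 0
    while i < len(kept):
        candidate = kept[:i] + kept[i + 1:]
        if still_warns(candidate):
            kept = candidate  # chunk i is not needed to reproduce
        else:
            i += 1            # chunk i is needed; keep it, move on
    return kept

# Stand-in predicate for illustration: the warning appears only when
# fragments "A" and "C" are both present.
def demo_warns(parts):
    return "A" in parts and "C" in parts

print(reduce_chunks(["A", "B", "C", "D"], demo_warns))
```

A reduction like this ends with a set of fragments that are all needed 
together, which fits the observation that the <div> alone does not 
reproduce the warning.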

Yucca
Received on Tuesday, 27 December 2016 14:37:57 UTC
