an odd question

From: <cj@mb-soft.com>
Date: Fri, 20 Mar 2015 11:00:30 -0500
Message-ID: <B165F4F883D7434DA2AF34A192CD2CE0@D9CDNW91>
To: <www-validator@w3.org>
I have an odd question.  The Validator seems to have what might be a flaw.  I operate a large web-site with thousands of web-pages.  Nearly all are encoded in either UTF-8 or a Windows format.  The Validator seems to get confused if it encounters even one byte in the wrong encoding, at which point it shuts down.  It seems to me that a simple solution would be for the Validator to "skip" that byte, or even to try interpreting it in the other of those two formats, rather than totally abandoning the effort.
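
The "translate it into the other format" idea above can be sketched roughly as follows. This is only an illustration, not how the Validator works: it assumes the "Windows format" is Windows-1252 (Python's `cp1252` codec), and the function name `decode_mixed` is hypothetical.

```python
def decode_mixed(data: bytes) -> str:
    """Decode bytes as UTF-8; any byte that is not valid UTF-8 is
    reinterpreted as Windows-1252 (an assumption about the page's
    second encoding) instead of aborting the whole decode."""
    out = []
    pos = 0
    while pos < len(data):
        try:
            # Try to decode everything that is left as UTF-8.
            out.append(data[pos:].decode("utf-8"))
            break
        except UnicodeDecodeError as err:
            # err.start and err.end are offsets into the slice data[pos:].
            # Everything before err.start is valid UTF-8.
            out.append(data[pos:pos + err.start].decode("utf-8"))
            # Reinterpret the offending byte(s) as Windows-1252; bytes
            # undefined in that code page become U+FFFD.
            bad = data[pos + err.start:pos + err.end]
            out.append(bad.decode("cp1252", errors="replace"))
            pos += err.end
    return "".join(out)


if __name__ == "__main__":
    # A UTF-8 page with one stray Windows-1252 byte (0xE9) in it.
    page = "d\u00e9j\u00e0 vu, caf".encode("utf-8") + b"\xe9"
    print(decode_mixed(page))
```

Guessing like this can misread text that happens to be valid in both encodings, which may be one reason a validator would rather stop and report the problem.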

I had thought I had a solution for my situation, since my web-pages are virtually identical in UTF-8 and Windows and Western, where only an occasional byte (such as a Spanish tilde character) differs.  I thought I had solved the problem by using the &#176; type coding (numeric character references), which should be compatible with either UTF-8 or Windows.
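
A minimal sketch of the &#176; approach, assuming the page text is available as a Python string: every non-ASCII character is replaced by its decimal numeric character reference, so the resulting bytes are the same whether the file is labeled UTF-8 or Windows. The function name `to_ascii_with_ncrs` is hypothetical; `xmlcharrefreplace` is a standard Python codec error handler.

```python
import html


def to_ascii_with_ncrs(text: str) -> str:
    """Replace each non-ASCII character with a decimal numeric
    character reference such as &#176; and leave ASCII untouched."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    encoded = to_ascii_with_ncrs("d\u00e9j\u00e0 vu, 20\u00b0")
    print(encoded)
    # html.unescape reverses the references, recovering the original text.
    print(html.unescape(encoded))
```

Note the trailing semicolon: the reference is written &#176; rather than &#176.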

But I now get the impression that Search Engines get all fouled up.  For a word like Deja vu, I now have four different available spellings (English, UTF-8, Windows, and &# format), and Search Engines seem to (sometimes) treat the four "spellings" as different.  I get different traffic reports for the identical page with those different spellings.

What is the best solution to this?

Also, I realize that Search Engines don't like "duplicate web-pages", so my "solution" probably needs to choose the "best" of the four pages to have on the Internet.  Which one?


Pastor Carl
A Christ Walk Church
BELIEVE Religious Information Source web-site
http://mb-soft.com/believe/indexaz.html
Received on Friday, 20 March 2015 16:17:44 UTC