
Re: utf8 validator confusion

From: David Dorward <david@dorward.me.uk>
Date: Mon, 22 Jun 2009 11:26:32 +0100 (BST)
Message-ID: <61187.132.185.144.121.1245666392.squirrel@malt.us-lot.org>
To: "Sean" <sean@mediamice.net>
Cc: www-validator@w3.org
Sean wrote:
> My encoding is UTF-8 but My Server is showing as UTF-8.

Your server says:

 "Content-type: text/xml"

If the Atom feed is using UTF-8, it should be:

 "Content-type: application/xml; charset=utf-8"

On the subject of which - you seem to be using a hybrid of Atom and RSS. I
suggest you switch to straight Atom - it should simplify matters (and,
AFAIK, RSS offers nothing you can't find in Atom). If you switch to Atom,
use application/atom+xml rather than the generic application/xml.
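
As a rough illustration of what a client sees in that header, here is a
minimal Python sketch that pulls the charset parameter out of a
Content-Type value. The function name charset_from_content_type and the
sample header values are made up for this example; the parsing itself
uses only the standard library's email.message.Message.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the lower-cased charset parameter, or None if absent."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


# Hypothetical header values, modelled on the ones discussed above.
for value in ("text/xml", "application/atom+xml; charset=utf-8"):
    print(value, "->", charset_from_content_type(value))
```

For a header with no charset parameter, such as the text/xml one above,
the function returns None, so the client has to fall back on something
else to decide the encoding.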

The feed itself, however, says:

 "<?xml version="1.0" encoding="iso-8859-1"?>"

If it is UTF-8 it should read:

 "<?xml version="1.0" encoding="utf-8"?>"

... or, since UTF-8 is the default:

 "<?xml version="1.0"?>"

... or, since 1.0 is also the default, no XML declaration at all:

 ""

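To check what a feed declares about itself, something like the following
Python sketch could be used. It is an approximation, not a real XML
parser: it assumes an ASCII-compatible encoding and no byte order mark,
and the name declared_encoding is invented for the example.

```python
import re

# Matches an XML declaration at the very start of the data and captures
# the value of its encoding pseudo-attribute, if there is one.
_XML_DECL = re.compile(
    rb'^<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(data, default="utf-8"):
    """Return the encoding named in the XML declaration, lower-cased.

    Falls back to `default` when the declaration, or its encoding
    pseudo-attribute, is missing.
    """
    match = _XML_DECL.match(data)
    if match is None:
        return default
    return match.group(1).decode("ascii").lower()


print(declared_encoding(b'<?xml version="1.0" encoding="iso-8859-1"?><feed/>'))
print(declared_encoding(b'<?xml version="1.0"?><feed/>'))
print(declared_encoding(b'<feed/>'))
```

The last two calls fall through to the default, which mirrors the point
that the shorter forms above leave the encoding as UTF-8.
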
> Also if characters are used then it fails validation. Eg This “ type of
> “character” , or 50, or ?.

After eyeballing the feed, I can't see any data in there which isn't
straight ASCII so I can't tell if it is using UTF-8, ISO-8859-1 or
something else.

Whatever encoding you are actually using doesn't match the one that the
validator thinks you are using (which I assume, given the XML declaration,
would be ISO-8859-1).
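
One way to see such a mismatch is to test the raw bytes directly. The
Python sketch below is only a heuristic, and classify_bytes is a
hypothetical name. The Windows-1252 sample is an assumption chosen for
the example (curly quotes have no code point in ISO-8859-1, so bytes
like these often come from Windows-1252); it says nothing about what the
feed actually contains.

```python
def classify_bytes(data):
    """Roughly classify raw bytes as 'ascii', 'utf-8' or 'not-utf-8'."""
    try:
        data.decode("ascii")
        return "ascii"  # identical in ASCII, UTF-8 and ISO-8859-1
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Some single-byte encoding (or corrupt data); the bytes alone
        # cannot say which one.
        return "not-utf-8"


curly = "\u201ccharacter\u201d"  # text wrapped in curly quotes

print(classify_bytes(b"<title>plain text</title>"))
print(classify_bytes(curly.encode("utf-8")))
print(classify_bytes(curly.encode("cp1252")))
```

A result of "ascii" is the case described above: every candidate
encoding agrees on those bytes, so the data alone cannot settle which
one is in use.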

-- 
David Dorward
http://dorward.me.uk/
Received on Monday, 22 June 2009 10:27:05 GMT
