Re: Encoding and validation

Dominique Meeùs wrote:
> There are two related problems to be considered separately.
> 1. You have to choose a "physical" encoding: the different characters 
> have to be inscribed on the digital medium as a definite succession of 
> bits, forming bytes, like utf-8 or iso-suchandsuch… This is usually 
> done through an option under File/Save as… or another appropriate command.
> 2. Most languages, protocols… ask you to declare the encoding so 
> chosen. This is some doctype or charset="" declaration.
>
> Needless to say, 1 and 2 have to agree. Declaring utf-8 
> while you actually saved your document as Windows-1252 or some other 
> encoding of the middle ages is worse than declaring nothing. Most 
> software with a command to insert an encoding declaration does just 
> this: it declares, and nothing more. It does not convert the "physical" 
> encoding into another. (One exception: in Bluefish the command 
> Document/Encoding converts the encoding and inserts/corrects the 
> declaration if the encoding changes.)
> In conclusion, you have to mind 1 AND 2 accordingly.
I suspect, but do not know for sure, that I have indeed muddled these 
two together. May I impose upon you for some specific steps in Amaya 
that would resolve any mismatches I might have introduced?

This passes W3C validation with no warning:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta http-equiv="content-type" content="text/html; charset=UTF-8" />
</head>

This passes with a warning (as noted above):

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
</head>
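
To make the difference between the two headers concrete, here is a minimal Python sketch (not anything Amaya itself does) of how one non-ASCII character is stored under each encoding, and what a reader sees when UTF-8 bytes are read as iso-8859-1. The sample string and the variable names are only illustrative assumptions.

```python
# "ù" is written as the escape \u00f9 so that this script does not
# itself depend on the encoding it happens to be saved in.
name = "Mee\u00f9s"

# Point 1 of the quoted message: the "physical" bytes on disk.
utf8_bytes = name.encode("utf-8")         # "ù" becomes two bytes: C3 B9
latin1_bytes = name.encode("iso-8859-1")  # "ù" becomes one byte: F9

# A mismatch: bytes saved as UTF-8 but read as iso-8859-1.
# Nothing fails; the text is silently garbled.
misread = utf8_bytes.decode("iso-8859-1")

# The opposite mismatch: bytes saved as iso-8859-1 but read as UTF-8.
# Here the lone F9 byte is not valid UTF-8, so decoding raises an error.
try:
    latin1_bytes.decode("utf-8")
    opposite_mismatch_detected = False
except UnicodeDecodeError:
    opposite_mismatch_detected = True

print(utf8_bytes, latin1_bytes, misread, opposite_mismatch_detected)
```

Plain ASCII text is byte-for-byte identical in both encodings, so a mismatch only shows up once the document contains accented or other non-ASCII characters.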

I'm aware I might just be taking another lap around the circle. As 
admitted earlier, I am out of my league here, so a step-by-step concrete 
reply would be appreciated.

Is it as simple as setting both the encoding and the charset to "UTF-8" 
and being done with it?
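
One way to test whether the declaration and the saved bytes agree, independent of any editor, might look like the sketch below. It is only an assumption about how such a check could be written: the helper names `declared_encoding` and `matches_declaration` are hypothetical, it looks only at the `<?xml ... encoding="..."?>` line (not the meta element), and it falls back to UTF-8 when no declaration is found.

```python
import re

def declared_encoding(raw: bytes):
    """Return the encoding named in the XML declaration, or None if absent."""
    m = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', raw)
    return m.group(1).decode("ascii") if m else None

def matches_declaration(raw: bytes) -> bool:
    """True if the raw bytes can be decoded with the declared encoding."""
    enc = declared_encoding(raw) or "utf-8"  # assumed default when undeclared
    try:
        raw.decode(enc)
        return True
    except (UnicodeDecodeError, LookupError):
        # LookupError covers an encoding name Python does not recognise.
        return False

# Two in-memory "files" with the same declaration but different real bytes.
text = '<?xml version="1.0" encoding="UTF-8"?>\n<p>Mee\u00f9s</p>'
saved_as_utf8 = text.encode("utf-8")
saved_as_cp1252 = text.encode("windows-1252")

print(matches_declaration(saved_as_utf8))    # declaration and bytes agree
print(matches_declaration(saved_as_cp1252))  # declared UTF-8, saved otherwise
```

A caveat on the design: this can catch a file declared as UTF-8 but saved in a legacy encoding, because such bytes are usually not valid UTF-8. It cannot catch the reverse case, since every byte sequence decodes without error as iso-8859-1. For a real file, the bytes would be read with `open(path, "rb").read()`.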

Regards and thank you,

Bill B

Received on Tuesday, 19 January 2010 16:03:35 UTC