
Re: Problem in publishing multilingual HTML document on web in UTF-8 encoding

From: Laurens Holst <lholst@students.cs.uu.nl>
Date: Fri, 02 Jun 2006 23:09:38 +0200
Message-ID: <4480A912.7020700@students.cs.uu.nl>
To: Philip TAYLOR <P.Taylor@Rhul.Ac.Uk>
Cc: "आशीष शुक्ला \"Wah Java !!\"" <wahjava@gmail.com>, W3C HTML Mailing List <www-html@w3.org>
Philip TAYLOR wrote:
> it interprets the META directive as you would wish.  But in so
> doing, it starts to parse the document on the basis of it being
> expressed in ISO-9999-9, whereupon it discovers that there wasn't
> a META directive at all, there was, rather, a(n ill-formed) BODY
> tag. But because it now knows there /was/ no META directive, it
> parses using ISO-8859-1.  But that means there IS a META
> directive.  And so on.  I'm sure you see the problem ...

On the other hand, languages such as CSS use a similar in-document 
mechanism (the @charset rule) to determine the character encoding:

   http://www.w3.org/TR/CSS21/syndata.html#x62

So it’s not without precedent.

Of course, because CSS constrains both the location of the character 
encoding identifier and the way it is encoded, the encoding is a lot 
simpler to determine than in HTML.
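
To illustrate why those constraints help, here is a rough sketch in 
Python. It assumes an ASCII-compatible encoding and relies on the CSS 2.1 
requirement that the rule appears at the very start of the style sheet, 
written literally as @charset "name"; with nothing before it except an 
optional byte order mark. The function name sniff_css_charset is made up 
for this example and is not taken from any library or from the spec.

```python
import re
from typing import Optional

# Matches only the literal form at the very first byte of the style sheet:
#   @charset "name";
# The name may contain printable ASCII except the double quote and space.
_CHARSET_RULE = re.compile(rb'\A@charset "([\x21\x23-\x7e]+)";')


def sniff_css_charset(data: bytes) -> Optional[str]:
    """Return the declared encoding name, or None if no leading @charset rule.

    Hypothetical helper: it only handles ASCII-compatible encodings and
    does not check that the declared name is a known encoding.
    """
    if data.startswith(b"\xef\xbb\xbf"):
        data = data[3:]  # skip a UTF-8 byte order mark
    match = _CHARSET_RULE.match(data)
    if match is None:
        return None
    return match.group(1).decode("ascii")
```

Because the rule can only sit at one fixed position and in one fixed 
spelling, a single match against the first few bytes settles the question. 
There is no need to start parsing under a guessed encoding and then 
reconsider, which is where the circularity in the HTML case comes from.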


~Grauw

-- 
Ushiko-san! Kimi wa doushite, Ushiko-san!!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Laurens Holst, student, university of Utrecht, the Netherlands.
Website: www.grauw.nl. Backbase employee; www.backbase.com.



Received on Friday, 2 June 2006 21:09:53 GMT
