Re: Document without charset

Jukka K. Korpela wrote:

> Apparently the validator uses UTF-8 as the implied default.
> 
> The choice is impractical

I can't recall the RFC number off the top of my head, but HTTP assumes 
ISO-8859-1 as the default for all text/* media types. So it is not 
"impractical", it is clearly a bug.
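
To make the rule concrete, here is a minimal Python sketch of the default 
described above (as far as I know it comes from RFC 2616, section 3.7.1). 
The function name effective_charset is made up for illustration, and the 
sketch only models the HTTP-level default, not what any particular 
validator actually does.

```python
from email.message import Message


def effective_charset(content_type):
    """Return the charset implied by a Content-Type header value.

    Hypothetical helper: models only the HTTP/1.1 text/* default
    described in this message, nothing else.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()  # lowercased, or None if absent
    if charset:
        return charset
    if msg.get_content_maintype() == "text":
        # No charset parameter on a text/* type: HTTP default applies.
        return "iso-8859-1"
    # Non-text types carry no HTTP-level default.
    return None


print(effective_charset("text/html"))
print(effective_charset("text/html; charset=UTF-8"))
print(effective_charset("application/xml"))
```

Under this reading, a text/html response without a charset parameter 
should be treated as ISO-8859-1, not UTF-8.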

That's why text/xml was superseded by application/xml, where no such 
default is assumed. If there is no charset parameter, ISO-8859-1 should 
be assumed from the HTTP point of view, but an XML document without an 
XML declaration is assumed to be UTF-8 or UTF-16. However, HTTP takes 
precedence, so you end up decoding the XML content with the wrong 
encoding assumption. Not good. It sounds silly to serve XML with a 
content type other than text/*, but legacy is legacy :-(
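
A small Python sketch of what goes wrong, assuming a UTF-8 encoded XML 
body with no XML declaration that is served as text/xml without a charset 
parameter. The sample document and the two helper names are invented for 
illustration, and UTF-16/BOM detection is left out for brevity.

```python
# A UTF-8 encoded XML document without an XML declaration (sample data).
xml_bytes = "<name>Jiří</name>".encode("utf-8")


def decode_with_http_default(body):
    """Decode as a client applying the HTTP text/* default would."""
    return body.decode("iso-8859-1")


def decode_with_xml_default(body):
    """Decode as an XML processor would without any external hint.

    Simplification: always UTF-8, UTF-16 detection is not modelled.
    """
    return body.decode("utf-8")


wrong = decode_with_http_default(xml_bytes)
right = decode_with_xml_default(xml_bytes)

print(repr(right))
print(repr(wrong))  # each non-ASCII letter turns into two wrong characters
```

Decoding as ISO-8859-1 never fails, since every byte maps to some 
character, so the mistake shows up only as garbled text rather than as an 
error.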

				Jirka

-- 
------------------------------------------------------------------
   Jirka Kosek     e-mail: jirka@kosek.cz     http://www.kosek.cz
------------------------------------------------------------------
     Professional training and consulting in XML technologies.
    Have a look at our newly launched website http://DocBook.cz
   Detailed overview of training courses http://xmlguru.cz/skoleni/
------------------------------------------------------------------
                     Upcoming training dates:
      ** XSLT 13.-16.3.2006 ** XML Schemas 24.-26.4.2006 **
        ** DocBook 15.-17.5.2006 ** XSL-FO 12.-13.6.2006 **
------------------------------------------------------------------

Received on Thursday, 8 December 2005 22:13:34 UTC