[whatwg] Serialising HTML to Files in Non-Unicode Encodings

From: Lachlan Hunt <lachlan.hunt@lachy.id.au>
Date: Thu, 16 Aug 2007 00:12:19 +1000
Message-ID: <46C309C3.3060808@lachy.id.au>
There is a possible issue with the "Serialising HTML fragments" section 
[1].  The algorithm seems fine for use with things like innerHTML, but 
there are other issues that should be considered when serialising to a 
file, database, network stream or similar destination.

Such serialisers should consider the character encoding.  Although a 
Unicode encoding should ideally be used, some serialisers may need to 
serialise to a different encoding at the request of the user or because 
of limitations of the environment.  In such cases, the serialisation 
should output appropriate character references for characters that can't 
be represented in the target encoding.
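
As a rough illustration of that fallback (a sketch only, not anything the 
spec defines): in Python, the built-in "xmlcharrefreplace" error handler 
of str.encode() emits a decimal numeric character reference for each 
character the target encoding lacks.  The function name 
encode_serialised_html is hypothetical, and the sketch assumes the input 
is already-serialised markup whose unencodable characters occur only in 
ordinary text or attribute values; character references are not 
interpreted inside script or style elements or in comments, so a real 
serialiser would need to treat those separately.

```python
def encode_serialised_html(markup: str, encoding: str) -> bytes:
    """Encode serialised HTML, falling back to numeric character references.

    Characters that `encoding` cannot represent become decimal references
    such as &#8364;. Hypothetical helper, for illustration only.
    """
    return markup.encode(encoding, errors="xmlcharrefreplace")


if __name__ == "__main__":
    # U+00E9 exists in ISO-8859-1; U+20AC (euro sign) does not.
    print(encode_serialised_html("<p>caf\u00e9 \u20ac</p>", "iso-8859-1"))
    # prints b'<p>caf\xe9 &#8364;</p>'
```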

It should also handle outputting the appropriate <meta charset=""> 
and/or BOM, especially in environments that can't declare it at the 
transport level like HTTP can.
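
A minimal sketch of what that could look like when writing a whole 
document, again in Python.  The function serialise_document, its 
parameters and the document skeleton are all assumptions made for the 
example, not something the spec provides.  It assumes the caller passes 
an encoding name that is valid both as a Python codec name and as an 
HTML encoding label (for example "utf-8" or "iso-8859-1"), and it only 
handles the UTF-8 BOM.

```python
import codecs


def serialise_document(body_html: str, encoding: str, bom: bool = False) -> bytes:
    """Wrap a serialised body in a document that declares its own encoding.

    Hypothetical helper: `encoding` is written into <meta charset> as given
    and is also used as the Python codec name.
    """
    document = (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="{encoding}"></head>'
        f"<body>{body_html}</body></html>\n"
    )
    # Unencodable characters fall back to numeric character references.
    data = document.encode(encoding, errors="xmlcharrefreplace")
    # Prepend a BOM only when requested and only for UTF-8.
    if bom and codecs.lookup(encoding).name == "utf-8":
        data = codecs.BOM_UTF8 + data
    return data


if __name__ == "__main__":
    with open("out.html", "wb") as f:
        f.write(serialise_document("<p>\u20ac 5</p>", "iso-8859-1"))
```

Placing the declaration first in the head keeps it near the start of the 
file, where a parser that has no transport-level information can find it 
early.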

Perhaps the spec should say something about this issue somewhere.

[1] http://www.whatwg.org/specs/web-apps/current-work/#serialising

Lachlan Hunt
Received on Wednesday, 15 August 2007 07:12:19 UTC