- From: Křištof Želechovski <giecrilj@stegny.2a.pl>
- Date: Wed, 15 Aug 2007 17:56:10 +0200
Any serializer that needs an exotic character set should also have a way of retrieving that character set. The character set need not be specified within the stored fragment, because it would probably be the same for many fragments. The serializer should rather store it elsewhere, without modifying the original HTML text: as a record attribute, a column property, an optional parameter, or a built-in configuration parameter.

Chris

-----Original Message-----
From: whatwg-bounces@lists.whatwg.org [mailto:whatwg-bounces at lists.whatwg.org] On Behalf Of Lachlan Hunt
Sent: Wednesday, August 15, 2007 4:12 PM
To: whatwg
Subject: [whatwg] Serialising HTML to Files in Non-Unicode Encodings

Hi,

There is a possible issue with the serialising HTML fragments section [1]. The algorithm seems fine for use with things like innerHTML, but there are other issues that should be considered when serialising to a file, database, network stream, or similar.

Such serialisers should consider the character encoding. Although a Unicode encoding should ideally be used, some serialisers may need to serialise to a different encoding at the request of the user or because of limitations of the environment. In such cases, the serialisation should output appropriate character references for characters that can't be represented in the target encoding. It should also handle outputting the appropriate <meta charset=""> and/or BOM, especially in environments that can't declare the encoding at the transport level like HTTP can.

Perhaps the spec should say something about this issue somewhere.

[1] http://www.whatwg.org/specs/web-apps/current-work/#serialising

--
Lachlan Hunt
http://lachy.id.au/
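The two suggestions in this thread can be combined in a small sketch: encode an already-serialised fragment into a non-Unicode encoding, emitting numeric character references for anything the encoding can't represent, and keep the encoding name next to the bytes instead of inside the markup. This is only an illustration under stated assumptions, not something the spec defines: `StoredFragment` and `serialise_fragment` are hypothetical names, and Python's built-in `xmlcharrefreplace` error handler stands in for the "appropriate character references" step. It is only safe where character references are actually parsed, so not inside script or style content or comments.

```python
import html
from dataclasses import dataclass


@dataclass
class StoredFragment:
    """Hypothetical record: encoded bytes plus the charset kept as a record attribute."""
    data: bytes
    charset: str


def serialise_fragment(markup: str, charset: str) -> StoredFragment:
    """Encode serialised HTML text into `charset`.

    Characters the target encoding cannot represent are written as
    numeric character references (for example U+20AC becomes &#8364;).
    The markup itself is not modified to declare the charset.
    """
    data = markup.encode(charset, errors="xmlcharrefreplace")
    return StoredFragment(data=data, charset=charset)


# Usage: 'é' exists in ISO-8859-1 and is stored as a single byte,
# while 'Ž' and '€' do not and become character references.
record = serialise_fragment("<p>Ž é €</p>", "iso-8859-1")
print(record.data)     # b'<p>&#381; \xe9 &#8364;</p>'
print(record.charset)  # iso-8859-1

# Reading back: decode with the stored charset; html.unescape here only
# mimics what an HTML parser would do with the references.
restored = html.unescape(record.data.decode(record.charset))
```

Keeping `charset` on the record rather than in a `<meta>` element matches the reply's point that many fragments would share one encoding, so the fragment text can stay untouched.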
Received on Wednesday, 15 August 2007 08:56:10 UTC