- From: Anne van Kesteren <annevk@opera.com>
- Date: Wed, 18 Apr 2012 19:12:46 +0200
On Wed, 18 Apr 2012 15:40:33 +0200, Glenn Maynard <glenn at zewt.org> wrote:
> "This is a decoder error" seems odd; it's descriptive language ("this
> thing must be made true") rather than declarative ("do this thing").
> I'd suggest the declarative language "Emit a decoder error" and "Emit an
> encoder error".

Yes. Awesome suggestion implemented.

> "If code point is equal or greater than lower boundary" is more naturally
> "greater than or equal to" (and "less than or equal to"). That said,
> this would be much clearer with interval syntax:
>
> "If code point is in the range [*lower boundary*, 0x10FFFF] and is not in
> the range [0xD800, 0xDFFF], emit code point (and continue)."
>
> which I think is easier to read, and also makes it clear that the "0xD800
> to 0xDFFF" is a closed interval (0xD800 and 0xDFFF are included).

Then we'd first have to introduce interval syntax to the English language.
I suppose we could do that in the Terminology section if you think it
would be better.

>> An encoder contains one or more encoder error points. Unless stated
>> otherwise the encoder is terminated at that point.
>
> Encoding form data, at least, doesn't abort on the first error; any
> unrepresentable codepoints are encoded as &x1234;. (It would sure be
> nice if encoding to non-Unicode-based encodings would just *always* use
> that syntax for non-ASCII, so the encoders could be dropped, but I guess
> that would trigger bugs in pages that are currently masked...) Is there
> any encoding path in browsers that does give up on the first error?

It has been proposed for the API. And in URLs you do not get "&#...;"
(though in WebKit you do) but you get "?" (IE at the network layer, Opera
earlier on) or the utf-8 representation (Gecko is totally weird). Maybe we
should align URLs with <form> here and use "&#...;" throughout if that is
compatible with content. That probably deserves a discussion in its own
thread.
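
For illustration only, here is a rough Python sketch of the two points above. It is not text from the specification and not what any browser does internally. The function name emit_or_error is hypothetical, and Python's built-in "strict", "xmlcharrefreplace", and "replace" error handlers are used merely as stand-ins for "terminate at the first error", the <form>-style "&#...;" fallback, and the "?" fallback seen in some URL handling.

```python
def emit_or_error(code_point: int, lower_boundary: int) -> bool:
    """Return True if the code point would be emitted, False for a decoder error.

    Both intervals are closed: [lower_boundary, 0x10FFFF] and [0xD800, 0xDFFF].
    """
    in_range = lower_boundary <= code_point <= 0x10FFFF
    is_surrogate = 0xD800 <= code_point <= 0xDFFF
    return in_range and not is_surrogate


# U+00E9 is representable in latin-1, U+20AC (decimal 8364) is not.
text = "caf\u00e9 \u20ac"

# <form>-style fallback: the unrepresentable code point becomes "&#...;".
form_style = text.encode("latin-1", errors="xmlcharrefreplace")

# "?" fallback, as described for some URL handling.
question_mark_style = text.encode("latin-1", errors="replace")

# Terminate at the first error, as proposed for the API.
try:
    text.encode("latin-1", errors="strict")
    terminated = False
except UnicodeEncodeError:
    terminated = True

print(form_style)           # b'caf\xe9 &#8364;'
print(question_mark_style)  # b'caf\xe9 ?'
print(terminated)           # True
```

The chained comparisons make the closed endpoints explicit, which is the property the interval notation was meant to convey: 0xD800 and 0xDFFF are both rejected, and 0x10FFFF is accepted.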
I do not know any cases beyond URLs, <form>, and the proposed API that
require an encoder in the platform.

--
Anne van Kesteren
http://annevankesteren.nl/
Received on Wednesday, 18 April 2012 10:12:46 UTC