- From: Kang-Hao (Kenny) Lu <kennyluck@csail.mit.edu>
- Date: Thu, 12 Jan 2012 23:57:58 +0800
- To: WWW Style <www-style@w3.org>
- CC: Jonathan Kew <jonathan@jfkew.plus.com>
(12/01/12 22:00), Jonathan Kew wrote:
> What if an unpaired UTF-16 surrogate codepoint is found? ("Proceed as
> usual", I suppose. What do the various browsers do with this currently?)

IE9, Safari5.1, Chromium18, and Opera12alpha all "proceed as usual" in my test case [1], although for unknown reasons Opera12alpha doesn't display the character in the first line.

> My preference would be to explicitly disallow character escapes in the
> range \d800..\dfff. That seems cleaner, simpler to understand, and easier
> to implement than special-casing pairs of <high surrogate, low surrogate>
> and then deciding how to deal with unpaired surrogates sensibly.

My preference is to turn them all into U+FFFD, to match how the HTML5 parser treats numeric character references to surrogates (&#xD800; and so on) [2]. (I'm not sure whether "disallowing" these escapes would mean that browsers drop the whole line.) I don't have a strong opinion on this, though.

[1] http://lists.w3.org/Archives/Public/www-archive/2012Jan/att-0005/surrogates-in-css
[2] http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#tokenizing-character-references

Cheers,
Kenny
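
For illustration only, here is a minimal Python sketch of the "turn them all into U+FFFD" option. It is not taken from any browser or from a spec draft. The function name decode_hex_escapes is hypothetical, the escape grammar is simplified (CRLF is not treated as a single terminator), and mapping values above U+10FFFF to U+FFFD is an extra assumption made only so the sketch never fails.

```python
import re

REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER

# Simplified CSS hex escape: a backslash, 1-6 hex digits, and at most one
# trailing whitespace character that terminates the escape.
_HEX_ESCAPE = re.compile(r"\\([0-9a-fA-F]{1,6})[ \t\n\r\f]?")


def decode_hex_escapes(text: str) -> str:
    """Decode CSS hex escapes, mapping every surrogate escape to U+FFFD."""

    def replace(match: "re.Match[str]") -> str:
        code_point = int(match.group(1), 16)
        # Paired and unpaired surrogates are treated alike: each escape in
        # the range D800..DFFF becomes U+FFFD on its own.
        if 0xD800 <= code_point <= 0xDFFF:
            return REPLACEMENT
        # Assumption: values beyond the Unicode range are also replaced.
        if code_point > 0x10FFFF:
            return REPLACEMENT
        return chr(code_point)

    return _HEX_ESCAPE.sub(replace, text)
```

With this approach no pairing logic is needed: an escaped <high surrogate, low surrogate> pair such as \d83d\de00 simply yields two U+FFFD characters, while an escape such as \1F600 decodes to the single code point directly.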
Received on Thursday, 12 January 2012 16:03:20 UTC