Re: [css-text] Control characters from Koji Ishii on 2014-06-29 (www-style@w3.org from June 2014)

From: Koji Ishii <kojiishi@gluesoft.co.jp>
Date: Sun, 29 Jun 2014 11:51:35 +0000
To: Jonathan Kew <jfkthame@gmail.com>
CC: Brad Kemper <brad.kemper@gmail.com>, Tab Atkins Jr. <jackalmage@gmail.com>, Anne van Kesteren <annevk@annevk.nl>, Zack Weinberg <zackw@panix.com>, fantasai <fantasai.lists@inkedblade.net>, www-style list <www-style@w3.org>
Message-ID: <3BCCEDFD-A540-45DF-9365-641CFBEB5AF4@gluesoft.co.jp>

> They don't have the default-ignorable property.

Interesting. According to the data, they were default-ignorable up through Unicode 5.0, and then Unicode removed them in 5.1. I guess we need to learn why Unicode did so if we are going to spend more effort on this topic.
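
For anyone who wants to check this against the data files, here is a minimal sketch of reading Default_Ignorable_Code_Point ranges from text in the DerivedCoreProperties.txt layout. The SAMPLE excerpt is a tiny hand-written illustration and is not the full data for any Unicode version; the helper names parse_ranges and is_default_ignorable are hypothetical. To compare 5.0 with 5.1 you would feed in the complete file for each version.

```python
import unicodedata

# Hand-written excerpt in the DerivedCoreProperties.txt layout (assumption:
# illustrative only, NOT the complete data for any Unicode version).
SAMPLE = """\
# Derived Property: Default_Ignorable_Code_Point
00AD          ; Default_Ignorable_Code_Point # Cf       SOFT HYPHEN
200B..200F    ; Default_Ignorable_Code_Point # Cf   [5] ZERO WIDTH SPACE..RIGHT-TO-LEFT MARK
"""


def parse_ranges(text, prop="Default_Ignorable_Code_Point"):
    """Return (first, last) code point pairs listed for the given property."""
    ranges = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop trailing comment
        if not data:
            continue
        fields = [f.strip() for f in data.split(";")]
        if len(fields) < 2 or fields[1] != prop:
            continue
        first, _, last = fields[0].partition("..")
        ranges.append((int(first, 16), int(last or first, 16)))
    return ranges


def is_default_ignorable(cp, ranges):
    """True if the code point falls inside any parsed range."""
    return any(lo <= cp <= hi for lo, hi in ranges)


ranges = parse_ranges(SAMPLE)
for cp in (0x0001, 0x00AD, 0x200B):
    # General category comes from the Unicode version bundled with Python.
    print(f"U+{cp:04X}", unicodedata.category(chr(cp)),
          is_default_ignorable(cp, ranges))
```

Running the same lookup over the full 5.0 and 5.1 files should show where the control characters (general category Cc) dropped out of the set.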

> Now, I suppose we could specify (somewhere - though I don't see how this would fall within the scope of CSS) that text processes such as searching, sorting, indexing, etc., within the web platform should base their behavior *not* on the (normative) Unicode character properties, but on something else that we specify independently. But IMO this would be a *REALLY* bad idea. There's a standard; we should follow it.
> 
> This isn't just about behavior within the web platform, but also consistency and interoperability with text processing in other environments. The more closely we all keep to the relevant standards, the better for everyone.

First of all, CSS defines rendering, so searching, sorting, indexing, etc. are out of scope. Second, as far as I understand, there’s nothing in Unicode stating that higher-level protocols should render control characters, though it may not recommend against it either. In that case, we’re not violating the normative Unicode character properties at all.

By the way, my personal +1 is to Brad. Searching and copying text in browsers sometimes bother me too, so I share your concern, but improving search is a more appropriate way to address what you want to solve than discussing the rendering of control characters. W3C does not have such a spec today, but you could suggest that W3C create a spec for text-izing HTML content, which should help interoperable behavior for searching, sorting, indexing, etc.

/koji

Received on Sunday, 29 June 2014 11:52:09 UTC