- From: Jonathan Kew <jfkthame@gmail.com>
- Date: Sun, 29 Jun 2014 08:11:02 +0100
- To: Brad Kemper <brad.kemper@gmail.com>
- CC: "Tab Atkins Jr." <jackalmage@gmail.com>, Koji Ishii <kojiishi@gluesoft.co.jp>, Anne van Kesteren <annevk@annevk.nl>, Zack Weinberg <zackw@panix.com>, fantasai <fantasai.lists@inkedblade.net>, www-style list <www-style@w3.org>
On 29/6/14 05:33, Brad Kemper wrote:
>
> On Jun 27, 2014, at 1:57 PM, Jonathan Kew <jfkthame@gmail.com>
> wrote:
>
>> That's not necessarily true. ZWNJ (and a number of other
>> normally-invisible characters) are defined to be "default
>> ignorable", so processes such as searching that base their behavior
>> on Unicode character properties should be able to ignore them
>> appropriately.
>>
>> [...]
>
>> But the C0/C1 control characters - apart from a few exceptions like
>> newline - do not have any legitimate use as part of text on the
>> web; their defined control functions such as <start of text> or
>> <end of transmission block> are provided by entirely different
>> levels of the platform.
>
> Then why not have the control characters ignored when searching for
> text too?

They don't have the default-ignorable property.

Now, I suppose we could specify (somewhere - though I don't see how this would fall within the scope of CSS) that text processes such as searching, sorting, indexing, etc., within the web platform should base their behavior *not* on the (normative) Unicode character properties, but on something else that we specify independently.

But IMO this would be a *REALLY* bad idea. There's a standard; we should follow it. This isn't just about behavior within the web platform, but also consistency and interoperability with text processing in other environments. The more closely we all keep to the relevant standards, the better for everyone.

JK
Received on Sunday, 29 June 2014 07:11:24 UTC