
[whatwg] Editorial: ASCII case-insensitive string comparison

From: Øistein E. Andersen <liszt@coq.no>
Date: Sat, 12 May 2012 13:47:02 +0100
To: whatwg@whatwg.org
Message-Id: <CC2C2532-FA1E-45BA-91F2-153FEFA2CBFB@coq.no>

When I read Anne van Kesteren's Encoding specification recently, I came across the following definition, borrowed from HTML5:

> Comparing two strings in an ASCII case-insensitive manner means comparing them exactly, code point for code point, except that the characters in the range U+0041 to U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z) and the corresponding characters in the range U+0061 to U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z) are considered to also match.


The construction ‘are considered to also match’ seems awkward here, since the intended meaning is clearly not that the characters match in addition to doing something else, as in ‘I don’t just want you to laugh but to also sing along’ or ‘our face/tongue system allow[s] us to talk and eat—but also to sing and act’.

The most natural place for ‘also’ is probably in front of ‘considered’ (yielding ‘are also considered to match’).

(Another solution would be to remove the need for ‘also’ by rewriting the phrase, for instance to something like ‘except that the characters in the range U+0041 to U+005A ([...] A to [...] Z) are considered equivalent to the corresponding characters in the range U+0061 to U+007A ([... a] to [... z])’.)
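
For reference, the comparison that the quoted definition describes can be sketched in Python as follows. This is only an illustration of the "considered equivalent" reading: the function names `ascii_lower` and `ascii_case_insensitive_equal` are made up for this example and come from neither specification.

```python
def ascii_lower(s: str) -> str:
    """Map U+0041..U+005A (A-Z) to U+0061..U+007A (a-z); leave all other code points alone."""
    return "".join(
        chr(ord(c) + 0x20) if "\u0041" <= c <= "\u005A" else c
        for c in s
    )


def ascii_case_insensitive_equal(a: str, b: str) -> bool:
    """Compare code point for code point, treating A-Z as equivalent to a-z (hypothetical helper)."""
    return ascii_lower(a) == ascii_lower(b)


print(ascii_case_insensitive_equal("UTF-8", "utf-8"))  # True
# U+212A KELVIN SIGN lies outside the ASCII range, so it does not match "k" here.
print(ascii_case_insensitive_equal("\u212A", "k"))  # False
# A full Unicode lowercase mapping, by contrast, maps it to "k".
print("\u212A".lower() == "k")  # True
```

The last two lines show why the restriction to the two ASCII ranges matters: only A to Z and a to z are folded, and every other character must match exactly.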

Øistein E. Andersen
Received on Saturday, 12 May 2012 12:47:44 GMT
