
[Bug 7461] List of space characters should include U+000B LINE TABULATION (VT) or should note why it is not included.

From: <bugzilla@wiggum.w3.org>
Date: Sat, 17 Oct 2009 02:16:45 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1Myyqb-0001WA-Mh@wiggum.w3.org>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=7461

--- Comment #2 from mdmkolbe@yahoo.com  2009-10-17 02:16:45 ---
If U+000B LINE TABULATION (a.k.a. Vertical Tab, VT) were included, then
this list of characters would be precisely those that satisfy either of the
following two equivalent definitions:
 - have the Unicode White_Space property and are in Basic Latin (i.e. 7-bit
ASCII)
 - are "standard white-space characters" (i.e. those for which isspace()
returns true in the "C" locale) in the C99 standard [1].

To a naive reader (like me), it is surprising that VT has been excluded.  After
all, the other "strange" ASCII white-space character, U+000C FORM FEED (FF), is
included.  It seems rather arbitrary to include FF but exclude VT.  Is there
some technical distinction that I'm missing that would explain why one is
included but the other excluded?

I acknowledge that HTML 4.01 also allowed FF but excluded VT [2].  However, I
haven't been able to find any documentation explaining why.  On the face of it,
this looks like it could have been an oversight or a holdover from SGML.

[1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1336.pdf page 183
[2] http://unicode.org/reports/tr20/#White


Received on Saturday, 17 October 2009 02:16:47 GMT