- From: Internationalization Working Group Issue Tracker <sysbot+tracker@w3.org>
- Date: Mon, 30 Mar 2015 13:31:57 +0000
- To: public-i18n-core@w3.org
I18N-ISSUE-439 (BUG24337): Authors should be able to use both "utf8" and "utf-8" labels, case-insensitively [encoding]

http://www.w3.org/International/track/issues/439

Raised by: Richard Ishida
On product: encoding

https://www.w3.org/Bugs/Public/show_bug.cgi?id=24337

This issue tracks the bug listed above and was created as part of the WG CR process.

---

(In reply to Geoffrey Sneddon from comment #0)
> Currently the spec says: 'Authors must use the utf-8 encoding and must use
> the "utf-8" label to identify it.'
>
> Given the label matching is done case-insensitively, it is not entirely
> clear whether authors must use this label case-sensitively or not. This
> should be clarified, preferably to allow either case (there is no practical
> benefit of requiring it to be lowercased).

Agreed.

> We should also make the "utf8" label conforming. Making this non-conforming
> is of no practical benefit and makes a large number of documents
> non-conforming.

This looks innocuous at first. However, in some products (in particular Oracle Databases), the label "utf8" is used for a variant of UTF-8 where characters outside the BMP are encoded with two surrogates, with a total of 6 bytes. For security reasons, this is prohibited in UTF-8.
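
To make the two points above concrete, here is a minimal Python sketch. It is an illustration only, not text from the Encoding spec or from any product. The names `UTF8_LABELS`, `is_utf8_label`, `surrogate_pair_bytes` and `strict_utf8_accepts` are hypothetical, the label table is a deliberately tiny subset, and `str.strip()` is a simplification of the whitespace trimming a real label matcher would do. The second half shows the 6-byte surrogate-pair form (the scheme often called CESU-8) next to the 4-byte UTF-8 form of the same character, and that a strict UTF-8 decoder rejects the former.

```python
# Hypothetical, deliberately tiny label table (illustration only).
UTF8_LABELS = {"utf-8", "utf8"}


def is_utf8_label(label: str) -> bool:
    """Match a label case-insensitively, ignoring surrounding whitespace."""
    return label.strip().lower() in UTF8_LABELS


def surrogate_pair_bytes(ch: str) -> bytes:
    """Encode one non-BMP character as two 3-byte surrogate sequences."""
    cp = ord(ch)
    if cp < 0x10000:
        raise ValueError("character is inside the BMP")
    cp -= 0x10000
    high = 0xD800 + (cp >> 10)    # high (lead) surrogate
    low = 0xDC00 + (cp & 0x3FF)   # low (trail) surrogate
    # "surrogatepass" lets Python emit the 3-byte form of each surrogate,
    # which the default strict UTF-8 encoder would refuse to produce.
    return (chr(high) + chr(low)).encode("utf-8", "surrogatepass")


def strict_utf8_accepts(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


ch = "\U0001F600"                    # a character outside the BMP
standard = ch.encode("utf-8")        # 4-byte UTF-8 form
variant = surrogate_pair_bytes(ch)   # 6-byte surrogate-pair form

print(is_utf8_label("UTF8"), is_utf8_label("Utf-8"))
print(len(standard), strict_utf8_accepts(standard))
print(len(variant), strict_utf8_accepts(variant))
```

Accepting "utf8" as a label is therefore a separate question from accepting the 6-byte byte sequences: a decoder can treat both labels as naming UTF-8 and still reject surrogate-encoded input.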
Received on Monday, 30 March 2015 13:31:59 UTC