- From: <bugzilla@jessica.w3.org>
- Date: Tue, 21 Jan 2014 05:34:57 +0000
- To: www-international@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=24337
Martin Dürst <duerst@it.aoyama.ac.jp> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |duerst@it.aoyama.ac.jp
--- Comment #1 from Martin Dürst <duerst@it.aoyama.ac.jp> ---
(In reply to Geoffrey Sneddon from comment #0)
> Currently the spec says: 'Authors must use the utf-8 encoding and must use
> the "utf-8" label to identify it.'
>
> Given the label matching is done case-insensitively, it is not entirely
> clear whether authors must use this label case-sensitively or not. This
> should be clarified, preferably to allow either case (there is no practical
> benefit of requiring it to be lowercased).
Agreed.
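
For illustration, case-insensitive label matching of the kind described above
could be sketched as follows. This is a minimal sketch, not the spec's
algorithm text: the names `UTF8_LABELS`, `ascii_lower`, and `is_utf8_label` are
hypothetical, and the label set is only an illustrative subset.

```python
# Hypothetical, illustrative subset of labels that map to UTF-8.
UTF8_LABELS = {"utf-8", "utf8"}

# ASCII whitespace: tab, line feed, form feed, carriage return, space.
ASCII_WHITESPACE = "\t\n\x0c\r "


def ascii_lower(s: str) -> str:
    """Lowercase only A-Z, leaving every other character untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def is_utf8_label(label: str) -> bool:
    """Return True if label matches a UTF-8 label, ignoring ASCII case."""
    return ascii_lower(label.strip(ASCII_WHITESPACE)) in UTF8_LABELS
```

Under this sketch, "UTF-8", "utf-8", and " Utf8 " all match, so a requirement
on the case of the label would not change how a document is decoded.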
> We should also make the "utf8" label conforming. Making this non-conforming
> is of no practical benefit and makes a large number of documents
> non-conforming.
This looks innocuous at first. However, in some products (in particular Oracle
databases), the label "utf8" denotes a variant of UTF-8 in which each character
outside the BMP is encoded as two surrogates of 3 bytes each, for a total of
6 bytes instead of the 4 bytes UTF-8 uses. For security reasons, UTF-8
prohibits encoding surrogates this way.
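
To make the byte counts concrete, here is a minimal Python sketch of the
surrogate-pair variant next to standard UTF-8. The function names
`encode_with_surrogates` and `is_strict_utf8` are hypothetical, and the encoder
is an assumption about how such a variant behaves, not Oracle's actual
implementation.

```python
def encode_with_surrogates(text: str) -> bytes:
    """Encode text so that each non-BMP character becomes two 3-byte surrogates.

    Assumes text contains no lone surrogates of its own.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            # BMP characters are encoded exactly as in UTF-8.
            out += ch.encode("utf-8")
        else:
            # Split the code point into a UTF-16 surrogate pair.
            v = cp - 0x10000
            high = 0xD800 + (v >> 10)
            low = 0xDC00 + (v & 0x3FF)
            for s in (high, low):
                # "surrogatepass" lets Python emit the 3-byte form of a surrogate.
                out += chr(s).encode("utf-8", "surrogatepass")
    return bytes(out)


def is_strict_utf8(data: bytes) -> bool:
    """Return True if data decodes under Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

For U+1F600, `"\U0001F600".encode("utf-8")` yields 4 bytes, whereas
`encode_with_surrogates("\U0001F600")` yields 6 bytes, and `is_strict_utf8`
rejects the latter.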
Received on Tuesday, 21 January 2014 05:34:59 UTC