Re: For review: The byte-order mark (BOM) in HTML from Asmus Freytag on 2012-12-20 (www-international@w3.org from October to December 2012)

From: Asmus Freytag <asmusf@ix.netcom.com>
Date: Thu, 20 Dec 2012 08:07:09 -0800
To: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
CC: Richard Ishida <ishida@w3.org>, www International <www-international@w3.org>
Message-ID: <50D337AD.3070700@ix.netcom.com>

On 12/20/2012 7:42 AM, Leif Halvard Silli wrote:
> Asmus Freytag, Thu, 20 Dec 2012 07:25:49 -0800:
>> On 12/20/2012 3:53 AM, Leif Halvard Silli wrote:
>>> How about being consistent about writing
>>>
>>>  byte order mark
>>>
>>> and not
>>>
>>>  byte-order mark
>>>
>>> since the former is the official form?
>> And not to forget,  BYTE ORDER MARK, when indicating the formal identifier.
> So then, let me rephrase my proposal: How about using the formal
> identifier rather than 'byte-order mark'?
>
> PS: I personally like the hyphened version. So this proposal is only
> based on a wish to promote the official codification.
Leif,

The rules for character name matching officially ignore the hyphen (that
is, you will never see a formal name "BYTE-ORDER MARK" that designates
something different from the BYTE ORDER MARK). Therefore, I see no
problem with using U+FEFF BYTE ORDER MARK to formally announce the
character and "byte-order mark", in lower case, in the running text,
using ordinary English rules of hyphenation.
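
To illustrate the matching rule described above, here is a minimal Python sketch. It is modelled loosely on the loose-matching rule for character names in UAX #44 (ignore case, whitespace, underscores, and medial hyphens). The function names `loose_key` and `same_name` are hypothetical, and the sketch deliberately omits the one documented exception for U+1180 HANGUL JUNGSEONG O-E, so treat it as an approximation rather than a conforming implementation.

```python
def loose_key(name: str) -> str:
    """Fold a character name for loose comparison (hypothetical helper).

    Ignores case, spaces, underscores, and medial hyphens, i.e. hyphens
    that sit directly between two letters or digits.
    Assumption: the U+1180 HANGUL JUNGSEONG O-E exception is not handled.
    """
    name = name.strip().upper().replace("_", " ")
    kept = []
    for i, ch in enumerate(name):
        is_medial_hyphen = (
            ch == "-"
            and 0 < i < len(name) - 1
            and name[i - 1].isalnum()
            and name[i + 1].isalnum()
        )
        if not is_medial_hyphen:
            kept.append(ch)
    # Whitespace is dropped last, so "-A" after a space keeps its hyphen.
    return "".join(kept).replace(" ", "")


def same_name(a: str, b: str) -> bool:
    """Return True if two spellings fold to the same loose key."""
    return loose_key(a) == loose_key(b)


if __name__ == "__main__":
    print(same_name("BYTE-ORDER MARK", "BYTE ORDER MARK"))
    print(same_name("byte order mark", "BYTE ORDER MARK"))
    print(same_name("TIBETAN LETTER -A", "TIBETAN LETTER A"))
```

Under these assumptions the hyphenated, unhyphenated, and lower-case spellings all fold to the same key, while a non-medial hyphen (as in TIBETAN LETTER -A) stays significant.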

The same rules also ignore case, but it is conventional to present the
formal identifier, if that is intended, in all uppercase. (Just as it is
conventional to use uppercase hex numbers and 4-6 digits with a U+
prefix for the character code to mark a Unicode code point in running
text.)
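
The code point convention mentioned above (a U+ prefix, uppercase hex, at least four and at most six digits) can be sketched in Python as follows. The function name `format_code_point` is hypothetical and used only for illustration.

```python
def format_code_point(ch: str) -> str:
    """Format a single character as U+XXXX (hypothetical helper).

    Uses uppercase hex, zero-padded to at least four digits. Code points
    above U+FFFF naturally take five or six digits.
    """
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return f"U+{ord(ch):04X}"


if __name__ == "__main__":
    for ch in ("\ufeff", "A", "\U0001F600"):
        print(format_code_point(ch))
```

Because the Unicode code space ends at U+10FFFF, the `04X` format never yields more than six hex digits for a valid Python character.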

I would definitely recommend against using BYTE-ORDER MARK anywhere,
because that might mislead people about which form of this identifier is
the published form.

A./

Received on Thursday, 20 December 2012 16:07:56 UTC