
Re: BOM or Signature

From: Tex Texin <tex@i18nguy.com>
Date: Thu, 06 Nov 2003 00:51:57 -0500
Message-ID: <3FA9E17D.AFFC5003@i18nguy.com>
To: ishida@w3.org
Cc: public-i18n-geo@w3.org

Thanks for the pointers. 
My reading of these two sections is that the term "signature" refers to the
character's purpose; it is not used as a name for the character.
The one exception is the table and related text in section 15.9. There, in my
estimation, the term "signature" refers to the pattern of bytes, since in that
usage the bytes do not represent a character. I.e., a detector would be
looking for the byte patterns and not thinking in terms of characters or

But the section reminds me of another concern, which is that the BOM might actually be
Although this won't be the case for markup, to the extent that more complex
systems might be concatenating data from different sources, we perhaps should
make sure that no one misinterprets the instructions as a signal to remove BOMs
from the front of any type of file. (Unlikely, I know, but necessary.)
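
To illustrate what a detector working purely on byte patterns might look
like, here is a rough sketch in Python. It is an illustration only, not
something taken from the Unicode Standard or the CharMod: the function name
detect_signature and the returned labels are made up for this example, and
the byte patterns come from the constants in Python's standard codecs module.
It only reports a match and never removes anything from the data.

```python
import codecs

# Byte patterns to look for at the start of the data. The longer UTF-32
# patterns are listed first, because the UTF-32LE pattern begins with the
# same two bytes as the UTF-16LE pattern.
SIGNATURES = [
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
]


def detect_signature(data: bytes):
    """Return (label, pattern_length) for a leading signature, else (None, 0).

    Hypothetical helper: it compares raw bytes only and does not decode,
    interpret, or strip anything.
    """
    for pattern, label in SIGNATURES:
        if data.startswith(pattern):
            return label, len(pattern)
    return None, 0
```

Whether the matched bytes should then be kept or dropped is a separate
decision that depends on the type of file, which is the concern above.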


Richard Ishida wrote:
> Tex,
> You were right that the term signature is not used in the CharMod. I
> was wrong about that.  But then 'byte order mark' and 'BOM' don't appear
> there either, as far as I can tell.
> But look at section 2.11 of the Unicode Standard v4 (pp47-48) and you
> will see a distinction drawn between the terms Byte Order Mark and
> Unicode Signature.  In section 15.9, it refers to 'Unicode Encoding Form
> Signatures'.
> RI
> ============
> Richard Ishida
> W3C
> contact info: http://www.w3.org/People/Ishida/
> http://www.w3.org/International/
> http://www.w3.org/International/geo/
> W3C Internationalization FAQs
> http://www.w3.org/International/questions.html
> RSS feed: http://www.w3.org/International/questions.rss

Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
Xen Master                          http://www.i18nGuy.com
XenCraft                            http://www.XenCraft.com
Making e-Business Work Around the World
Received on Thursday, 6 November 2003 00:52:55 UTC
