RE: Opening files in Notepad with/without UTF-8 signature

> With respect to Notepad, one of the considerations we 
> discussed is that Windows has 3 file systems. One of them, 
> NTFS, maintains additional information about files. We 
> speculated it recorded encoding. So, if notepad made use of 
> that, the BOM might be less necessary in that environment. 
> (Which I suspect is also your environment, as NTFS is the 
> default for hard disks.)  

I have NTFS, but in the Open dialog box Notepad can only detect that a
file is UTF-8, before opening it, if the signature is present.  Remove
the signature and it no longer knows.  So it seems to me that NTFS
doesn't record the encoding.  The same happens for HTML files that
contain an encoding declaration.
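
To make "signature present" concrete, here is a minimal Python sketch of
a byte-level check for the UTF-8 signature (the bytes EF BB BF).  The
function name is hypothetical, and this only illustrates the idea; I am
not claiming it is how Notepad itself is implemented.

```python
import codecs


def has_utf8_signature(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 signature (BOM)."""
    # codecs.BOM_UTF8 is the three-byte sequence b"\xef\xbb\xbf".
    return data.startswith(codecs.BOM_UTF8)


print(has_utf8_signature(b"\xef\xbb\xbfhello"))  # True
print(has_utf8_signature(b"hello"))              # False
```

Only the first three bytes of the file need to be read for this, which
would explain why it can be done before the file is actually opened.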

> And if they use right-click, they won't even get the choice.

Note that Notepad seems to apply some heuristics to the file when you
right-click.  It always opens the file correctly as UTF-8.
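
I don't know what heuristics Notepad actually uses, so the following
Python sketch is only an assumption about what such a check could look
like: treat the data as UTF-8 if it decodes strictly as UTF-8 and
contains at least one non-ASCII byte.  The function name is
hypothetical.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Guess whether signature-less data is UTF-8 (illustrative only)."""
    try:
        # Strict decoding fails on byte sequences that are not valid UTF-8.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # Pure ASCII decodes fine too, but gives no evidence either way,
    # so require at least one byte outside the ASCII range.
    return any(b >= 0x80 for b in data)


print(looks_like_utf8("café".encode("utf-8")))    # True
print(looks_like_utf8("café".encode("latin-1")))  # False
print(looks_like_utf8(b"plain ascii"))            # False
```

A guess like this can be wrong for short files or unusual legacy text,
which is one reason the signature is still useful.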


> The solution to the FAQ is just to include a sentence or two 
> indicating that if you are going to remove a BOM, you should 
> know how the file is used and verify whether it will have an 
> impact, (or remove it and monitor that subsequent processing 
> doesn't break).

That sounds fair enough.
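
For anyone who does decide to remove the signature, a minimal Python
sketch of the removal step follows, together with a check that the text
itself is unchanged.  The function name is hypothetical.  The check only
covers the content; whether later processing still copes without the
signature has to be verified separately, as suggested above.

```python
import codecs


def strip_utf8_signature(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 signature, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


original = codecs.BOM_UTF8 + "café".encode("utf-8")
stripped = strip_utf8_signature(original)

# The "utf-8-sig" codec skips a leading signature when decoding, so both
# decodes should give the same text.
assert original.decode("utf-8-sig") == stripped.decode("utf-8")
```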

RI

Received on Thursday, 6 November 2003 02:44:40 UTC