- From: Tex Texin <tex@i18nguy.com>
- Date: Thu, 06 Nov 2003 05:30:53 -0500
- To: ishida@w3.org
- Cc: public-i18n-geo@w3.org
Richard Ishida wrote:

> I have NTFS but Notepad is only able to detect that a file is UTF-8
> before opening with the Open dialog box if the signature is present.
> Remove the signature and it no longer knows. So it seems to me that
> NTFS doesn't remember the encoding. Same happens for html files with
> encoding declaration.

Possibly. We were speculating that it also depended on how the file was
created. But since, as you say, it is also doing some detection based on
heuristics, the result is going to depend on the data in the file, which
means our studies with a handful of files are not conclusive.

Apparently it also depends on how you open the file, since you say above
that the Open dialog box looks for the signature, but below that with
right-click it's always correct. Odd that they don't use the same
detection for both.

At least we are together on the conclusion! ;-)

>
> > And if they use right-click, they won't even get the choice.
>
> Note that Notepad seems to apply some heuristics to the file when you
> right click. It always opens correctly as utf-8.
>
> > The solution to the FAQ is just to include a sentence or two
> > indicating that if you are going to remove a BOM, you should
> > know how the file is used and verify whether it will have an
> > impact, (or remove it and monitor that subsequent processing
> > doesn't break).
>
> That sounds fair enough.
>
> RI

--
-------------------------------------------------------------
Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
Xen Master                          http://www.i18nGuy.com

XenCraft                            http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------
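
For illustration only, the difference between a signature check and a content-based heuristic, as discussed in this thread, might be sketched in Python as below. This is not Notepad's actual logic, which is not documented here; the function names are hypothetical, and the "heuristic" is simply an assumption that a file which decodes cleanly as UTF-8 is UTF-8.

```python
import codecs

# The UTF-8 signature (BOM) is the three bytes EF BB BF.
UTF8_SIGNATURE = codecs.BOM_UTF8


def has_utf8_signature(data: bytes) -> bool:
    """Signature check: true only if the data starts with the UTF-8 BOM."""
    return data.startswith(UTF8_SIGNATURE)


def looks_like_utf8(data: bytes) -> bool:
    """Content heuristic (an assumption, not Notepad's algorithm):
    treat the data as UTF-8 if it decodes without error."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def strip_utf8_signature(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, if one is present."""
    if has_utf8_signature(data):
        return data[len(UTF8_SIGNATURE):]
    return data
```

Once the signature is stripped, only the content heuristic can still recognise the file, and its answer depends on the bytes in the file. That is why a handful of test files cannot be conclusive, and why anyone removing a BOM should verify that later processing still works.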
Received on Thursday, 6 November 2003 05:31:54 UTC