validator/htdocs/config charset.cfg,1.14,1.15

Update of /sources/public/validator/htdocs/config
In directory hutz:/tmp/cvs-serv27307/htdocs/config

Modified Files:
	charset.cfg 
Log Message:
Re-populate the list of charsets (from older revisions in both the 0.7 and 0.8 branches),
to be used as follows:
* no fatal error if the charset is supported by Encode
* a warning suggesting a better alias if we know one
* a warning that the encoding may be "odd" if it is not in the list but Encode says it's OK
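
The three rules above can be sketched as a small decision function. This is only an illustration in Python, not the validator's actual code: the names `KNOWN`, `is_supported` and `check_charset` are hypothetical, Python's codec registry stands in for the Encode check, and handling the alias case before the support check is an assumption, since the log message does not say which comes first.

```python
import codecs

# Hypothetical excerpt of the list; the real entries live in charset.cfg.
# Each value is (action, result), following the "name = ? result" syntax.
KNOWN = {
    "utf-8": ("1", ""),
    "shift_jis": ("1", ""),
    "x-sjis": ("X", "shift_jis"),
}


def is_supported(name):
    """Stand-in for the Encode check: ask Python's codec registry instead."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


def check_charset(name, known=KNOWN, supported=is_supported):
    """Return (level, message) for a declared charset name."""
    name = name.strip().lower()
    action, result = known.get(name, (None, ""))
    if action == "X":
        # Assumption: a known poor alias is reported before support is checked.
        return ("warning", "use '%s' instead of '%s'" % (result, name))
    if not supported(name):
        return ("fatal", "'%s' is not a supported encoding" % name)
    if action is None:
        # Supported by the codec layer, but absent from the list.
        return ("warning", "'%s' is supported but may be odd" % name)
    return ("ok", "")
```

For example, `check_charset("x-sjis")` yields a warning that points at `shift_jis`, while a name the codec layer does not know at all yields a fatal result.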



Index: charset.cfg
===================================================================
RCS file: /sources/public/validator/htdocs/config/charset.cfg,v
retrieving revision 1.14
retrieving revision 1.15
diff -u -d -r1.14 -r1.15
--- charset.cfg	19 Jul 2007 03:59:23 -0000	1.14
+++ charset.cfg	19 Jul 2007 08:18:30 -0000	1.15
@@ -1,23 +1,78 @@
 #
-# List of encodings aliases and forbidden encodings
+# list of accepted/preferred character encodings 
 #
 # $Id$
+#
+# Syntax:
+#
+# charset/encoding      = ? result
+#
+# Note: charsets and results are lowercase, actions are uppercase
+#
+# ? indicates the action to take:
+# 1: OK, character supported
+# X: frequent error, e.g. starting with x-; ask user to replace with result
+# ERR: a charset we refuse, per some policy. Reason stated after ERR
 
-# This list indicates character encoding aliases that are 
-# not recommended, along with a recommended equivalent, e.g:
-# encoding-obscure = encoding-well-known
-
-# It also lists encoding names that the validator will refuse to treat:
-# bogus_encoding = Encoding Forbidden (Reason why)
-
-# The list is independent of what
-# is supported on a specific system but subject to the Validator
-# policy for acceptable encodings.
+#e.g:
+# utf-8         = 1
+# odd-alias     = X good-alias
+# bad_charset   = ERR explain reason
 
+utf-8                           = 1
+utf-16                          = 1
+utf-16be                        = 1
+utf-16le                        = 1
+iso-8859-1                      = 1
+iso-8859-2                      = 1
+iso-8859-3                      = 1
+iso-8859-4                      = 1
+iso-8859-5                      = 1
+iso-8859-6                      = 1
+# implicit bidi, but character encoding is the same
+iso-8859-6-i                    = 1
+iso-8859-7                      = 1
+iso-8859-8                      = 1
+# implicit bidi, but character encoding is the same
+iso-8859-8-i                    = 1
+iso-8859-9                      = 1
+iso-8859-10                     = 1
+iso-8859-11                     = 1
+# iso-8859-12 doesn't exist (yet?)
+iso-8859-13                     = 1
+iso-8859-14                     = 1
+iso-8859-15                     = 1
+iso-8859-16                     = 1
+us-ascii                        = 1
+iso-2022-jp                     = 1
+shift_jis                       = 1
+euc-jp                          = 1
+gb2312                          = 1
+big5                            = 1
+iso-2022-kr                     = 1
+euc-kr                          = 1
+gb18030                         = 1
+tis-620                         = 1
+koi8-r                          = 1
+koi8-u                          = 1
+iso-ir-111                      = 1
+windows-1250                    = 1
+windows-1251                    = 1
+windows-1252                    = 1
+windows-1253                    = 1
+windows-1254                    = 1
+windows-1255                    = 1
+windows-1256                    = 1
+windows-1257                    = 1
+# windows-1258                  = 1
+macintosh                       = 1
+ks_c_5601-1987                  = 1
+ksc_5601                        = 1
 
-x-mac-roman		= macintosh
-x-sjis			= shift_jis
-iso8859-1		= iso-8859-1
-ascii			= us-ascii
+x-mac-roman                     = X macintosh
+x-sjis                          = X shift_jis
+iso8859-1                       = X iso-8859-1
+ascii                           = X us-ascii
+8859_1                          = X iso-8859-1
 # this one is in IANA, but better use only windows-1252
-iso-8859-1-Windows-3.1-Latin-1	= windows-1252
+iso-8859-1-Windows-3.1-Latin-1  = X windows-1252
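
The `name = ? result` syntax documented in the new file header could be read with a few lines of code. The following is a minimal sketch in Python, assuming one entry per line and `#` comment lines; `parse_charset_cfg` is a hypothetical name, lowercasing the key is an assumption based on the header note that charsets are lowercase, and this is not how the validator itself loads its configuration.

```python
def parse_charset_cfg(text):
    """Parse 'name = ACTION result' lines into {name: (action, result)}."""
    table = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not an entry line
        parts = value.split(None, 1)
        if not parts:
            continue  # nothing after '='
        action = parts[0]  # "1", "X" or "ERR"
        result = parts[1].strip() if len(parts) > 1 else ""
        # Assumption: keys are compared in lowercase.
        table[name.strip().lower()] = (action, result)
    return table


SAMPLE = """
# comment
utf-8         = 1
x-sjis        = X shift_jis
bad_charset   = ERR explain reason
# windows-1258 = 1
"""

table = parse_charset_cfg(SAMPLE)
```

With the sample above, `table["x-sjis"]` is `("X", "shift_jis")`, and the commented-out `windows-1258` entry is ignored.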

Received on Thursday, 19 July 2007 08:18:39 UTC