Michael Kay scripsit:

> You could do it without changing the definition of well-formedness by
> saying that the set of characters considered to be whitespace, and
> normalized as such, is a property of the encoding.

Fine and dandy for EBCDIC, but not so good for Latin-1 as used on mainframes, where 0x85 = NEL.

Actually, not so good for EBCDIC either, because it means that each of the dozens of EBCDIC code pages has to exist in two flavors, a native flavor where 0x15 encodes U+0085, and an XML flavor where 0x15 encodes U+000A. This is more or less what FTP software has to do, and it's ugly.

--
John Cowan <jcowan@reutershealth.com>
http://www.reutershealth.com http://www.ccil.org/~cowan
.e'osai ko sarji la lojban.
Please support Lojban! http://www.lojban.org

Received on Friday, 26 July 2002 11:35:57 GMT
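
To make the "two flavors" objection concrete, here is a rough sketch in Python. It is only an illustration under assumptions: Python's `cp037` codec stands in for "an EBCDIC code page", the helper names `decode_native` and `decode_xml_flavor` are made up for this example, and the byte-level remapping is just one way such an XML flavor could be emulated, not how any particular parser does it.

```python
# Illustrative sketch only: cp037 is one EBCDIC code page among dozens,
# and the helper names below are hypothetical.

EBCDIC_NEL = 0x15  # byte that natively encodes U+0085 (NEL), per the message
EBCDIC_LF = 0x25   # byte that cp037 decodes to U+000A (LF)

# Byte translation table: rewrite 0x15 to 0x25, leave every other byte alone.
_XML_FLAVOR_TABLE = bytes(
    EBCDIC_LF if b == EBCDIC_NEL else b for b in range(256)
)


def decode_native(data: bytes) -> str:
    """Native flavor: 0x15 decodes to U+0085."""
    return data.decode("cp037")


def decode_xml_flavor(data: bytes) -> str:
    """Assumed XML flavor: 0x15 is treated as if it encoded U+000A."""
    return data.translate(_XML_FLAVOR_TABLE).decode("cp037")


# The same three bytes decode differently depending on the flavor chosen.
record = "a".encode("cp037") + bytes([EBCDIC_NEL]) + "b".encode("cp037")
print(repr(decode_native(record)))
print(repr(decode_xml_flavor(record)))

# Latin-1 has the same problem from the other side: 0x85 is already NEL,
# so it would need its own whitespace-normalizing variant as well.
print(repr(b"\x85".decode("latin-1")))
```

Every additional EBCDIC code page would need its own copy of the second function (or its own table), which is the duplication being objected to.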