Re: PE134

Richard Tobin wrote:
>>It turns out that knowing the encoding family is 
>>sufficient to reliably recognize U+0020 SPACE as well as most ASCII 
>>characters
> 
> What is the significance of "most" here?  If you know the encoding is
> an ASCII superset, you can recognize all ASCII characters.

But if you detect an EBCDIC-family encoding, you know the positions of 
only "most" ASCII characters, since the subset common to all EBCDIC code 
pages does not cover all of ASCII.  That subset is nevertheless 
sufficient to parse the XML declaration, find the encoding declaration 
within it and learn exactly which EBCDIC code page you have.
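
To make the two-step idea concrete, here is a rough sketch in Python, 
not a definitive implementation.  The function name is made up for 
illustration.  It assumes the document starts with an XML declaration, 
recognizes the EBCDIC family by the bytes 4C 6F A7 94 ("<?xm"), and 
uses cp037 only as a provisional code page, on the assumption that 
every character allowed in the declaration sits at the same position 
in all EBCDIC pages.

```python
import re

# "<?xm" as encoded in the EBCDIC family (bytes 4C 6F A7 94).
EBCDIC_MAGIC = bytes([0x4C, 0x6F, 0xA7, 0x94])

# "?>" in EBCDIC: '?' is 0x6F and '>' is 0x6E.
EBCDIC_DECL_END = bytes([0x6F, 0x6E])

# Matches encoding="NAME" or encoding='NAME' in the decoded declaration.
ENCODING_RE = re.compile(
    r"encoding\s*=\s*([\"'])([A-Za-z][A-Za-z0-9._-]*)\1"
)


def sniff_ebcdic_encoding(data: bytes):
    """Return the declared encoding name, or None.

    Hypothetical helper: returns None when the input is not in the
    EBCDIC family or carries no encoding declaration.
    """
    if not data.startswith(EBCDIC_MAGIC):
        return None  # not EBCDIC-family, or no XML declaration

    end = data.find(EBCDIC_DECL_END)
    if end == -1:
        return None  # declaration is never closed

    # Provisional decode: cp037 stands in for "some EBCDIC page".
    # Assumption: the declaration uses only characters whose
    # positions are shared by all EBCDIC code pages.
    declaration = data[: end + 2].decode("cp037")

    match = ENCODING_RE.search(declaration)
    return match.group(2) if match else None
```

For a document encoded in code page 500 that declares 
encoding="IBM500", the function should return "IBM500" even though it 
read the declaration as cp037; the rest of the document can then be 
decoded with the declared page.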

-- 
François

Received on Wednesday, 20 October 2004 22:31:55 UTC