Re: [whatwg/encoding] Explain the relationship between windows-1252, Latin1, and ASCII (PR #345)

@domenic commented on this pull request.



> @@ -732,6 +747,30 @@ part of the ISO 8859 series. In particular, the necessity of the inclusion of <a
 and <a>ISO-8859-16</a> is doubtful for the purpose of supporting existing content, but there are no
 plans to remove these.</p>
 
+<div class=note id=note-latin1-ascii>
+ <p>The <a>windows-1252</a> <a for=/>encoding</a> has various <a for=encoding>labels</a> like
+ "<code>latin1</code>", "<code>iso-8859-1</code>", "<code>ascii</code>", etc. which have
+ historically been confusing for developers. On the web, and in any software that seeks to be
+ web-compatible by implementing the Encoding Standard, these are synonyms: "<code>latin1</code>" and
+ "<code>ascii</code>" are just labels for <a>windows-1252</a>, and any software following this
+ standard will, for example, decode 0x80 as U+20AC (€) when asked for the Latin1 or ASCII decoding
+ of that byte.

I tried to phrase this carefully to avoid giving the impression that latin1 or ASCII are encodings, and instead to make clear that they are inputs to the common category of algorithms that take (byte sequence, encoding label) parameters.

On the web, that category of algorithms is well formalized through the distinction between actual encodings and labels, but in software more broadly it's vaguer, with e.g. functions named `DecodeLatin1` or similar.

My attempt was "when asked for the Latin1 or ASCII decoding of that byte", but if you have a different suggestion I'd be interested. The main thing is that I don't want to constrain us to describing only the web API case, where we have a clear label/encoding divide; I'd rather cover the more general category of "please decode some bytes" algorithms across all software.
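
To make the (byte sequence, encoding label) shape concrete, here is a rough Python sketch, not anything proposed for the spec text. `WEB_LABELS` and `web_decode` are hypothetical names, the table is only the handful of labels named in the note rather than the standard's full list, and Python's `cp1252` codec is an assumed stand-in for windows-1252 (it agrees on 0x80 but rejects 0x81, 0x8D, 0x8F, 0x90, and 0x9D, which the Encoding Standard maps to C1 controls).

```python
# Hypothetical subset of the Encoding Standard's label table. The real
# table has many more labels; these are just the ones named in the note.
WEB_LABELS = {
    "latin1": "windows-1252",
    "iso-8859-1": "windows-1252",
    "ascii": "windows-1252",
    "windows-1252": "windows-1252",
}

# Assumption: map the standard's encoding name to the closest Python codec.
# cp1252 is not byte-for-byte identical to the standard's windows-1252.
PYTHON_CODECS = {"windows-1252": "cp1252"}

ASCII_WHITESPACE = "\t\n\x0c\r "


def web_decode(data: bytes, label: str) -> str:
    """Decode `data` roughly as a web-compatible decoder would for `label`."""
    # Labels are matched after trimming ASCII whitespace, case-insensitively.
    encoding = WEB_LABELS[label.strip(ASCII_WHITESPACE).lower()]
    return data.decode(PYTHON_CODECS[encoding])


# All three labels resolve to the same encoding, so 0x80 becomes U+20AC.
for label in ("latin1", "iso-8859-1", "ascii"):
    print(label, repr(web_decode(b"\x80", label)))

# Python's own "latin-1" codec is real ISO-8859-1, so 0x80 becomes U+0080.
print("python latin-1", repr(b"\x80".decode("latin-1")))
```

Python's own `ascii` codec goes a third way and raises `UnicodeDecodeError` on 0x80, which is the kind of divergence outside the web that the wording is trying to cover.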

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/pull/345#discussion_r2040542556

Message ID: <whatwg/encoding/pull/345/review/2762018229@github.com>

Received on Saturday, 12 April 2025 03:56:00 UTC