[whatwg/encoding] Why does the Big5 index contain unmappable characters? (Issue #293)

I am trying to generate every character that a particular encoding supports, in order to produce test files for [quick-xml](https://github.com/tafia/quick-xml). Using the [encoding_rs](https://github.com/hsivonen/encoding_rs) crate, I found that some code points listed for the Big5 encoding in https://github.com/whatwg/encoding/blob/main/indexes.json are actually emitted as HTML numeric character references (`&#...;`). Digging into that, I realized that such output is generated when a character is unmappable in the encoding.

So the question is: what is the rationale for including in the index characters that the encoding cannot map? I cannot find the answer at https://encoding.spec.whatwg.org/. The specification describes how to deal with this strange index, but does not explain why the index is so strange.
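
To make the observation concrete, here is a minimal sketch of the behaviour as I understand it, without using encoding_rs. It assumes my reading of the specification's "index Big5 pointer" rule is right: entries whose pointer is below `(0xA1 - 0x81) * 157` are skipped when encoding, and anything left unmapped falls back to `&#...;`. The names `SAMPLE_INDEX`, `index_pointer` and `encode` are hypothetical, the three table entries are illustrative stand-ins that should be checked against indexes.json, and the special handling of a few duplicated code points is omitted.

```rust
/// Pointers below this value are skipped by the encoder
/// (assumption: my reading of the "index Big5 pointer" rule).
const FIRST_ENCODABLE_POINTER: u32 = (0xA1 - 0x81) * 157;

/// Illustrative (pointer, code point) pairs; stand-ins for indexes.json.
const SAMPLE_INDEX: &[(u32, char)] = &[
    (942, '\u{43F0}'),  // low pointer: present in the index, skipped when encoding
    (5024, '\u{3000}'), // first pointer the encoder is allowed to use
    (5495, '\u{4E00}'),
];

/// Look up a pointer for `c`, ignoring the low-pointer part of the index.
fn index_pointer(c: char) -> Option<u32> {
    SAMPLE_INDEX
        .iter()
        .filter(|(pointer, _)| *pointer >= FIRST_ENCODABLE_POINTER)
        .find(|(_, ch)| *ch == c)
        .map(|(pointer, _)| *pointer)
}

/// Encode `input`, replacing unmappable characters with `&#N;`.
fn encode(input: &str) -> Vec<u8> {
    let mut out = Vec::new();
    for c in input.chars() {
        if c.is_ascii() {
            out.push(c as u8);
        } else if let Some(pointer) = index_pointer(c) {
            // Split the pointer into a lead byte and a trail byte.
            let lead = pointer / 157 + 0x81;
            let trail = pointer % 157;
            let offset = if trail < 0x3F { 0x40 } else { 0x62 };
            out.push(lead as u8);
            out.push((trail + offset) as u8);
        } else {
            // Unmappable: emit a decimal numeric character reference.
            out.extend_from_slice(format!("&#{};", c as u32).as_bytes());
        }
    }
    out
}

fn main() {
    println!("{:02X?}", encode("\u{4E00}")); // [A4, 40]
    println!("{}", String::from_utf8_lossy(&encode("\u{43F0}"))); // &#17392;
}
```

In this sketch the second call shows the effect I am asking about: the code point is in the table, but because its pointer is in the excluded range it comes out as a character reference.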


Message ID: <whatwg/encoding/issues/293@github.com>

Received on Sunday, 21 August 2022 09:37:02 UTC