I18N-ISSUE-495: note about windows-1252 is invalid [find-text]


Raised by: Addison Phillips
On product: find-text


In the introduction we find this note:

This specification defines the behavior for documents using a Unicode character encoding, such as UTF-8. Behavior for documents using legacy character encoding, such as windows-1252, may be anomolous. 

Since the document processing model for Web pages and other parts of the Open Web stack is based entirely on Unicode, the character encoding used to transmit or serialize a page being searched is not germane to finding text. 

How the document is converted to Unicode may matter: CharMod recommends that a "normalizing transcoder" be used. However, the specification is not about searching byte streams. It is about searching the converted Unicode character stream. There will be no anomalous search behavior unless something is very wrong with the APIs in this document. This note invites developers and implementers to question something that they really shouldn't be concerned about.
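To illustrate the point, here is a minimal Python sketch (not taken from the find-text specification). It assumes a hypothetical helper, decode_normalized, standing in for a "normalizing transcoder" by decoding the bytes and then applying NFC. The sample text and query are made up for illustration. The same text serialized as UTF-8 and as windows-1252 differs at the byte level, yet decodes to the same Unicode string, so a search gives the same answer either way.

```python
import unicodedata


def decode_normalized(data: bytes, encoding: str) -> str:
    """Hypothetical stand-in for a normalizing transcoder:
    decode the byte stream to Unicode, then normalize to NFC."""
    return unicodedata.normalize("NFC", data.decode(encoding))


# Illustrative sample text and query (assumptions, not from the spec).
text = unicodedata.normalize("NFC", "Café déjà vu")
query = unicodedata.normalize("NFC", "déjà")

# Two different serializations of the same document text.
as_utf8 = text.encode("utf-8")
as_cp1252 = text.encode("windows-1252")

# The byte streams differ ...
bytes_differ = as_utf8 != as_cp1252

# ... but the decoded Unicode character streams are identical.
doc_a = decode_normalized(as_utf8, "utf-8")
doc_b = decode_normalized(as_cp1252, "windows-1252")

# Searching happens on the decoded text, so the results match.
pos_a = doc_a.find(query)
pos_b = doc_b.find(query)
```

Only the decoding step depends on the declared encoding. Once the text is a Unicode string, the search code has no way to tell which serialization it came from.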

(editorial nit: "anomalous" is misspelled as "anomolous")

Received on Thursday, 15 October 2015 21:58:37 UTC