- From: Henri Sivonen <notifications@github.com>
- Date: Fri, 14 Oct 2016 03:18:19 -0700
- To: whatwg/encoding <encoding@noreply.github.com>
- Message-ID: <whatwg/encoding/issues/68/253762648@github.com>
@jungshik, I have some more questions:

* Did you explore guessing the encoding from the top-level domain name instead of guessing it from content?
* How many bytes do you feed to the detector?
* What happens if the network stalls before that many bytes have been received? Do you wait or do you stop detecting based on a timer? (Timers have their own problems, and Firefox got rid of the timer in Firefox 4.)
* Once the detector has seen as many bytes as you usually feed it, do you continue feeding data to the detector to allow it to revise its guess? If yes, what do you do if the detector revises its guess?
* Other than ISO-2022-JP (per your previous comment) and, I'm extrapolating, EUC-JP, can the detector (as used in Blink) guess an encoding that wasn't previously the default for any Chrome localization? That is, does the detector make pages "work" when they previously didn't for any localization defaults?
* Can the detector (as used in Blink) guess UTF-8? That is, can Blink's use of the detector inadvertently encourage the non-labeling of newly-created UTF-8 content?

(Also, "is open-sourced and available in github" doesn't really answer @annevk's question about willingness to work towards standardizing.)

--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/issues/68#issuecomment-253762648
Received on Friday, 14 October 2016 10:18:47 UTC