Important note on default charset in "text/csv" (RFC 4180)

I noticed the GitHub issue Jeni posted earlier (#44) as well as issue #8 in
the model document (
http://www.w3.org/TR/2014/WD-tabular-data-model-20140327/). The issue is
that we want the default character set to be UTF-8, while RFC 4180, as I
wrote it, defines it as plain ASCII.

While going through the documents, I found that the default character
set for "text/csv" is actually now UTF-8. This change took effect when
RFC 7111, which defines CSV fragments, was approved. The "text/csv" MIME
type is now defined by RFC 4180 and RFC 7111, with a combined registration
appearing here:

https://www.iana.org/assignments/media-types/text/csv

This means that while RFC 4180 itself does mandate ASCII, for standards
purposes on the IETF side this has been changed, and the default is now in
fact UTF-8.
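
To illustrate what this means for a consumer, here is a minimal sketch in
Python of picking the character set for a "text/csv" response: use the
"charset" parameter when the Content-Type carries one, and otherwise fall
back to UTF-8. The helper names (csv_charset, decode_csv) are my own
hypothetical ones and are not taken from any of the specs.

```python
from email.message import Message
from email.utils import collapse_rfc2231_value

# Assumption: UTF-8 is the default, per the combined IANA registration.
DEFAULT_CSV_CHARSET = "utf-8"


def csv_charset(content_type: str) -> str:
    """Return the charset named in a Content-Type value, else the default."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_param returns the fallback when no charset parameter is present.
    charset = msg.get_param("charset", DEFAULT_CSV_CHARSET)
    # Collapse an RFC 2231 encoded parameter (a tuple) into a plain string.
    return collapse_rfc2231_value(charset).lower()


def decode_csv(body: bytes, content_type: str = "text/csv") -> str:
    """Decode a CSV payload using the declared or default charset."""
    return body.decode(csv_charset(content_type))
```

With no charset parameter, csv_charset("text/csv") falls back to "utf-8";
an explicit parameter such as "text/csv; charset=ISO-8859-1" still takes
precedence over the default.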

Thanks,
Yakov

Received on Thursday, 30 October 2014 01:17:12 UTC