
Re: [www-font] WOFF metadata - should we require (rather than recommend) the use of UTF-8?

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Wed, 01 Jun 2011 14:59:16 +0900
Message-ID: <4DE5D534.3080306@it.aoyama.ac.jp>
To: robert@ocallahan.org
CC: mpsuzuki@hiroshima-u.ac.jp, jonathan@jfkew.plus.com, public-webfonts-wg@w3.org, www-font@w3.org
Just for the record, some comments below.

On 2011/06/01 13:13, Robert O'Callahan wrote:
> On Wed, Jun 1, 2011 at 4:09 PM, Robert O'Callahan <robert@ocallahan.org> wrote:
>
>> On Wed, Jun 1, 2011 at 4:03 PM, <mpsuzuki@hiroshima-u.ac.jp> wrote:
>>
>>> "Anything" is too broad to understand... Excuse me,
>>> could you give me a concrete example of the system
>>> or usecase that consumes WOFF but has some difficulty
>>> to handle an XML in UTF-16?

The requirement to support UTF-16 in XML in addition to UTF-8 was added
because there was a concern that otherwise, e.g., Japanese data might
expand considerably. That concern turned out to be mostly unjustified,
because typical XML contains a fair percentage of ASCII characters,
such as markup. Compatibility with US-ASCII has led to UTF-8 being far
more popular on the Web than UTF-16. Various XML applications as well
as non-XML formats have switched to UTF-8 only. The reduction from two
encodings to one is very significant for interoperability, to the
extent that in the networking/protocol area, there is a saying
"zero-one-many": supporting two options is already the "many" case.
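
As a rough illustration of the size argument: with A ASCII characters and J Japanese characters from the Basic Multilingual Plane, UTF-8 needs A + 3J bytes and UTF-16 needs 2A + 2J bytes, so UTF-8 is the smaller encoding whenever J < A. The sketch below checks this on a metadata-like fragment. The fragment and the helper name `encoded_sizes` are made up for this example and are not taken from the WOFF specification.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte length of `text` in UTF-8 and in UTF-16 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


# Hypothetical metadata fragment, invented for illustration only.
bare_text = "日本語のフォントの説明"
sample = '<description><text lang="ja">' + bare_text + "</text></description>"

for label, text in (("with markup", sample), ("text only", bare_text)):
    print(label, encoded_sizes(text))
```

For the bare Japanese text UTF-16 is the smaller encoding, but once the ASCII markup is counted the balance tips toward UTF-8.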

>> Jonathan already gave an example in his first message.
>
>
> Hmm, maybe that example wasn't clear enough.
>
> What Jonathan is actually doing is creating a Javascript API that returns a
> string containing the WOFF metadata. So that code isn't going to be parsing
> the XML, but it does need to know the encoding so the text can be correctly
> converted to a Javascript string.

That is a good example, with one twist: Strings in Javascript happen to 
be UTF-16. But that's all under the hood, nothing to worry about.


> Any consumer that needs to convert the WOFF metadata to some kind of string
> (and isn't immediately parsing the XML) will have the same problem.

Yes. Saying UTF-8 and only UTF-8 is a good solution.
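
To make the difference concrete, here is a minimal sketch of what a consumer has to do in each case when it only wants a string and does not parse the XML. Both function names are hypothetical, and the two-encoding version is deliberately simplified: it looks only at a byte order mark, whereas a real XML processor would also have to consider the encoding declaration.

```python
import codecs


def decode_two_encodings(data: bytes) -> str:
    """Decode metadata bytes when both UTF-8 and UTF-16 are allowed.

    Simplified assumption: the encoding is detected from a BOM only.
    """
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order.
        return data.decode("utf-16")
    # "utf-8-sig" accepts UTF-8 with or without a leading BOM.
    return data.decode("utf-8-sig")


def decode_utf8_only(data: bytes) -> str:
    """Decode metadata bytes under a UTF-8-only rule.

    Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    """
    return data.decode("utf-8")


text = '<metadata version="1.0"/>'
as_utf8 = text.encode("utf-8")
as_utf16 = text.encode("utf-16")  # Python prepends a BOM here

print(decode_two_encodings(as_utf16) == text)
print(decode_utf8_only(as_utf8) == text)
```

With a single required encoding the sniffing step disappears, and input in any other encoding is rejected outright (the UTF-16 BOM bytes are never valid UTF-8) instead of being silently misread.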


Regards,   Martin.
Received on Wednesday, 1 June 2011 06:00:04 GMT
