More WOFF issues from Richard Ishida on 2010-12-10 (public-i18n-core@w3.org from October to December 2010)

From: Richard Ishida <ishida@w3.org>
Date: Fri, 10 Dec 2010 13:24:25 -0000
To: "'Internationalization Core Working Group WG'" <public-i18n-core@w3.org>
Message-ID: <015601cb986d$91235460$b369fd20$@org>

While our issue number 2 is out there testing the system, here are some more issues I think we should raise. I'm raising them here for comments from our WG, preparatory to sending them out as formal comments. We need to send these ASAP, so please respond by email if you can, so that we can have a discussion prior to the next meeting.

All the following comments relate to 6. Extended Metadata Block http://www.w3.org/TR/WOFF/#Metadata

[a] Language tag references

"The possible values for the lang attribute can be found in the IANA Subtag Registry [Subtag]."

This implies that you can only use single subtags, since that is what the registry contains (with the exception of a few redundant and grandfathered tags).

I think this should actually say:

"The possible values for the lang attribute MUST conform to BCP 47."

And there should be an entry for BCP 47 in the References section.

Similarly, the sentence

"A user agent displaying metadata is expected to choose a preferred language/locale to display from among those available, following RFC 4647 [RFC-4647]."

would be better as

"A user agent displaying metadata is expected to choose a preferred language/locale to display from among those available, following matching algorithms in BCP 47 (currently RFC 4647)."
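To illustrate what "matching algorithms in BCP 47" could mean for a user agent, here is a minimal Python sketch of the Lookup scheme from RFC 4647 (section 3.4). The function name lookup and its parameters are illustrative only, and the sketch is simplified: it skips a bare "*" range and does not handle wildcards embedded in a range.

```python
def lookup(priority_list, available_tags, default=None):
    """Return the best available tag for a prioritized list of language ranges."""
    # Matching is case-insensitive, so index the available tags by lowercase form.
    by_lower = {tag.lower(): tag for tag in available_tags}
    for language_range in priority_list:
        if language_range == "*":
            continue  # a bare wildcard carries no preference and is skipped
        subtags = language_range.lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in by_lower:
                return by_lower[candidate]
            # Truncate the range from the end, one subtag at a time.
            subtags.pop()
            # A trailing single-character subtag (such as "x") is removed too.
            while subtags and len(subtags[-1]) == 1:
                subtags.pop()
    return default


# Example: a reader preferring Traditional Chinese as used in China.
best = lookup(["zh-Hant-CN-x-private1"], ["en", "zh-Hant", "zh"])
```

Here best is "zh-Hant": the range is shortened step by step until it equals one of the languages available in the metadata. This only works if lang values are full BCP 47 tags rather than single subtags.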

[b] Description of text elements

Until I looked at the example, it was not readily apparent to me how the text element fitted into the schema. I think you could make that clearer. In particular, I was expecting to find references to it in the list of elements in the last half of section 6.

[c] Use of attributes for human readable text

In the schema description, various items that contain human readable text are stored as attribute values. We normally recommend that you don't do this (see http://www.w3.org/TR/xml-i18n-bp/#DevAttributes) because of potential translation and annotation difficulties (e.g. markup of bidi text). In several cases these attributes are the only content on empty elements.

See also the comment about localization of other elements, such as credit. Making the name attribute of the credit element into an element would allow for localizations of the name text, which are currently not possible.

We would suggest converting the attributes to element content. In most cases, this does not seem to cause any significant increase in the size of the markup.

[d] Localization mechanism too restricted

A font vendor such as Morisawa would probably want a Japanese audience to see its name in kanji, but present "Morisawa" to non-Japanese viewers. To enable this, the localised version access mechanism (use of the text element) should also apply to the content of the vendor element.

Likewise, a Tamil font designer would probably want their name in the credit element to be available in either Tamil or Latin scripts.

I'm therefore proposing that you extend the localization selection mechanism to vendor, credit and licensee elements (which would also reinforce the comment that proposes that the content of these elements be element content rather than attribute values).
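As a rough illustration of this proposal, the following Python sketch parses a hypothetical vendor element whose name is carried in localizable text children rather than in an attribute, and picks the version to display. The markup shown is an assumption about what the extended schema might look like, not something the current draft defines; localized_text and METADATA are illustrative names, and an exact, case-insensitive comparison stands in for full RFC 4647 matching.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the draft stores the vendor name in an attribute;
# here it is moved into text children so it can be localized.
METADATA = """\
<metadata version="1.0">
  <vendor>
    <text lang="ja">モリサワ</text>
    <text lang="en">Morisawa</text>
  </vendor>
</metadata>
"""


def localized_text(element, preferred_langs):
    """Return the content of the text child matching the first preferred language."""
    texts = element.findall("text")
    for lang in preferred_langs:
        for text in texts:
            if text.get("lang", "").lower() == lang.lower():
                return text.text
    # Assumed fallback: use the first text child when nothing matches.
    return texts[0].text if texts else None


root = ET.fromstring(METADATA)
vendor = root.find("vendor")
name_for_japanese_reader = localized_text(vendor, ["ja", "en"])
name_for_french_reader = localized_text(vendor, ["fr", "en"])
```

A Japanese reader would see the kanji/kana form, while a reader preferring French and then English would see "Morisawa". The same pattern would apply to credit and licensee.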

I am assuming that this would not apply to the uniqueid element, by definition, even though markup authors may use non-ASCII text in the id itself.

[e] Paragraphs and inline content

Presumably, text in elements such as description and license can contain free-flowing text organized into paragraphs. No markup is proposed for paragraph support; nor, however, is it clear from the spec whether whitespace needs to be preserved for such content.

I would recommend that some minimal markup be provided for paragraphs and that this be supplemented with a span element. The paragraph and span markup would allow for the application of directional markup (see the comment about the dir attribute) in this content. This would be needed, for example, to achieve correct display of a bidirectional title of a work on which a font is based, or to quote a paragraph in a language with a different base direction (quite possible in a text element with lang="ar" inside a description element).

[f] Direction attributes needed

It should be possible to use markup to set the base direction of any element in order to enable correct display of bidirectional text. We suggest a dir attribute with the values rtl and ltr as a minimum. (Additional rlo and lro values may also be useful if it is felt that such things as lists of characters are likely to appear in the text and control is needed to override the Unicode Bidi Algorithm.)

The base direction should apply to text in contained elements (so you could have dir="rtl" on a text element that is inherited by paragraph elements without need for extra markup).

In longer pieces of text, such as the description element, it is usually also useful to have a span element, to which a direction attribute can be attached if the base direction needs to be different from the surrounding context.
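The inheritance behaviour described above could be sketched as follows in Python. The dir attribute and the p and span elements are the proposals made in these comments, not part of the current draft; the sample markup, the effective_dirs name and the default of "ltr" for content with no dir on any ancestor are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup using the proposed dir attribute and p/span elements.
SAMPLE = """\
<description>
  <text lang="ar" dir="rtl">
    <p>first paragraph <span dir="ltr">WOFF 1.0</span></p>
    <p>second paragraph</p>
  </text>
  <text lang="en"><p>English paragraph</p></text>
</description>
"""


def effective_dirs(root, default="ltr"):
    """Map every element to its base direction, inheriting from its ancestors."""
    resolved = {}

    def walk(element, inherited):
        # An explicit dir wins; otherwise the parent's direction applies.
        direction = element.get("dir", inherited)
        resolved[element] = direction
        for child in element:
            walk(child, direction)

    walk(root, default)
    return resolved


root = ET.fromstring(SAMPLE)
dirs = effective_dirs(root)
arabic_text = root.find("text[@lang='ar']")
second_paragraph = arabic_text.findall("p")[1]
embedded_span = arabic_text.find("p/span")
```

In this sketch second_paragraph resolves to "rtl" without carrying any markup of its own, while embedded_span resolves to "ltr" because its own dir attribute overrides the inherited value.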

[g] OpenType feature preservation

Perhaps add some text to the note at the bottom of section 5 to say something like this:

"The automatic removal of OpenType features such as GPOS and GSUB information at any stage in the process of deploying a WOFF file is strongly discouraged. Many writing systems around the world rely on these features for very basic display of text."

It is outside the scope of the WOFF spec, but I think having it mentioned here will be very useful in helping people avoid this trap.

============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/

Received on Friday, 10 December 2010 13:24:57 UTC