Re: More WOFF issues from Mark Davis ☕ on 2010-12-10 (public-i18n-core@w3.org from October to December 2010)

From: Mark Davis ☕ <mark@macchiato.com>
Date: Fri, 10 Dec 2010 09:18:37 -0800
To: Richard Ishida <ishida@w3.org>
Cc: Addison Phillips <addison@lab126.com>, Internationalization Core Working Group WG <public-i18n-core@w3.org>
Message-ID: <AANLkTi=rn-fGUoCmFAKVOE6++-Cse=dJWLqKaGE4PVqc@mail.gmail.com>
I didn't read the WOFF spec, but one item in your comments needs to be
fixed:

> "A user agent displaying metadata is expected to choose a preferred
language/locale to display from among those available, following matching
algorithms in BCP 47 (currently RFC 4647)."

The matching algorithms in BCP 47 are *only* really examples of what could
be done in the absence of better algorithms, and are not the optimal
matching algorithms to use in practice. (CLDR, for example, has better
matching algorithms.)

So the text should be:

=>

"A user agent displaying metadata is expected to choose a preferred
language/locale to display from among those available, following matching
algorithms such as those specified in BCP 47 (currently RFC 4647)."
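
For illustration only, here is a minimal sketch of the "lookup" scheme from
RFC 4647 (section 3.4), which is one of the matching algorithms the wording
above would permit. The function name, the default handling, and the sample
tags are assumptions made for this example; they do not come from the WOFF
spec, and a user agent could equally use a different matcher (such as the
CLDR one).

```python
def lookup(priority_list, available, default=None):
    """Return the best available tag for the user's language priority list.

    Sketch of RFC 4647 lookup: each language range is progressively
    truncated from the end until it equals an available tag.
    Comparison is case-insensitive.
    """
    tags = {tag.lower(): tag for tag in available}
    for language_range in priority_list:
        if language_range == "*":
            # The wildcard range is skipped in lookup.
            continue
        candidate = language_range.lower()
        while candidate:
            if candidate in tags:
                return tags[candidate]
            subtags = candidate.split("-")[:-1]
            # A single-character subtag (e.g. "x") left at the end is
            # removed together with the subtag that followed it.
            if subtags and len(subtags[-1]) == 1:
                subtags = subtags[:-1]
            candidate = "-".join(subtags)
    return default


# Hypothetical metadata offering text in Japanese and English.
print(lookup(["ja-JP", "en"], ["en", "ja"]))
print(lookup(["fr-CA"], ["en", "ja"], default="en"))
```

With the hypothetical tags above, a user preferring "ja-JP" is served the "ja"
text, and a user whose preferences match nothing gets the caller's default.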

Mark

*— Il meglio è l’inimico del bene —*


On Fri, Dec 10, 2010 at 05:40, Richard Ishida <ishida@w3.org> wrote:

> Given that we are close to the end of the last call window, should I just
> raise these comments as issues and send to the woff folks, and we can modify
> them later via email?  Or should we ask for an extension?
>
> RI
>
> ============
> Richard Ishida
> Internationalization Lead
> W3C (World Wide Web Consortium)
>
> http://www.w3.org/International/
> http://rishida.net/
>
>
>
>
> > -----Original Message-----
> > From: public-i18n-core-request@w3.org [mailto:public-i18n-core-
> > request@w3.org] On Behalf Of Richard Ishida
> > Sent: 10 December 2010 13:24
> > To: 'Internationalization Core Working Group WG'
> > Subject: More WOFF issues
> > Importance: High
> >
> > While our issue number 2 is out there testing the system, here are some
> > more issues I think we should raise. I'm raising them here for comments
> > from our WG, preparatory to sending them out as formal comments.  We need
> > to send these ASAP, so please respond by email if you can, so that we can
> > have a discussion prior to the next meeting.
> >
> >
> > All the following comments relate to 6. Extended Metadata Block
> > http://www.w3.org/TR/WOFF/#Metadata
> >
> > [a] Language tag references
> >
> > "The possible values for the lang attribute can be found in the IANA
> > Subtag Registry [Subtag]."
> >
> > This implies that you can only use single subtags, since that is what the
> > registry contains (with the exception of a few redundant and
> > grandfathered tags).
> >
> > I think this should actually say:
> >
> > "The possible values for the lang attribute MUST conform to BCP 47."
> >
> > And there should be an entry for BCP 47 in the  References section.
> >
> > Similarly, the sentence
> >
> > "A user agent displaying metadata is expected to choose a preferred
> > language/locale to display from among those available, following RFC 4647
> > [RFC-4647]."
> >
> > would be better as
> >
> > "A user agent displaying metadata is expected to choose a preferred
> > language/locale to display from among those available, following matching
> > algorithms in BCP 47 (currently RFC 4647)."
> >
> >
> >
> >
> > [b] Description of text elements
> >
> > Until I looked at the example, it was not readily apparent to me how the
> > text element fitted into the schema.  I think you could make that clearer.
> > In particular, I was expecting to find references to it in the list of
> > elements in the last half of section 6.
> >
> >
> >
> >
> > [c] Use of attributes for human readable text
> >
> > In the schema description, various items that contain human readable text
> > are stored as attribute values.  We normally recommend that you don't do
> > this (see http://www.w3.org/TR/xml-i18n-bp/#DevAttributes) because of
> > potential translation and annotation difficulties (e.g. markup of bidi
> > text).  In several cases these attributes are the only content on empty
> > elements.
> >
> > See also the comment about localization of other elements, such as
> > credit.  Making the name attribute of the credit element into an element
> > would allow for localizations of the name text, which are currently not
> > possible.
> >
> > We would suggest converting the attributes to element content. In most
> > cases, this does not seem to cause any significant increase in the size
> > of the markup.
> >
> >
> >
> >
> > [d] Localization mechanism too restricted
> >
> > A font vendor such as Morisawa would probably want a Japanese audience
> > to see its name in kanji, but present "Morisawa" to non-Japanese viewers.
> > To enable this, the localised version access mechanism (use of the text
> > element) should also apply to the content of the vendor element.
> >
> > Likewise, a Tamil font designer would probably want their name in the
> > credit element to be available in either Tamil or Latin scripts.
> >
> > I'm therefore proposing that you extend the localization selection
> > mechanism to vendor, credit and licensee elements (which would also
> > reinforce the comment that proposes that the content of these elements be
> > element content rather than attribute values).
> >
> >
> > I am assuming that this would not apply to the uniqueid element, by
> > definition, even though markup authors may use non-ASCII text in the id
> > itself.
> >
> >
> >
> > [e] Paragraphs and inline content
> >
> > Presumably, text in elements such as description and license can contain
> > free flowing text organized into paragraphs. No markup is proposed for
> > paragraph support; nor, however, is it clear from the spec that whitespace
> > needs to be preserved for such content.
> >
> > I would recommend that some minimal markup be provided for paragraphs
> > and that this be supplemented with a span element.  The paragraph and
> > span markup would allow for the application of directional markup (see
> > the comment about dir attribute) in this content.  This would be needed,
> > for example, to achieve correct display of a bidirectional title of a
> > work on which a font is based, or to quote a paragraph in a language with
> > a different base direction (quite possible in a text element with lang=ar
> > inside a description element).
> >
> >
> >
> >
> >
> > [f] Direction attributes needed
> >
> > It should be possible to use markup to set the base direction of any
> > element in order to enable correct display of bidirectional text.  We
> > suggest a dir attribute with the values rtl and ltr as a minimum.
> > (Additional rlo and lro values may also be useful if it is felt that such
> > things as lists of characters are likely to appear in the text and control
> > is needed to override the Unicode Bidi Algorithm.)
> >
> > The base direction should apply to text in contained elements (so you
> > could have dir="rtl" on a text element that is inherited by paragraph
> > elements without need for extra markup).
> >
> > In longer pieces of text, such as the description element, it is usually
> > also useful to have a span element, to which a direction attribute can be
> > attached if the base direction needs to be different from the surrounding
> > context.
> >
> >
> >
> >
> > [g] OpenType feature preservation
> >
> > Perhaps add some text to the note at the bottom of section 5 to say
> > something like this:
> >
> > "The automatic removal of OpenType features such as GPOS and GSUB
> > information at any stage in the process of deploying a WOFF file is
> > strongly discouraged. Many writing systems around the world rely on these
> > features for very basic display of text."
> >
> > It is outside the scope of the WOFF spec, but I think having it mentioned
> > here will be very useful in helping people avoid this trap.
> >
> >
> >
> > ============
> > Richard Ishida
> > Internationalization Lead
> > W3C (World Wide Web Consortium)
> >
> > http://www.w3.org/International/
> > http://rishida.net/
> >
> >
> >
> >
Received on Friday, 10 December 2010 17:19:12 UTC