RE: WOFF and extended metadata from Levantovsky, Vladimir on 2010-06-16 (www-font@w3.org from April to June 2010)

From: Levantovsky, Vladimir <Vladimir.Levantovsky@MonotypeImaging.com>
Date: Wed, 16 Jun 2010 16:18:41 -0400
To: Laurence Penney <lorp@lorp.org>
CC: Sylvain Galineau <sylvaing@microsoft.com>, Tal Leming <tal@typesupply.com>, Erik van Blokland <erik@letterror.com>, "www-font@w3.org" <www-font@w3.org>, 3668 FONT <public-webfonts-wg@w3.org>
Message-ID: <7534F85A589E654EB1E44E5CFDC19E3D03F3BF2019@wob-email-01.agfamonotype.org>

On Wednesday, June 16, 2010 1:26 PM Laurence Penney wrote:
> 
> Might I suggest that you and the other proposers point to existing
> successful metadata schemes that are similar to each of the three
> techniques you describe? My scheme, which you omit, is used (unnested)
> all over OpenStreetMap, and an apparently similar but nested scheme,
> Matroska [1], thanks to its adoption by Google in the forthcoming VP8
> codec, looks like it's on the verge of heavy usage in the field of
> video tagging, including multi-lingual subtitles and credits of the
> type Tal has already defined in WOFF.
>

Laurence,

I very much appreciate your efforts and your contributions to this group. There have been a significant number of emails exchanged on the subject of metadata, with proposals that span a wide range, from those close to 'ideal solutions' (which require additional effort to implement) to 'simple' ones that are limited in scope and capabilities but easy for implementers. We need to define a 'practical' solution for metadata extensions on which all parties can reach consensus: one that may not necessarily be ideal but could be considered 'good enough', and one that can be easily integrated with the existing metadata.

> Nesting capability seems to be lacking in the three schemes below. Do
> you seriously believe this is not a significant requirement for
> metadata?
> 

I agree that nesting capabilities are important, but the implementers have objected to arbitrary nesting. As a trade-off, it was proposed to allow a single level of nesting by collecting the extension key/value pairs into groups (or something similar); the current implementation of the <credits> element is one example of such single-level nesting.
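
To make the single-level grouping concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions, not a proposal for the actual syntax: the <credits>/<credit> elements follow the current draft, while the <extension>, <group> and <item> names and the key/value attributes are hypothetical placeholders, as is the sample data. The reader accepts one level of grouping and rejects anything nested deeper.

```python
import xml.etree.ElementTree as ET

# Sample metadata. <credits>/<credit> follow the current draft; the
# <extension>, <group> and <item> names are hypothetical placeholders.
SAMPLE = """
<metadata version="1.0">
  <credits>
    <credit name="Example Designer" role="Designer"/>
    <credit name="Example Engineer" role="Engineer"/>
  </credits>
  <extension>
    <group name="appeal">
      <item key="charity" value="Example Charity"/>
      <item key="reason" value="Example appeal"/>
    </group>
  </extension>
</metadata>
"""


def read_credits(root):
    """Return (name, role) pairs from the single-level <credits> group."""
    return [(c.get("name"), c.get("role"))
            for c in root.findall("./credits/credit")]


def read_extension_groups(root):
    """Return {group name: {key: value}}, allowing one nesting level only."""
    groups = {}
    for group in root.findall("./extension/group"):
        # A <group> anywhere inside a <group> would be arbitrary nesting.
        if group.find(".//group") is not None:
            raise ValueError("nested groups are not allowed")
        groups[group.get("name")] = {
            item.get("key"): item.get("value")
            for item in group.findall("item")
        }
    return groups


root = ET.fromstring(SAMPLE)
credits = read_credits(root)
groups = read_extension_groups(root)
```

The point of the sketch is that a consumer only ever has to walk two levels (a group and its items), which is the kind of bounded work the implementers asked for.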

> Your list also seems to be lacking a means by which any metadata
> schemes initiated by one party might be adopted as a standard. (Let's
> imagine a scenario where the current WOFF metadata had not yet been
> placed in any fonts - how would a 'credit' metadata scheme be adopted
> as normative by W3C?)
> 

I expect that when a new use case arises that presents requirements not yet addressed by the existing solution, a proposal can be made to W3C and the specification will be revised. This can happen once we publish our first public draft, which is why adding _an_ extension mechanism now is important: it will stimulate responses and comments from reviewers.

> I propose that our metadata scheme must be capable of elegantly
> representing anything representable as JSON - that is: numbers,
> strings, arrays and dictionaries (aka key-value pairs), to arbitrary
> levels of nesting. 

As I said earlier, people objected to arbitrary levels of nesting, and I believe we do not yet have a use case that would justify making this a requirement.

> In addition, because these structures will, in
> general, be sparse in terms of human language coverage, the language
> tagging should be as close to the data as possible. Whether that is
> with a lang attribute or a colon syntax is of minor concern.
> 
> Proposed use cases: tagging the Font Aid "Coming Together" font[2]
> 
> 1. Give each glyph a designer credit
> 2. Record the name of the receiving charity and the reason for the
> appeal
> 

I believe these use cases are already addressed by the existing metadata elements (credits, credit, description) included in the current draft.

Again, I suggest that our main goal should be to come up with a description of the extension mechanism that we can *all* agree on, and one that can be easily integrated into the current draft. Once we publish the draft for the general public to comment on, I do expect to see more use cases submitted in response, and we will be able to review and address them as part of future work.

Thank you,
Vlad

Received on Wednesday, 16 June 2010 20:20:11 UTC