Re: WOFF and extended metadata

From: Laurence Penney <lorp@lorp.org>
Date: Tue, 22 Jun 2010 20:41:01 +0100
Message-Id: <97EC9632-7355-4FF3-9D38-111FF66E9238@lorp.org>
To: www-font@w3.org, 3668 FONT <public-webfonts-wg@w3.org>
For the sake of completeness, I outline in the same manner as Jonathan the extension scheme I have been proposing. I have modified how language is specified.


<extension [id="foo"] name="bar">
<!-- zero or more extension elements allowed within metadata -->

   <tag k="{key}" v="{value}">
   <!-- each tag is a key=value pair -->
   <!-- zero or more metadata items (tags) within each extension element -->
   <!-- value attribute takes character data: must encode angle-bracket markup -->
   <!-- tag elements may contain further tag elements, to unlimited depth -->

	[<translation lang="{lang_code}" v="{translated_value}"/>]
	<!-- zero or more translations of the value -->

	[<tag k="{key}" v="{value}"/>]
	<!-- zero or more child tags -->

   </tag>

</extension>


Notes:

* The name attribute of <extension> is required, since it determines the structure of the metadata it contains; the W3C or a delegated authority will accept registrations for names in this space, if the applicant commits to maintaining a public version of the spec online. UAs should refer to those publications in order to present the metadata better than in key=value form.

* The top-level <tag> element takes the place of the <item> element in Jonathan's recent sketch.

* Each <tag> element may contain <translation>s of its value. The language of its default value may NOT be specified, and may be regarded as a fall-back if there are no translations. Setting the language for a key is not allowed. By making translations subordinate to their parent, this scheme ensures all semantically identical data remains together.

* Each <tag> element may optionally contain further tag elements, to an unlimited depth, thus:

<extension name="foo">
   <tag k="{key}" v="{value}">
      <tag k="{key}" v="{value}"/>
      <tag k="{key}" v="{value}"/>
      <tag k="{key}" v="{value}">
         <tag k="{key}" v="{value}"/>
         <tag k="{key}" v="{value}"/>
      </tag>
   </tag>
</extension>
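
To illustrate how a UA might walk such a hierarchy, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The sample keys and values ("family", "weight" and so on) and the function names read_tag and read_extension are invented for illustration; they are not part of the proposal.

```python
import xml.etree.ElementTree as ET

# Sample data: the keys and values are made up for this sketch.
SAMPLE = """
<extension name="foo">
   <tag k="family" v="Example Sans">
      <tag k="weight" v="400"/>
      <tag k="designer" v="A. Designer">
         <tag k="url" v="http://example.com/"/>
      </tag>
   </tag>
</extension>
"""

def read_tag(elem):
    """Turn one <tag> element into a (key, value, children) tuple, recursively."""
    children = [read_tag(child) for child in elem.findall("tag")]
    return (elem.get("k"), elem.get("v"), children)

def read_extension(xml_text):
    """Return the extension's name and a list of its top-level tags."""
    root = ET.fromstring(xml_text.strip())
    return root.get("name"), [read_tag(t) for t in root.findall("tag")]

name, tags = read_extension(SAMPLE)
```

Because every level has the same shape, a UA with no knowledge of the "foo" spec could still display the result as an indented key=value listing.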

* To encode data in existing array and dictionary structures, the following schemes are recommended:

Arrays (1):

<tag k="{array_name}" v="">
   <tag k="0" v="{value}"/>
   <tag k="1" v="{value}"/>
   <tag k="2" v="{value}"/>
   <tag k="3" v="{value}"/>
</tag>

Arrays (2):

<tag k="{array_name}" v="{value_0}"/>
<tag k="{array_name}" v="{value_1}"/>
<tag k="{array_name}" v="{value_2}"/>
<tag k="{array_name}" v="{value_3}"/>
<tag k="{array_name}" v="{value_4}"/>

Dictionaries:

<tag k="{dict_name}" v="">
   <tag k="{key_1}" v="{value_1}"/>
   <tag k="{key_2}" v="{value_2}"/>
   <tag k="{key_3}" v="{value_3}"/>
</tag>

Naturally:
* <tag>s in an array may be dictionaries, arrays and simple tags;
* <tag>s in a dictionary may be arrays and simple tags;
* simple tags may contain dictionaries and arrays.
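
As a rough sketch of the "Arrays (1)" and "Dictionaries" schemes above, the following Python helper encodes nested lists, dicts and strings as <tag> elements. The helper name to_tag and the sample data are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

def to_tag(key, value):
    """Encode a string, list or dict as a <tag> element (hypothetical helper)."""
    if isinstance(value, dict):
        # Dictionaries: an empty v, with one child tag per key.
        elem = ET.Element("tag", {"k": key, "v": ""})
        for child_key, child_value in value.items():
            elem.append(to_tag(child_key, child_value))
    elif isinstance(value, list):
        # Arrays (1): an empty v, with children keyed by their index.
        elem = ET.Element("tag", {"k": key, "v": ""})
        for index, item in enumerate(value):
            elem.append(to_tag(str(index), item))
    else:
        # Simple tag: the value is stored as a string.
        elem = ET.Element("tag", {"k": key, "v": str(value)})
    return elem

font = to_tag("font", {"name": "Example", "sizes": ["9", "12"]})
xml_text = ET.tostring(font, encoding="unicode")
```

Note that in this sketch an array and a dictionary whose keys happen to be "0", "1", ... produce identical markup, so a decoder would need the extension's published spec to tell them apart.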

* If other types of data need to be stored - for example i) XML, or ii) a structure that includes binary data not naturally representable as text - a custom representation must be devised that is representable by key-value pairs with strings for values. XML is fairly easy to turn into key-value pairs; binary data should be serialized (e.g. in PHP using serialize(), json_encode() and base64_encode()). Decoding these strings then inevitably relies on a UA knowing explicit details of the encoding mechanism. Resorting to binary and double-encoding methods is discouraged, since UAs unaware of that tag spec will only be able to present the encoded string.
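
For illustration only, here is the same idea sketched in Python rather than PHP: binary data is base64-encoded into the v attribute and decoded again by a consumer that knows the convention. The key name "thumbnail" and the choice of base64 are assumptions for this example, not something the proposal prescribes.

```python
import base64
import xml.etree.ElementTree as ET

raw = bytes([0, 1, 254, 255])  # data with no natural text form

# Producer side: serialize the bytes to an ASCII string for the v attribute.
encoded = base64.b64encode(raw).decode("ascii")
tag = ET.Element("tag", {"k": "thumbnail", "v": encoded})
xml_text = ET.tostring(tag, encoding="unicode")

# Consumer side: only a UA that knows the (assumed) convention can reverse it.
decoded = base64.b64decode(ET.fromstring(xml_text).get("v"))
```

A UA without that knowledge can do no more than show the opaque base64 string, which is the drawback described above.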

* At any given level in the tag hierarchy, any repeated tag with the same key should be treated as part of an array by the UA.
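
A UA could apply this rule along the following lines. This is a minimal Python sketch that looks at one level only and ignores child tags and translations; the function name collect and the sample keys are hypothetical.

```python
import xml.etree.ElementTree as ET

def collect(parent):
    """Map each key at one level to its value, or to a list if the key repeats."""
    grouped = {}
    for tag in parent.findall("tag"):
        grouped.setdefault(tag.get("k"), []).append(tag.get("v"))
    # A key seen once stays a plain string; a repeated key becomes an array.
    return {k: (v[0] if len(v) == 1 else v) for k, v in grouped.items()}

root = ET.fromstring(
    '<extension name="foo">'
    '<tag k="designer" v="Ann"/>'
    '<tag k="designer" v="Bob"/>'
    '<tag k="license" v="OFL"/>'
    '</extension>'
)
result = collect(root)
```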

* Having the name attribute on the <extension> avoids the need to have a <name> element in the specification. (If it is desired to allow the extension blocks to have localizable names, then <translation>s could be allowed at the top level.)
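
The translation fall-back described in the notes above could be resolved by a UA roughly as follows. This Python sketch assumes exact matching of language codes, which the proposal does not specify; the function name localized_value and the sample strings are likewise invented.

```python
import xml.etree.ElementTree as ET

def localized_value(tag, lang):
    """Return the translation for lang, falling back to the tag's default value."""
    for translation in tag.findall("translation"):
        if translation.get("lang") == lang:  # exact match: an assumption
            return translation.get("v")
    return tag.get("v")

tag = ET.fromstring(
    '<tag k="license" v="Free to use">'
    '<translation lang="fr" v="Utilisation libre"/>'
    '<translation lang="de" v="Frei verwendbar"/>'
    '</tag>'
)
french = localized_value(tag, "fr")
fallback = localized_value(tag, "ja")
```

Since the translations are children of the tag they translate, the lookup never has to search outside that element.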

In summary, the chief divergences from JK's specification are: extensible hierarchy; language subordinate to semantics; only strings allowed as values. The last of these is currently strongly opposed by Liam, Chris and others. If those objections stand, then the value may be handled as element content without affecting the rest of this proposal.

- L
Received on Tuesday, 22 June 2010 19:41:41 UTC
