Re: .webfont Proposal 2

On Wed, 2009-07-15 at 13:26 -0400, Tal Leming wrote:
> We (Erik van Blokland and myself) have listened to the various  
> comments to our initial .webfont proposal[1] and we've rethought some  
> things, made adjustments and more. In particular, we were intrigued by  
> Bert Bos' suggestion of a compressed directory rather than a single  
> XML file[2]. John Daggett's concerns about the file size were also a  
> good point[3]. The revised format is now a compressed file containing  
> two files with the following names:
> 
> 	info.xml
> 	fontdata


With all due respect to Bert Bos, I hope that 
you will instead reconsider using MIME.  For 
most practical purposes, you can regard the MIME
format as a compressible "directory archive",
so most reasons you might have for choosing a
file archive format are also reasons for choosing
MIME.

Among the drawbacks of using a file archive format
are, notably, the introduction of metadata fields
which are inapplicable in this context such as file
timestamps, permissions, ownership data and so forth.
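
As a rough illustration of that point (a sketch only, using
Python's standard zipfile module; the member names mirror the
proposal and the contents are placeholder bytes), every member
of a zip archive carries a timestamp and OS-specific attribute
bits whether or not the format built on top has any use for them:

```python
import io
import zipfile

# Build a tiny archive in memory. The member names mirror the proposal;
# the contents are placeholder bytes, not real metadata or font data.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("info.xml", b"<info/>")
    zf.writestr("fontdata", b"\x00\x01\x00\x00")

# Read back the per-member bookkeeping that the archive format records.
with zipfile.ZipFile(buf) as zf:
    members = {
        zi.filename: (zi.date_time, zi.create_system, zi.external_attr)
        for zi in zf.infolist()
    }

for name, (stamp, system, attrs) in members.items():
    # The high 16 bits of external_attr hold Unix-style permission bits.
    print(name, stamp, system, oct(attrs >> 16))
```

None of these values says anything about the font, yet each one
is a field some implementation will eventually have to decide
how to treat.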

Among the advantages of MIME are its general consonance
with the design context and its capacity for orderly
extension.  MIME is generally consonant in this context
because of the role it already plays on the web.  MIME's
extensibility comes from its capacity for strongly typed
components, encoding scheme flexibility, and an existing
registry for declaring new component types.
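
A minimal sketch of what "strongly typed components" means in
practice, using Python's standard email package.  The payload
bytes are placeholders, and "application/x-example-font" is a
hypothetical type chosen only for illustration, not a registered
one:

```python
import email
from email import policy
from email.message import EmailMessage

# Placeholder payloads; a real container would carry actual metadata
# and actual font bytes.
info_bytes = b"<info/>"
font_bytes = b"\x00\x01\x00\x00"

# Each component declares its own type instead of relying on a file name.
# "application/x-example-font" is a hypothetical, unregistered type.
msg = EmailMessage()
msg.add_attachment(info_bytes, maintype="application", subtype="xml")
msg.add_attachment(font_bytes, maintype="application", subtype="x-example-font")

# Serialize and parse again, as a receiver would.
parsed = email.message_from_bytes(msg.as_bytes(), policy=policy.default)
parts = [(p.get_content_type(), p.get_content()) for p in parsed.iter_parts()]

for content_type, payload in parts:
    print(content_type, len(payload))
```

The receiver learns what each component is from its declared
type, so adding a new kind of component means declaring a new
type rather than inventing a new reserved file name.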

The extensibility of MIME points out a weakness of
your file archive proposal:  you are using file names
"by convention" to assert the type and meaning of the
components.  The file "info.xml" is supposed to contain
XML conforming to a specific schema and "fontdata" a 
font.  If someone later wants to generalize your 
idea and extend it to, say, JPG files instead of font
files, they will have to choose some distinct name other
than "fontdata", yet there is no orderly process in place
for the allocation of such names.

Finally, if your file type is a directory archive, then
an ambiguity arises as to how a browser should treat
such a file when it is linked to directly (<a href="...">).
Ordinarily such a file would be passed to an archive
extraction program, but that would not be appropriate
for these files.



> The info.xml file contains numerous bits of data describing the font  
> data. We looked at various metadata formats and we drew a lot of  
> inspiration from the Dublin Core Metadata Initiative[4]. We also spoke  
> with some foundries to determine if they would need other fields and  
> made adjustments as needed. A quick overview of the fields:

> format - This defines the format of the data in the fontdata file.  
> Required.
> name - The name of the font. Required.
> creationdate - The date this particular font file was created. Required.
> vendorname - Name of the vendor of the font. Suggested.
> vendorurl - URL for the vendor of the font. Suggested.
> designcredits - A list of designers of the font. Optional.
> license - A license for the font. Optional.
> licenseurl - URL to licensing info for the font. Optional.
> copyright - A copyright for the font. Optional.
> trademark - A trademark for the font. Optional.
> allow - A list of URLs allowed to use the font. Optional.
> description - Text describing the font. Optional.
> privatedata - Arbitrary, private information set by the vendor of the  
> font. Font vendors could use this to include any data that they deem  
> necessary. Optional.


The work of identifying the needed relations is
valuable.  The use of XML to represent those relations
is, in this case, problematic.

I would urge you to consider using HTML + RDFa instead.
In that way, the metadata may be made "human readable"
with good accessibility while still being machine readable.
UAs are spared the obligation to code up a new representation
mechanism for a specific new XML schema.  The meta-data
format will be extensible in an orderly way and not limited
in use to the "fields" of interest to a few vendors.
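
To illustrate the "human readable and machine readable" point,
here is a rough sketch using Python's standard html.parser.  The
markup, the property names and the values are all assumptions
made up for the example, and the collector below is a toy, not
a conforming RDFa processor:

```python
from html.parser import HTMLParser

# Illustrative markup only: the prefixes, property names and values
# are assumptions for this sketch, not a proposed vocabulary.
METADATA = """
<div xmlns:dc="http://purl.org/dc/terms/"
     xmlns:cc="http://creativecommons.org/ns#">
  <p>Font: <span property="dc:title">Example Sans</span></p>
  <p>Vendor: <span property="cc:attributionName">Example Foundry</span></p>
  <p><a rel="license" href="http://example.com/license">License terms</a></p>
</div>
"""

class RelationCollector(HTMLParser):
    """Toy collector for property/rel annotations (not full RDFa)."""

    def __init__(self):
        super().__init__()
        self.relations = {}
        self._pending = None  # property name waiting for its text value

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "property" in attrs:
            self._pending = attrs["property"]
        elif "rel" in attrs and "href" in attrs:
            self.relations[attrs["rel"]] = attrs["href"]

    def handle_data(self, data):
        if self._pending is not None:
            self.relations[self._pending] = data.strip()
            self._pending = None

collector = RelationCollector()
collector.feed(METADATA)
collector.close()
relations = collector.relations
print(relations)
```

The same markup renders as ordinary paragraphs in any browser,
so nothing new has to be built merely to show it to a person.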

This is not to say that there is no value in defining
an XML schema as one possible representation of the
relations, once you have defined the ontology you want.
I'm simply suggesting that (a) the relations should be
abstract (representation independent, like ccREL) and
(b) HTML+RDFa is a parsimonious representation to use
in font files.

I would, again, urge you to use the ccREL definitions
in every case where there is overlap.  Otherwise, W3C
will find itself with two mutually redundant efforts
and, if they are alert, will urge the unification of the
two anyway.


> Many of these fields include data that may already be in the font  
> binary. We think it is advantageous to move these out of the binary  
> for several reasons: it makes it easier for users and browsers to  
> access the data, it makes it very clear what the details of  
> the .webfont are since the font file is only one part of the .webfont  
> and, since .webfont can support more than one font format, it may be  
> necessary if a future format doesn't include this data. Font vendors  
> will be able to maintain consistency between the data in the font  
> binary and the data in the XML file, so there shouldn't be concerns  
> about inconsistencies.

There are many good reasons to have this metadata
separate from the main payload, I agree.


> Further details about these elements are in the attached file.

> If it would be useful for the browsers to have information about the  
> font, we're open to adding data. For example, several font vendors  
> thought that adding a list of Unicode ranges covered by the font might  
> be useful.

A perhaps attractive attribute of the HTML+RDFa notion
is that it is conveniently extensible in backward-compatible
ways.  Even as new "fields" (I would call them "relations")
are added, human-friendly display continues to work: old
browsers can at least show the user the new fields in a
nice format.


> The fontdata file would contain the actual font file. The name does  
> not have an extension so that it can be format agnostic. The format of  
> the font file is specified by the <format> element in the info.xml  
> file. Initially the only supported format would be "opentype". (This  
> covers both OTF-CFF and OTF-TTF.)

This is quite ad hoc and uncomfortably non-extensible.

MIME will afford you the opportunity to include 
a type declaration for the font payload in a standard
way that is already widely implemented.



> As stated above, these files would be compressed into a single file.  
> We propose .zip as the compression format. If there is another  
> compression format that would be easier for the browsers to implement,  
> we are open to suggestions. An informal test showed an average 40%  
> file size savings over the raw OpenType fonts.

A MIME-format file can be compressed in any of
several ways.
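
One such way, sketched with Python's standard library, is
simply to gzip the serialized message as a whole, much as
HTTP already allows with Content-Encoding: gzip.  The payload
here is repetitive placeholder data and the font type is a
hypothetical one, so the sizes printed say nothing about real
fonts:

```python
import gzip
from email.message import EmailMessage

# Placeholder components; "application/x-example-font" is a
# hypothetical type and the payload is repetitive dummy data.
msg = EmailMessage()
msg.add_attachment(b"<info/>", maintype="application", subtype="xml")
msg.add_attachment(b"glyph" * 2000, maintype="application",
                   subtype="x-example-font")

raw = msg.as_bytes()          # the uncompressed MIME container
packed = gzip.compress(raw)   # the whole container, compressed once
restored = gzip.decompress(packed)

print(len(raw), len(packed))
```

The container format and the compression scheme stay
independent, so either can change without touching the other.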


> Handling the <allow> element:
> In our previous proposal we suggested an unobtrusive alert system that  
> would be triggered if the domain being viewed did not match a domain  
> in the <allow> element. John Daggett explained that he didn't think  
> this would work[5 6]. Instead, he suggested a "page info" window that  
> would display data about the page including information defined in the  
> font's metadata[3] plus same-site origin restrictions. We are very  
> interested in hearing from the browser developers about the  
> feasibility of these two ideas.


To the extent that you hope browsers will have mechanisms
specifically intended to convey copyright and licensing
information *as such*, I think that you *must*, to be 
realistic, be at least responsive to the ccREL effort.
Otherwise, you are starting from scratch on a redundant 
effort.


-t



> We're hopeful that this is a good format for everyone. It gives users  
> smaller file sizes. It gives the font vendors a simple format that  
> allows them to include information about the font. It doesn't require  
> entirely new technologies from the browser developers.
> 
> Once again, we'd love to know what you think.
> 
> Tal (and Erik)
> 
> 
> [1] http://lists.w3.org/Archives/Public/www-font/2009JulSep/0361.html
> [2] http://lists.w3.org/Archives/Public/www-font/2009JulSep/0386.html
> [3] http://lists.w3.org/Archives/Public/www-font/2009JulSep/0406.html
> [4] http://dublincore.org/documents/dcmi-terms/
> [5] http://lists.w3.org/Archives/Public/www-font/2009JulSep/0362.html
> [6] http://lists.w3.org/Archives/Public/www-font/2009JulSep/0365.html
> 

Received on Wednesday, 15 July 2009 20:04:15 UTC