Re: "wrapper format" (was: restarting discussion)

Dave Crossland reminded me that I had 
forgotten to reply to this critique by
Robert O'Callahan.

I apologize for the delay in responding.

On Sat, 2009-07-04 at 14:15 +1200, Robert O'Callahan wrote:
> On Sat, Jul 4, 2009 at 6:25 AM, Dave Crossland <dave@lab6.com> wrote:
>         2009/7/1 Christopher Fynn <cfynn@gmx.net>:
>         > Why do we need another "wrapper format" to contain font +
>         > meta-data??
>         
>         Because the need for better metadata is required by various
>         files found on the web, not just fonts; but fonts prompt a
>         solution since they are new to the web.
>         http://noeot.com/mame.html explains in more detail.
 
> I am deeply skeptical of this approach. The use cases are not
> compelling, the need for a new format is not justified, and specific
> technical details are broken and clearly have not been tested at all.

Let's look at how you elaborate those:


> The use cases are not compelling. Take for example this paragraph of
> http://noeot.com/notices.html:
>         For example, consider an HTML page which links to a CSS style
>         sheet. Our proposal allows the CSS style sheet author to
>         attach user-oriented meta-data which user agents (e.g.
>         browsers) should normally display. Our proposal allows the
>         origin server and any proxies providing the CSS to provide
>         additional meta-data. One use case: a CSS author could attach
>         a copyright declaration and license information as meta-data
>         to a CSS program. That meta-data would be available to users
>         from pages which link to that style sheet.

> I assure you that almost no users care about such metadata, and
> therefore browsers will not "normally display" it. At best it might be
> visible in a Page Info window or somesuch. The trend in browsers and
> user interfaces is to display only what is immediately relevant to the
> user. Metadata isn't.

The proposal is to make the meta-data available 
in the form of something like a "Resource Info"
window.   For example, if a page links to and uses
a web font, a button-like control might appear
at the bottom of the screen for "about this font".
Clicking that button would display the metadata that
came with the wrapped font.

The proposal has also, on this list, been elaborated
with regard to metadata that contains ccREL info
in RDFa format (metadata specifically about copyright
and licensing in a standard format).  Suppose that the
browser offers, in addition to a "save page as" function,
a "save this font" function (or "save this image" etc.).
If a user selects "save this font" and the font is in 
a wrapper containing ccREL info, perhaps a dialog box
should appear saying "This font is Copyright (C) .....;
Permission is not given for copying, redistributing, 
or creating derived works from the font.  Save anyway? [Y/N]"
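
To make that concrete, here is a minimal sketch of how a
browser might look for a license statement before showing
such a dialog.  It assumes the wrapped metadata is HTML that
marks its license with a ccREL-style rel="license" link.
The names LicenseFinder, find_licenses and save_prompt are
hypothetical, and the prompt wording is only illustrative.

```python
from html.parser import HTMLParser


class LicenseFinder(HTMLParser):
    """Collects the href of every element carrying rel="license"."""

    def __init__(self):
        super().__init__()
        self.licenses = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # rel may hold several space-separated tokens, or be absent
        rels = (attr.get("rel") or "").split()
        if "license" in rels and attr.get("href"):
            self.licenses.append(attr["href"])


def find_licenses(metadata_html: str) -> list:
    """Return the license URLs declared in the metadata, if any."""
    finder = LicenseFinder()
    finder.feed(metadata_html)
    finder.close()
    return finder.licenses


def save_prompt(metadata_html: str) -> str:
    """Build the text of a hypothetical "save this font" dialog."""
    licenses = find_licenses(metadata_html)
    if licenses:
        return ("This font declares the license "
                + ", ".join(licenses) + ".  Save? [Y/N]")
    return "No license information was found.  Save anyway? [Y/N]"
```

Nothing in the sketch is specific to fonts.  The same code
would serve "save this image" or any other wrapped resource.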

As to whether or not "users care" about metadata,
I can only say that we don't know and that there is
probably no simple answer.  Your (Robert's) assurances
that you know how users feel about the issue don't 
persuade me.

I do know that for some media types, such as 
music, users care a great deal about metadata.
Audio file players contain features for the display
of such things as artist and track names.  I know
that users are also seen to care about image metadata,
at least insofar as Flickr's display of ccREL information
is an indication.   It is certainly true that people 
often watch films without paying attention to the closing
credits - an example of not caring about metadata - but
it is also true that IMDb, largely a database of metadata
from film credits, is a popular tool.

None of that proves that "users care" about metadata
in a media file wrapper but those examples do show
that it is quite plausible that many users will care.



>         Simultaneously, the operator of a server offering CSS or
>         proxying could attach meta-data announcing that due to a
>         planned outage the resource will not be available next
>         Thursday between noon and 3PM, PST.

> I hope I don't have to explain how absurd this is.


Only if you care to persuade, because I don't see any
absurdity in it.



> Every file format used on the Web today is extensible enough to
> support inclusion of arbitrary metadata that will not disrupt existing
> Web browsers.


There are three problems with that statement.

First, it is not strictly true.

Second, a processor cannot both attach metadata
by inserting it into a media file and convey a
faithful copy of that file, unless every receiving
processor is required to reconstruct the original
by stripping the inserted metadata back out.
A wrapper format does not have that problem.
Consider, for example, a case in which the metadata
is supposed to contain a checksum of the media 
file and compare and contrast how the checksum 
would be verified using the two approaches (insertion
vs. wrapping).
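
The checksum comparison can be sketched in a few lines.
This is only an illustration under stated assumptions: the
"insertion" side is modeled as a CSS comment prepended to the
file (a made-up convention), and the "wrapper" side as a plain
dictionary standing in for a real wrapper format.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# The media file as its author published it.
original = b"body { font-family: serif; }\n"
digest = sha256_hex(original)

# Insertion: the checksum travels inside the file itself,
# here as a leading CSS comment (hypothetical convention).
inserted = b"/* sha256=" + digest.encode("ascii") + b" */\n" + original

# The bytes actually received no longer hash to the checksum.
insertion_matches_directly = sha256_hex(inserted) == digest

# The receiver needs CSS-specific knowledge to strip the
# metadata and regenerate the original before verifying.
stripped = inserted.split(b"\n", 1)[1]
insertion_matches_after_strip = sha256_hex(stripped) == digest

# Wrapping: metadata sits beside an untouched payload, so
# verification needs no knowledge of the payload format.
wrapper = {"metadata": {"sha256": digest}, "payload": original}
wrapping_matches = (
    sha256_hex(wrapper["payload"]) == wrapper["metadata"]["sha256"]
)
```

With insertion, every verifier must carry a stripping rule
per media format.  With wrapping, one rule covers them all.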

Third, it is expected that processors will most often
treat this metadata in a generic way, largely
independent of the media file type.  What you propose
is a system in which processors must constantly be
extended with new metadata extraction code for each
new media format handled.  The wrapper approach does
not have that problem.

>  http://noeot.com/notices.html tries to motivate the need for a
> universal wrapper format, saying
>         This is a disasterous approach for it implies that if we have
>         N file formats (or "resource representation formats" in
>         W3C-speak) that we must have N standards efforts to add
>         meta-data support.

> Libraries for reading and writing all the commonly used formats
> already exist. Gluing them together is not a lot of work compared to
> designing, evangelizing and deploying a totally new format.

Your claim appears to be that modifying many libraries
is easier than writing a single new library for a 
wrapper format.  A wrapper library would likely comprise
little more than a thin layer atop existing MIME 
libraries.   Your claim about the relative complexity
seems plainly false.
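
As a rough indication of how thin that layer could be, here
is a sketch built on Python's standard "email" package.  It
assumes a two-part MIME message with HTML metadata first and
the payload second.  That layout, and the function names wrap
and unwrap, are my assumptions for illustration and not a
specification of the proposed format.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def wrap(metadata_html: str, payload: bytes,
         maintype: str = "application",
         subtype: str = "octet-stream") -> bytes:
    """Bundle HTML metadata and an untouched payload into one MIME message."""
    msg = EmailMessage()
    msg.set_content(metadata_html, subtype="html")
    # Adding a second part turns the message into multipart/mixed.
    msg.add_attachment(payload, maintype=maintype, subtype=subtype)
    return msg.as_bytes()


def unwrap(wrapped: bytes):
    """Return (metadata_html, payload_content_type, payload_bytes)."""
    msg = message_from_bytes(wrapped, policy=policy.default)
    metadata_part, payload_part = list(msg.iter_parts())
    return (
        metadata_part.get_content(),
        payload_part.get_content_type(),
        # decode=True undoes the transfer encoding, giving raw bytes
        payload_part.get_payload(decode=True),
    )


# The same two functions serve any payload type.
font_bytes = bytes(range(256))          # stand-in for a font file
css_bytes = b"body { color: black; }\n"

wrapped_font = wrap("<p>About this font</p>\n", font_bytes)
wrapped_css = wrap("<p>About this style sheet</p>\n", css_bytes,
                   maintype="text", subtype="css")
```

Neither function inspects the payload, which is the point
made above about generic processing and generic authoring
tools: one code path handles fonts, style sheets, or anything
else.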


Additionally, you overlook the problem of authoring
this new metadata.  Using a wrapper, generic authoring
tools can be developed that work for all types of file.
Using your scheme, authoring tools are limited to only
those file types for which they have been explicitly
programmed.


>         It implies that a conforming web tool must have N separate
>         libraries, one each to understand each format.

> A tool that processes a format already has libraries to understand
> that format. 

See above, however.  



>         It implies that consensus about the form meta-data should take
>         is difficult to arrive at because any concessions by
>         stake-holders to adjust their own formats can come only at the
>         expense of revising their existing standards.

> Untrue. Registering new chunk types is generally not strenuous.

You are proceeding on the false premise that
all multimedia file types have "chunk types".
For example, a PostScript file does not contain
"chunks".



> There aren't many technical details here, but proposal for extending
> HTML is deeply flawed. In particular there's a proposal for including
> notices in the HTML <head> element like so:
> <head><notice>Notice</notice></head>

As the discussion evolved on this list, that
provision was dropped.   I apologize for the 
confusion.  We are talking only about a wrapper
format, and the "<notice>" and "<acknowledge>" 
extensions to HTML are essentially off the table.


> In Firefox, Safari and HTML5 the parser moves the <notice> into the
> <body> so it's rendered at the top of the page. Even worse, adding
> metadata to links is supposed to work like this:
> <head><link><acknowledge>Acknowledgement</acknowledge></link></head>
> But <link> is a leaf element, so in the DOM the <acknowledge> element
> is not a child of the <link>. And HTML parsers move it into the <body>
> too. Minimal testing would have uncovered these problems, which
> undermines my confidence in the rest of the proposal.

I'm sorry you were misled into thinking that
the "<notice>" and "<acknowledge>" parts were still in play.

I'm sure that your report of the non-viability
of the particular mechanism described there is true,
although it is no longer relevant to the "wrapper
proposal".

That said, the arguments on "noeot.com" are correct
in the abstract.  It is desirable to have mechanisms
by which:

1) wrapped metadata can declare portions of itself
to be, in some way, "urgent".

2) linking documents can contain declarations which
modulate the presentation of "urgent" metadata,
in particular suppressing the urgency in some cases.

A better mechanism than the "<notice>" and "<acknowledge>"
elements is no doubt possible.  Indeed, were I
writing that proposal today I would likely have used
RDFa rather than introducing new elements.  That
would be upward compatible and architecturally 
cleaner.

Nevertheless, there is no need to initially worry
about notifications and acks in linking documents.
The proposal as it stands is streamlined to a simple
MIME-based wrapper with HTML metadata and a binary
payload of arbitrary type.


-t





> Rob
> 
> -- 
> "He was pierced for our transgressions, he was crushed for our
> iniquities; the punishment that brought us peace was upon him, and by
> his wounds we are healed. We all, like sheep, have gone astray, each
> of us has turned to his own way; and the LORD has laid on him the
> iniquity of us all." [Isaiah 53:5-6]

Received on Wednesday, 15 July 2009 17:32:48 UTC