Re: Request for Comments: Last Call WD of Widgets 1.0: Packaging & Configuration spec; deadline 31 Jan 2009 from Boris Zbarsky on 2009-01-22 (public-webapps@w3.org from January to March 2009)

From: Boris Zbarsky <bzbarsky@MIT.EDU>
Date: Thu, 22 Jan 2009 10:14:23 -0500
To: Marcos Caceres <marcosscaceres@gmail.com>
CC: public-webapps <public-webapps@w3.org>
Message-ID: <49788D4F.5040508@mit.edu>
Marcos Caceres wrote:
> Ok, I've removed it. This may cause implementations to overwrite files
> on systems that don't support case-sensitive file names. This should
> not be a real problem, as most file systems won't let you create files
> with the same name but different cases. And, on Windows at least, if
> you try to add a file to a zip archive that already contains a file
> with the same name (regardless of case), it will ask you to overwrite
> the file.

That sounds fine to me.  An informative note may be merited.  On the 
other hand, you could require widget UAs to not do such overwriting 
(e.g. use memory if the filesystem can't deal).  Might be worth it.
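
A minimal sketch of the "use memory" option, assuming a UA built on Python's
zipfile module. The function names are hypothetical and not taken from the
specification; the point is only that entries whose names differ by case can
be detected and kept apart without ever touching the file system.

```python
import io
import zipfile
from collections import defaultdict


def find_case_collisions(archive):
    """Group entry names that differ only by letter case."""
    groups = defaultdict(list)
    for name in archive.namelist():
        groups[name.casefold()].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}


def read_all_into_memory(archive):
    """Keep every entry in memory so nothing is overwritten on disk."""
    return {info.filename: archive.read(info) for info in archive.infolist()}


# Build a small demonstration archive in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("Icon.png", b"first")
    zf.writestr("icon.png", b"second")
    zf.writestr("config.xml", b"<widget/>")

with zipfile.ZipFile(buffer) as zf:
    collisions = find_case_collisions(zf)
    contents = read_all_into_memory(zf)
```

Here collisions maps "icon.png" to both original names, and contents still
holds both payloads under their distinct keys.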

>> 3)  When parsing a non-negative integer (Section 8.2, step 8), what's the
>> expected behavior for integers larger than 2^32?  2^64?  Are implementations
>> of this specification required to do integer arithmetic on arbitrarily large
>> integers?  If not, is the behavior just implementation-dependent?
> 
> I think that is an implementation detail.

That should be mentioned explicitly, in my opinion.
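
To make the concern concrete, here is a sketch of a parser with a fixed upper
bound. MAX_VALUE and the function name are assumptions for illustration only;
the specification under discussion does not define a limit, which is exactly
the gap being pointed out.

```python
MAX_VALUE = 2**32 - 1  # hypothetical implementation limit, not from the spec


def parse_digits_with_limit(digits, limit=MAX_VALUE):
    """Return the value of an ASCII digit string, or None if it is
    empty, contains a non-digit, or exceeds the limit."""
    if not digits or any(ch not in "0123456789" for ch in digits):
        return None
    value = 0
    for ch in digits:
        value = value * 10 + (ord(ch) - ord("0"))
        if value > limit:
            return None  # the implementation-dependent case
    return value
```

A UA with a 32-bit limit would accept "4294967295" and reject "4294967296",
while one with arbitrary-precision integers would accept both.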

>> 4)  Section 8.2, step 8, it would be good to make sure that the image
>> identification table matches the one in HTML5 (possibly by having both
>> specifications refer to a single table, if that's workable).
> 
> The tables match (because I ripped the values straight from HTML5).
> Hixie and A. Barth are working on a separate internet draft for
> sniffing [1]. We will probably end up referencing that.

Sounds good.

>> 5)  Section 8.2, step 8, I'm not sure why image/svg+xml is required to be
>> processed according to SVGTiny.  This means that an SVG 1.1 or SVG 1.2 Full
>> (whenever that happens) user-agent cannot implement this specification, as
>> far as I can see.
> 
> Hmmm... that's not what I meant. Is SVGTiny a subset of 1.1 or 1.2?

The SVGTiny _language_ you cite is a subset of SVG 1.1, unless someone 
screwed up.  Content authored within the constraints of SVG Tiny 1.1 
should render identically in SVG Tiny 1.1 and SVG Full 1.1 UAs.

However, since SVG requires that "unknown" attributes and tags cause 
things not to render, a UA that processes according to SVG Tiny 1.1 will 
in fact render various SVG Full 1.1 markup differently than a UA that 
processes according to SVG Full 1.1, if I understand the setup correctly.

> How do you recommend we proceed here?

That really depends on what the goal is.  What _is_ the goal?

>> 6)  Section 6.2 talks about using file extensions followed by content-type
>> sniffing to determine MIME types.  This sounds to me like the exact process
>> is up to the UA.  Then Section 8.2, step 8, has specific lists of extensions
>> and magic numbers that UAs need to recognize.  Is the sniffing allowed in
>> Section 6.2 required to be a superset of what Section 8.2 allows?  If so,
>> this should be made clearer.
> 
> Understood. I added the following text:
> "For sniffing the content type of image formats supported by this
> specification, a widget user agent must use the Rules for Identifying
> the MIME type of an Image. For other file formats supported by the
> specification, a widget user agent must use the Rules for Identifying
> the MIME Type of a file."

Sounds good.
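
For illustration, a sketch of what magic-number identification of an image
could look like. The table below lists only a few widely known signatures and
is an assumption for the example; it is not a copy of the table in the
specification or in HTML5.

```python
# (signature bytes, MIME type); illustrative subset only.
IMAGE_SIGNATURES = [
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def sniff_image_type(data):
    """Return the MIME type whose signature prefixes data, or None."""
    for signature, mime_type in IMAGE_SIGNATURES:
        if data.startswith(signature):
            return mime_type
    return None
```

Whatever sniffing Section 6.2 permits would need to agree with a table like
this for the formats that Section 8.2 names.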

> We might need a manifest format... something like:
>    <manifest>
>       <resource type="some/type" src="/path/to/file" />
>    </manifest>
> 
>  Or, better still...
> 
> <mediatypes>
>    <type name="some/type" extension="gif"/>
> </mediatypes>
> 
> Or a mix of both solutions.

Sure.  I have no real opinions on the form this would take, to be honest.

> We had thought about deferring that feature to version 2.0 (no widget
> engine on the market has required such a manifest thus far because
> they all seem to just rely on sniffing).

A number of them presumably do sniffing by extension.  Gecko certainly 
does for its jar: handling.  This specification explicitly prohibits 
that, though.

> Because the Zip spec mandates CP437 unless the implementation supports
> version 6.3 or above of the Zip spec. Sadly, most Zip implementations
> do whatever they want when it comes to character encoding. This is
> probably the biggest barrier to interoperability of packaging.

That seems truly unfortunate, especially since from what I can tell ZIP 
libraries _also_ do whatever they want with character encodings.  If 
people are going to be forced to write their ZIP decompressors from 
scratch to implement this specification, what exactly are the benefits 
of using ZIP at all?
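
As one data point on library behavior, a sketch using Python's zipfile module.
Each entry carries general purpose bit 11 (0x800), which marks the file name
as UTF-8; when the bit is clear, Python decodes the name as CP437. The helper
name is hypothetical, and other libraries may behave differently.

```python
import io
import zipfile

UTF8_NAME_FLAG = 0x800  # general purpose bit 11 of the entry flags


def name_encoding(info):
    """Report which encoding the entry claims for its file name."""
    return "utf-8" if info.flag_bits & UTF8_NAME_FLAG else "cp437"


# Build a demonstration archive with one ASCII and one non-ASCII name.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("index.html", b"")
    zf.writestr("ic\u00f4ne.png", b"")

with zipfile.ZipFile(buffer) as zf:
    encodings = {info.filename: name_encoding(info) for info in zf.infolist()}
```

A reader that ignores the flag and always assumes CP437 would turn the UTF-8
bytes of the second name into mojibake, which is the interoperability hazard
under discussion.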

>> In the same algorithm, there's mention of "the input's text nodes". This
>> relationship is not defined in this specification or elsewhere.  I assume
>> you mean the text nodes which have input as their ancestor, right?
> 
> The "input" is the element being processed.

That doesn't answer my question.  There is no concept of "this element's 
text nodes" in the DOM that I know of.  There are concepts like "parent 
node", "child nodes", "next sibling", "previous sibling", etc.  You 
presumably want to express whatever you're trying to say in terms of those.

> Agreed. Ok, that section was totally screwed:) I've rewritten the
> algorithm and added a new algorithm that normalizes the white space:

> 2. If the widget user agent supports [ITS]: If the element has the dir
> attribute from the [ITS] namespace with a valid its:dir value, then
> process its text nodes in accordance with the [ITS] specification.

This still has this "its text nodes" thing.  Presumably you mean "its 
descendant text nodes" or something?
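
The difference matters as soon as the element has element children. A sketch
with Python's xml.dom.minidom, where both function names and the sample markup
are hypothetical, shows the two readings side by side:

```python
from xml.dom import Node, minidom


def child_text(element):
    """Concatenate only the text nodes that are direct children."""
    return "".join(
        node.data
        for node in element.childNodes
        if node.nodeType == Node.TEXT_NODE
    )


def descendant_text(element):
    """Concatenate all descendant text nodes in document order."""
    parts = []
    for node in element.childNodes:
        if node.nodeType == Node.TEXT_NODE:
            parts.append(node.data)
        elif node.nodeType == Node.ELEMENT_NODE:
            parts.append(descendant_text(node))
    return "".join(parts)


doc = minidom.parseString("<name>My <b>cool</b> widget</name>")
root = doc.documentElement
# child_text(root)      -> "My  widget"
# descendant_text(root) -> "My cool widget"
```

Stating the algorithm in terms of child nodes or descendants removes the
ambiguity about which of these results is intended.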

> 3. In result, convert any sequence of one or more U+000A LINE FEED
> (LF) or U+000D CARRIAGE RETURN (CR) or U+0009 CHARACTER TABULATION
> (tab) character into a single U+0020 SPACE.

You probably want to include U+0020 SPACE in your list of things which 
are to be collapsed.  That said, why not just use the existing "space 
characters" that's already defined in this spec?
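
A sketch of the difference, assuming "space characters" means U+0020, U+0009,
U+000A, U+000C and U+000D as in HTML5 (the set used by this spec may differ).
Both function names are hypothetical.

```python
import re


def collapse_without_space(text):
    """Step 3 as quoted: collapse runs of LF, CR and tab only."""
    return re.sub(r"[\n\r\t]+", " ", text)


def collapse_space_characters(text):
    """Collapse runs of any assumed space character, U+0020 included."""
    return re.sub(r"[ \t\n\f\r]+", " ", text)
```

With the input "a \n b", the first function leaves three consecutive spaces
between the letters, while the second yields "a b".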

> Ok, turns out that the Rules for Removing White Space are not actually
> needed anywhere (and would have caused problems because "10 00 11"
> would have been interpreted as "100011" instead of an error). I
> rewrote the Rules for Parsing Non-Negative Integer to skip space
> characters instead (as it should have been in the first place, and as
> is defined in HTML5).

Sounds good (and algorithm looks good).
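
For the record, a sketch contrasting the two behaviors. It assumes that space
characters are skipped only around the digits and that any other non-digit
makes the value invalid; the exact handling of trailing characters in the
rewritten algorithm is not quoted here, so treat that part as an assumption.
The function names and the space character set are illustrative.

```python
SPACE_CHARACTERS = " \t\n\x0c\r"  # assumed set, mirroring HTML5
ASCII_DIGITS = "0123456789"


def _all_ascii_digits(text):
    return bool(text) and all(ch in ASCII_DIGITS for ch in text)


def parse_non_negative_integer(text):
    """Skip surrounding space characters, then require digits only."""
    trimmed = text.strip(SPACE_CHARACTERS)
    return int(trimmed) if _all_ascii_digits(trimmed) else None


def parse_after_removing_spaces(text):
    """The old behavior: delete every space character before parsing."""
    squeezed = "".join(ch for ch in text if ch not in SPACE_CHARACTERS)
    return int(squeezed) if _all_ascii_digits(squeezed) else None
```

With "10 00 11", the first function reports an error (None) and the second
returns 100011, which is the problem the rewrite avoids.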

Thanks for the quick response,
Boris
Received on Thursday, 22 January 2009 15:15:07 UTC