- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Fri, 23 Jan 2009 09:25:17 -0500
- To: Marcos Caceres <marcosscaceres@gmail.com>
- CC: public-webapps <public-webapps@w3.org>
Marcos Caceres wrote:
> Ok. I'll need to run this by the working group as I had something like
> this in very early drafts of the spec and received criticism for being
> overly prescriptive (It could have been that I wrote the text
> incorrectly). Can you please suggest some text that we could use?

"There may be implementation-specific limits on the range of integers
allowed, and behavior outside such limits is undefined." is one option.

>> That really depends on what the goal is. What _is_ the goal?
>
> The goals are as follows:
> 1. Widget engines optionally support SVG Tiny for the icon format
> (though they can have the capability to render full SVG).
> 2. For the purpose of widgets, icons are written by authors to
> conform to SVG Tiny (not full)
> 3. Widget engines that support full, can render icons in SVG Tiny...
> but, for interop, widget engines should not render icons written in
> SVG Full 1.1 (unless the icon also conforms to SVG Tiny).

Goal 3 is what my original comment was about, basically. It means that a
widget engine cannot make use of various existing SVG implementations that
just happen to support more than just SVG Tiny. In particular, it means
that Gecko, say, would not be able to implement this specification without
sprinkling code all over to validate SVG files against SVG Tiny (something
we don't plan to do, since that's not the profile we're implementing).

I understand where you're coming from with this goal, but I'm not sure it's
worth the restriction it imposes. Things will get even worse once SVG Tiny
1.2 is a REC, since at that point I fully expect pretty much all SVG engines
supporting SVG Tiny to implement that specification, and at that point there
will be no SVG engines that can be used for Widgets at all (since all of
them will render things that are not valid SVG Tiny 1.1).

So unless you really mean to exclude SVG engines that happen to implement
SVG Full 1.1, SVG Tiny 1.2, SVG Basic 1.1 from being used in widget
implementations (possibly forcing the widget UA to ship two separate SVG
engines, one for widgets and one for everything else it's doing), I think
you should drop goal 3 and leave the authoring requirement. That is, just
have image/svg+xml work the same way in Widgets as it does over HTTP, with
the authoring requirement, presumably enforced by validators of widgets but
not widget UAs, that the images conform to SVG Tiny (1.1 or any version; up
to you).

>> Sure. I have no real opinions on the form this would take, to be honest.
>
> Just to be clear, do you feel strongly that this should be a feature
> in Widgets 1.0?

I'd think so, yes. That would make it much easier to migrate existing web
content into widgets as needed...

>> A number of them presumably do sniffing by extension. Gecko certainly does
>> for its jar: handling. This specification explicitly prohibits that,
>> though.
>
> Sorry, I don't understand - we make file extension to MIME mapping a
> priority over sniffing: Step 1 of section "Rules for Identifying the
> MIME Type of a file" reads as follows:
>
> "1. If the file entry has a file extension, attempt to match the file
> extension to one in the first column in the file identification table.
> If there is a match, then return the MIME Type value. "

When I say "sniffing by extension" I don't mean that table. I mean looking
for an extension-to-type mapping anywhere it can be found. For example,
Gecko will look in some built-in tables it has, in user preferences, in the
list of past files the user has opened in helper applications, in the
extension lists that NPAPI plug-ins install, and in the OS-wide extension
registry. This is the sort of sniffing that you presumably do not want,
since it leads to poor interoperability (e.g.
the results depend on the user's OS configuration and the filenames of past
files the user has opened in helper applications).

>> That seems truly unfortunate, especially since from what I can tell ZIP
>> libraries _also_ do whatever they want with character encodings. If people
>> are going to be forced to write their ZIP decompressors from scratch to
>> implement this specification, what exactly are the benefits of using ZIP at
>> all?
>
> I guess the thing would be to lobby Microsoft, Apple, and others to
> change/update their Zip implementations.

I'm not sure how that would help, since presumably widget UAs want to link
to their own ZIP libraries to perform the various validation that the spec
requires, as well as to allow in-memory operation as needed.... Certainly
if Gecko were implementing this specification that's what we would do. We
wouldn't want to depend on whatever happens (or not) to be installed on the
operating system.

> The other thing is that widgets this will only be a
> problem in some small segments of the market. Most people will only
> write widgets in one language and distribute it amongst people who use
> the same character encoding on their systems.

Do we have any data to support this supposition? That's certainly how
things work with web pages, and in small market segments like Western Europe
there are multiple encodings in common use (ISO-8859-1 and UTF-8). Not only
that, but on Mac the default filesystem encoding is UTF-8, while on Windows
that's not the case last I checked (and the situation is actually rather
complicated in terms of what the default is, as I recall).

> This would mirror today's reality I guess.

I'm not sure what this is referring to. Are there particular widget UAs out
there now that behave in this way (basically just copying bytes and then
treating them in some "native" charset)?

> And as you said, it does open an opportunity
> for a vendor to create conforming packaging tools.

I guess it's not clear to me why we think that adding work for everyone in
this regard is worth it. What benefits do we gain, precisely, from using
ZIP instead of, say, the MHTML format that was recently suggested?

>>> 3. In result, convert any sequence of one or more U+000A LINE FEED
>>> (LF) or U+000D CARRIAGE RETURN (CR) or U+0009 CHARACTER TABULATION
>>> (tab) character into a single U+0020 SPACE.
>> You probably want to include U+0020 SPACE in your list of things which are
>> to be collapsed. That said, why not just use the existing "space
>> characters" that's already defined in this spec?
>
> I guess I wrote it that way so single spaces don't get replaced with
> single spaces. However, you raise a good point (that there will be
> sequences of two or more space characters after the substitutions in
> step 3 above has taken place). I added the following as step 4 "In
> result, convert any sequence of two or more U+0020 SPACE characters
> into a single U+0020 SPACE."

Sure, but that still leaves the other whitespace characters (vertical tab,
form feed, etc, etc) not being collapsed. Is that really desired (and if
so, why?), or is it just an oversight?

-Boris
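
To make the whitespace point above concrete, here is a minimal Python sketch
comparing the two-step rule as quoted (steps 3 and 4) with a single pass over
a broader whitespace set. The function names are hypothetical, and the exact
membership of the spec's "space characters" is an assumption here (space,
tab, LF, vertical tab, form feed, CR); the spec's own definition should be
checked before relying on it.

```python
import re

# Assumption: this set stands in for the spec's "space characters".
# It is illustrative only; consult the spec for the real definition.
ASSUMED_SPACE_CHARACTERS = r"[ \t\n\x0b\x0c\r]+"


def collapse_as_drafted(text: str) -> str:
    """Apply steps 3 and 4 as quoted in the thread.

    Step 3: any run of LF, CR or tab becomes a single space.
    Step 4: any run of two or more spaces becomes a single space.
    Vertical tab and form feed are left untouched by both steps.
    """
    text = re.sub(r"[\n\r\t]+", " ", text)
    return re.sub(r" {2,}", " ", text)


def collapse_all_space_characters(text: str) -> str:
    """Alternative: collapse any run of the assumed set in one pass."""
    return re.sub(ASSUMED_SPACE_CHARACTERS, " ", text)
```

With this sketch, a string containing a form feed or vertical tab passes
through collapse_as_drafted unchanged in that position, while
collapse_all_space_characters replaces the run with one space; both give the
same result for input that only mixes spaces, tabs, LF and CR.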
Received on Friday, 23 January 2009 14:26:10 UTC