- From: Eric J. Bowman <eric@bisonsystems.net>
- Date: Tue, 9 Nov 2010 13:39:55 -0700
- To: Henri Sivonen <hsivonen@iki.fi>
- Cc: Larry Masinter <masinter@adobe.com>, "julian.reschke@gmx.de" <julian.reschke@gmx.de>, "www-tag@w3.org WG" <www-tag@w3.org>, Alexey Melnikov <alexey.melnikov@isode.com>
Henri Sivonen wrote:
>
> The problem with image/svg+xml is that after a decade of deployment
> and W3C REC status, the type still isn't in the registry. Even if the
> IETF experts found something wrong with the type, it would be way too
> late to stop its deployment, so there's really no point in subjecting
> it to expert review at this point.
>

The same situation exists with application/rss+xml, which also defines multiple, incompatible processing models. But that's exactly the sort of situation the IANA registry is meant to avoid, in the standards tree anyway. I believe the standards tree serves a valuable purpose, and that it would be a bad thing to let anarchy reign by declaring that it doesn't matter whether the rules for media types are followed -- g'head and deploy whatever, and let popular consent override technical concerns.

>
> Yet another failure of the registry is that text/xsl isn't registered
> for XSLT.
>

In none of these cases do I blame the registry for such failures -- the rules for registration, and the definition of media types, had been around long enough that the failure lies with the WGs who ignored them.

> > Should these be registered even if the requirements for MIME type
> > registration weren't met? Or did they meet the requirements but the
> > process dropped the ball?
>
> I don't know what exactly has happened with the registration for each
> of these types. I'm just observing that the outcome was that the
> system didn't work in the sense that the registry wasn't the place
> where a Web author, a Web server administrator or a Web client
> software developer could go and find what the right MIME type for a
> given format is.
>

I go to the registry to find media types that have been vetted by experts and are known to meet some basic requirements. If the registry becomes a list of everything anyone wants to do, whether it meets those requirements or not, well, I'd consider that a failure of the registry.

The rules are quite clear -- pending approval, prefix with x., i.e. image/x.svg+xml or application/x.rss+xml. Refusing to follow the appropriate syntax *and* ignoring what media types are supposed to do shouldn't be rewarded with registration. The IANA registry isn't perfect, but destroying its credibility such that nobody has any faith in any media type sounds counterproductive to me, and would be a much larger problem than the handful of strings out there which *look* like standards-tree media types, but aren't.

> > As for image/svg+xml not being used for 'XML' format. I think this
> > is a 3023bis issue?
>
> Do you mean sending gzipped data as image/svg+xml without
> Content-Encoding: gzip?
>

RFC 3023(bis) says nothing about gzip files. The media type is supposed to tell me the sender's intent, so I know how to process the payload. I don't know how anyone expects feeding a gzip file (because this is an issue of pre-compressing the file, not Content-Encoding compressing it on-the-fly) into an XML parser to work, but that's exactly the intent being conveyed -- unless the intent being conveyed is SVG and the file isn't compressed (or is compressed on-the-fly).

The registry is correct in insisting that a media type identify one, and only one, processing model. Otherwise, intermediaries have to introspect the payload to determine whether it's gzip or XML -- defeating the entire point of exposing this *outside* the payload, in a header. Has this blatantly obvious mistake really gone uncorrected for a decade?
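To make that concrete, here's a minimal sketch (mine, purely illustrative -- the helper name is made up, and nothing below comes from 3023bis or the SVG spec) of a server picking headers for a stored SVG that may be pre-compressed:

    GZIP_MAGIC = b"\x1f\x8b"

    def svgz_headers(payload: bytes) -> dict:
        """Pick headers that convey exactly one processing model."""
        # The media type states the intent: SVG, i.e. XML.
        headers = {"Content-Type": "image/svg+xml"}
        # If the stored file is pre-compressed (.svgz), the compression
        # belongs in Content-Encoding, outside the payload's type, so
        # the recipient decompresses first and then parses XML.
        if payload[:2] == GZIP_MAGIC:
            headers["Content-Encoding"] = "gzip"
        return headers

Either combination conveys a single processing model. Sending the gzip bytes as bare image/svg+xml, with no Content-Encoding, conveys the nonsensical intent "feed this gzip stream directly into an XML parser" -- which is exactly the case the registry rules exist to prevent.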
Is the remedy proposed by the experts (registering two media types) outrageous and non-implementable? The failure here is *not* the registry. Having recently concluded a year-long crusade on rest-discuss advocating proper use of media types, I'm aware of the problems with the registry and the registration process, but this is not one of them.

There's a Simpsons analogy here -- remember Homer teaching Bart how to putt? "Keep your head down... follow through..." Bart misses the putt, so Homer sets him up again: "OK, that didn't work, so this time, lift your head and don't follow through!" The obstinate refusal by some to adhere to the fundamentals of the architecture is not a valid reason to abandon those fundamentals and start registering anything seen in the wild in a fashion syntactically identical to those types which did follow the rules and have been vetted. I *like* being able to tell the difference between the two. That some folks yip their putts is not a reason to discard the fundamentals of putting for everyone.

> It seems rather implausible that there'd be more files that
> accidentally have the magic number for an image file format, a video
> file format, zip, gzip or PDF than there are mislabeled files in
> these formats, but I don't have data based on Web crawls followed by
> manual inspection. It's well known, though, that browsers, in order
> to be Web-compatible, ignore the image subtype for binary formats and
> sniff the magic number instead.
>

As I understand it, the problem with requiring magic-number sniffing to identify the sender's intent is that it doesn't work at wire-speed for intermediaries (a rough sketch of what such sniffing involves appears after this exchange).

> > Secondly, I'm not convinced that even if it is true now that the
> > right thing to do is to give up on trying to get explicit MIME type
> > indicators to work.
>
> I agree that it's now too late to give up on MIME entirely, since we
> now have types that don't have reliable magic numbers (in particular
> HTML, XML, CSS and JavaScript). However, if the purpose of the
> document is to document what went wrong or what could have gone
> better, I think specifying magic numbers as the step forward from
> HTTP 0.9 so that textual types would have been forced to have
> reliable magic numbers could have led to a more robust outcome than
> the one we got.
>

More robust, perhaps, but less scalable. I don't think the document should speculate that another solution would have been better, because we simply can't know that's the case.

> >> "an architecture that insists on using out-of-band type data and
> >> on the out-of-band type data being authoritative has largely been
> >> unproductive"
> >
> > in what way has it been "unproductive"?
>
> All the time wasted due to MIME labeling failures could have been
> avoided when formats have reliable magic numbers.
>

Resulting in a different architecture, with unknown problems (we can't know, since that wasn't what was deployed), the solutions to which may or may not have wasted even more time -- and perhaps led us to adopt media types as the way forward. There's no way to know.
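For reference, the promised sketch of what sniffing involves. The magic numbers are the well-known ones; the function itself is illustrative, not any browser's actual algorithm:

    # Well-known magic numbers for a few binary formats.
    MAGIC_NUMBERS = [
        (b"\x89PNG\r\n\x1a\n", "image/png"),
        (b"\xff\xd8\xff", "image/jpeg"),
        (b"GIF87a", "image/gif"),
        (b"GIF89a", "image/gif"),
        (b"\x1f\x8b", "application/gzip"),
        (b"PK\x03\x04", "application/zip"),
        (b"%PDF-", "application/pdf"),
    ]

    def sniff(payload: bytes):
        """Guess a type from the opening bytes of the entity body."""
        # Unlike reading a Content-Type header, this has to buffer and
        # touch the payload itself -- hence the wire-speed objection
        # for intermediaries.
        for magic, mime in MAGIC_NUMBERS:
            if payload.startswith(magic):
                return mime
        return None  # no reliable magic number, e.g. HTML, CSS, JS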
> >> Section "4.5. Content Negotiation" doesn't properly acknowledge
> >> that content negotiation on axes other than lossless compression
> >> (gzip) is mostly a failure on the Web.
> >
> > But "user-agent" content negotiation is widespread, common,
> > and quite functional.
>
> "Negotiating" based on the User-Agent header isn't part of the
> Accept* content negotiation design. As for it being functional, I
> think it's dangerous for the adoption of standards. To give an
> example that touches on what I've been working on lately, right now a
> practice of sites sniffing Firefox and Opera and assuming certain
> script execution behavior is threatening the convergence of all
> implementations on one standardized behavior.
>

My experience over twelve years of implementing conneg is that there will never be convergence on one standardized behavior. Eliminating conneg may or may not result in such convergence. If it doesn't, then there's no mechanism left to account for the lack of convergence -- a failure with greater consequences than those that result from a lack of uptake for conneg (beyond compression).

> Also note that the Accept header of IE8 doesn't really allow
> negotiation on any other practical axis except progressive JPEG vs.
> not progressive, which no one cares about anymore.
>

Sure it does. My client-side XSLT implementation *could* just send application/xml, except that my intent is best described by application/xhtml+xml, which is what I send -- except to IE, to which I must send application/xml. How do I detect IE? By looking for application/x-microsoft in the Accept header. Granted, that usage isn't what anyone expects, but it works for me; the sketch below shows the shape of it. But mostly, I see conneg used by systems which aren't meant for consumption by browsers, so I don't think it's a broken mechanism.
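For what it's worth, the selection amounts to something like this (a sketch of the practice just described, not my production code; the function name is invented):

    def choose_xslt_output_type(accept_header: str) -> str:
        """Select the response type from the Accept header alone."""
        # IE advertises vendor-specific types in Accept; their presence
        # is the tell, so no User-Agent sniffing is involved.
        if "application/x-microsoft" in accept_header.lower():
            return "application/xml"
        # Otherwise state the real intent.
        return "application/xhtml+xml"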
-Eric

Received on Tuesday, 9 November 2010 20:40:18 UTC