- From: Chris Lilley <chris@w3.org>
- Date: Wed, 23 Oct 2002 15:46:53 +0200
- To: Terje Bless <link@pobox.com>
- CC: W3C Validator <www-validator@w3.org>
On Wednesday, October 23, 2002, 9:53:34 AM, Terje wrote:

TB> Chris Lilley <chris@w3.org> wrote:

>>A quick test (of a well-formed but non-valid UTF-8 encoded SVG
>>document) revealed:
>>
>>  Note: The HTTP Content-Type field did not contain a "charset"
>>  attribute, but the Content-Type was one of the XML text/*
>>  sub-types. The relevant specification (RFC 3023) specifies a strong
>>  default of "us-ascii" for such documents so we will use this value
>>  regardless of any encoding you may have indicated elsewhere. If you
>>  would like to use a different encoding, you should arrange to have
>>  your server send this new encoding information.
>>
>>Firstly, that is neither desirable, nor an improvement.

TB> I think that is arguable. What's happening is that the Validator is being
TB> more strict about proper usage of the various ways Character Encoding can be
TB> specified.

I agree that it is arguable, and the TAG amongst others is arguing
about it. Strictness is fine; assuming things in the absence of
evidence is not, however.

>>Plus, it's arguably not true (the file was sent from local disk using
>>file upload, so it's a mystery where the 'HTTP Content-type' field came
>>from or how it figured out that a 'text/*' type had been sent).

TB> Since HTTP is the only protocol supported for uploading files to the
TB> Validator, I think it's safe to assume that your browser used HTTP to
TB> submit the file. No? :-)

True, good point. I had not thought of file upload as being a separate
HTTP transfer, but you are right. It would be interesting to know
exactly what the HTTP transfer looked like and what the headers were.
Do we have any test setup that I could upload a file to from various
browsers to see what they do?

TB> IOW, your browser submitted the file with some text/* sub type (probably
TB> text/html or text/xml), which has a strong default for us-ascii in the
TB> absence of a specific character encoding indication.

But it's an SVG file, and the MIME type for SVG is image/svg+xml. It's
set up that way on my machine, too. So, where did the validator get
text/html or text/xml from?

TB> However, it may be that the weak support for file uploads in current
TB> browsers justifies special rules for files submitted via file upload.

Possibly. I would rather know more about what headers are currently
sent in a file upload before arguing either for or against special
rules.

TB> I'd rather avoid having more special case rules than necessary,

Agreed

TB> but it's an avenue that could be explored if this turns out to be
TB> a problem.

TB> The best option is of course to ensure that all servers and browsers
TB> implement proper support for using HTTP Content-Type and the charset
TB> attribute correctly.

Yes; but in the case of an upload, there is no server on the
content-originating end of the transfer, as it is client to server. So
there is no server to be set up correctly. Should the accepting server
apply its own setup (e.g., filename extension to MIME type mapping) to
the received content?

-- 
 Chris                            mailto:chris@w3.org
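
To make the two questions above concrete, here is a minimal Python sketch of the charset rule quoted from the Validator's note, plus the extension-mapping option raised in the last paragraph. It is an illustration only, not the Validator's actual code: the names effective_charset, type_for_upload and EXTENSION_TYPES are hypothetical, and the utf-8 fallback for non-text XML types is an assumption based on XML's own default when no encoding is declared.

```python
import os
from email.message import Message

# Hypothetical mapping an accepting server might apply to uploads.
EXTENSION_TYPES = {".svg": "image/svg+xml", ".xml": "application/xml"}


def parse_content_type(header):
    """Split a Content-Type header into (media type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = header
    return msg.get_content_type(), msg.get_param("charset")


def effective_charset(content_type, xml_declared=None):
    """Pick the charset in the way the Validator's note describes.

    An explicit charset parameter wins. Without one, XML text/* types
    get the "us-ascii" default, while other XML types fall back to the
    document's own encoding declaration (assumed utf-8 if absent).
    Non-XML types are out of scope for this sketch and return None.
    """
    media_type, charset = parse_content_type(content_type)
    if charset:
        return charset.lower()
    maintype, _, subtype = media_type.partition("/")
    is_xml = subtype == "xml" or subtype.endswith("+xml")
    if maintype == "text" and is_xml:
        return "us-ascii"
    if is_xml:
        return (xml_declared or "utf-8").lower()
    return None


def type_for_upload(filename, browser_type):
    """Prefer the server's own extension mapping over the browser's type."""
    ext = os.path.splitext(filename)[1].lower()
    return EXTENSION_TYPES.get(ext, browser_type)


# A browser that labels an SVG upload as text/xml triggers the default:
print(effective_charset("text/xml", "UTF-8"))
# Remapping by extension first lets the declared encoding through:
print(effective_charset(type_for_upload("test.svg", "text/xml"), "UTF-8"))
```

Under these assumptions, the same UTF-8 SVG file is treated as us-ascii or as utf-8 depending only on which media type ends up attached to the upload, which is why the headers browsers actually send matter here.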
Received on Wednesday, 23 October 2002 09:46:55 UTC