
RE: [widgets] Potential bug in Rule for Identifying the Media Type of a File

From: Marcin Hanclik <Marcin.Hanclik@access-company.com>
Date: Thu, 22 Oct 2009 23:05:33 +0200
To: Marcos Caceres <marcosc@opera.com>
CC: public-webapps <public-webapps@w3.org>
Message-ID: <FAA1D89C5BAF1142A74AF116630A9F2C2890BCA533@OBEEX01.obe.access-company.com>
Hi Marcos, All,

>>It seems more logical to me to not
>>treat it as an extension. Look at all the .whatever files on your
>>system. I bet you 2 beers that 99% will be text files. And I bet you
>>".whatever.ext" will identify a type (like .something.plist).
I actually agree with this argument.
Even taking into account the implementation of ls [1] and its sorting by file extension, it seems that "file extension" is a fully abstract term that does not fit the usually hidden files that start with a dot.
So I am OK with the current P&C TSE and look forward to commenting on the next LCWD as soon as possible.

Thanks,
Marcin


[1] http://www.koders.com/c/fid5323BD5A5C27DBA053F42826EEA5EE8617B34335.aspx#L3064
________________________________________
From: marcosscaceres@gmail.com [marcosscaceres@gmail.com] On Behalf Of Marcos Caceres [marcosc@opera.com]
Sent: Thursday, October 22, 2009 6:12 PM
To: Marcin Hanclik
Cc: public-webapps
Subject: Re: [widgets] Potential bug in Rule for Identifying the Media Type of  a File

2009/10/22 Marcin Hanclik <Marcin.Hanclik@access-company.com>:
> Hi Marcos,
>
>>>To be clear: All we want to do is check if the file extension of a
>>>file case-insensitively matches one of the extensions in the File
>>>Identification Table. If you can't match it, then the MIME type gets
>>>resolved with SNIFF.
> Ok, I understand the intention of this section.
>
> The ranges are an implementation detail (optimization/efficiency of some implementation, not a MUST for all).
> So in general all the comments about Unicode comparison/difficulty etc are irrelevant.
> Thus ranges as well.

Ok, cool. We are in agreement.

> Then the only really disputable thing is whether ".jpg" should be sniffed (your proposal) or whether it is to be interpreted as pure file extension (my proposal).
> In my argumentation I showed that on *nix/*inux systems ".jpg" is a file extension to support the interpretation as pure file extension.
>

Yes, and on my Mac, it was not. It seems more logical to me to not
treat it as an extension. Look at all the .whatever files on your
system. I bet you 2 beers that 99% will be text files. And I bet you
".whatever.ext" will identify a type (like .something.plist).

> The suggestion to remove ranges aims at facilitating any extensions/additions to the spec. E.g. if we would like to add ".p12" or Unicode extension to the File Identification Table, we should only have to add it there and not change the processing algorithm.
>

I understand the rationale, but I don't see it as necessary. Let's just
cover what is in the spec. In version 2, if we need to support this
later, we can add it easily. It won't break backwards compat because
we will just be expanding the range.
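
For reference, the behaviour discussed in this thread can be sketched roughly
as follows. This is only an illustration under stated assumptions, not the
spec's algorithm text: the table below is a small illustrative subset rather
than the full File Identification Table, and the names file_extension, sniff
and identify_media_type are hypothetical. The input is assumed to be a bare
file name, not a path.

```python
# Illustrative subset only; the authoritative File Identification Table
# is the one in the spec.
FILE_IDENTIFICATION_TABLE = {
    ".html": "text/html",
    ".css": "text/css",
    ".js": "application/javascript",
    ".png": "image/png",
    ".jpg": "image/jpeg",
}


def file_extension(name):
    """Return the extension including its dot, or "" if there is none.

    A name whose only dot is the leading one (for example ".jpg") is
    treated as having no extension, as proposed in this thread.
    """
    dot = name.rfind(".")
    if dot <= 0:  # no dot at all, or only a leading dot
        return ""
    return name[dot:]


def sniff(name):
    """Hypothetical placeholder for the SNIFF step (content sniffing)."""
    return "sniffed"


def identify_media_type(name):
    """Match the extension case-insensitively, else fall back to SNIFF."""
    ext = file_extension(name).lower()
    return FILE_IDENTIFICATION_TABLE.get(ext) or sniff(name)
```

With this shape, "photo.JPG" resolves through the table, while ".jpg" and
".something.plist" both fall through to sniffing. Supporting a new extension
later would only mean adding a row to the table.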

--
Marcos Caceres
http://datadriven.com.au

Received on Thursday, 22 October 2009 21:06:33 GMT