- From: Marcos Caceres <marcosc@opera.com>
- Date: Thu, 22 Oct 2009 13:56:17 +0200
- To: Marcin Hanclik <Marcin.Hanclik@access-company.com>
- Cc: public-webapps <public-webapps@w3.org>
On Fri, Oct 16, 2009 at 12:06 PM, Marcin Hanclik <Marcin.Hanclik@access-company.com> wrote:
> Hi Marcos,
>
> These are my remarks as discussed yesterday on the call.
>
> Comment a)
>
> 6.A.If all characters in the extension are outside the two ranges, then go to step 5 in this algorithm.
>
> Should be
>
> 6.A.If any of the characters in the extension is outside the two ranges, then go to step 5 in this algorithm.
>
> But this is also problematic since it infinitely loops the algorithm in this given case.
> So it should be:
>
> 6.A.If any of the characters in the extension is outside the two ranges, then go to step 7 in this algorithm.

I changed it to:

If any character in the extension is outside the U+0041-U+005A range and the U+0061-U+007A range, then go to step 7 in this algorithm.

> Another comment to 6.A:
> It seems that the whole algorithm assumes that the File Identification Table is constant.
> E.g. if any vendor would like to add some extension with a character outside of the given ranges (or we in W3C would like to do this in the future), then we would need to rewrite the algorithm.
>
> So what about this (we do not need the ranges IMHO):
> 6. Attempt to case-insensitively match the value of extension to one of the values in the file extension column in the file identification table. If there is a match, then return the corresponding value from the media type column and terminate this algorithm.
>

That is not possible because trying to do Unicode case comparisons is a nightmare (or so I'm told). This is why we restrict the check to ASCII. I find it highly unlikely that we will see standardized file extensions outside the ASCII range: none exist to date, and there is no evidence to suggest that they will exist in the future.

> And remove 6.A and 6.B as they were.
>
> *****************
> Comment b)
>
> 4. If the first character of the name is a U+002E 'FULL STOP' character, and the file name contains no other U+002E 'FULL STOP' character then go to step 7 of this algorithm.
>
> What about ".jpg"?
> Do you assume that this is filename and not file extension?

That is correct. This is the behavior on *nix systems (including Mac OS X).

> What about this:
> 4. If the first character of the name is a U+002E 'FULL STOP' character, and the file name contains no other U+002E 'FULL STOP' character then let extension be name and go to step 6 of this algorithm.
>

This is not consistent with the behavior of the operating systems I have tested.

> *****************
> Comment c)
>
> Given that the processing model is developed in prose, I think we MUST fix the ambiguity of the grammar anyway.
>
> Thus I suggest the following change from:
>
> file-name = base-name [ file-extension ]
> base-name = 1*allowed-char
> file-extension = "." 1*allowed-char
>
> to:
>
> file-name = 1*allowed-char
>
> (i.e. remove base-name and file-extension).
>
> The removal of ambiguity is motivated by the dependency of the WURI/WUS spec on P&C in this particular detail, so it is better to keep it right, I think.
> File extension does not play any role in WURI/WUS anyway.
> I think either the above change or the one in my mail below has to be implemented in the spec.

Ok, removed it.

--
Marcos Caceres
http://datadriven.com.au
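
For illustration, the extension handling discussed in comments a) and b) might look roughly like the Python sketch below. It is not spec text. The function name `media_type_from_extension`, the four-entry `FILE_IDENTIFICATION_TABLE`, taking the extension as everything after the last full stop, and returning `None` to stand in for "go to step 7" are all assumptions made for the example.

```python
# Illustrative subset of a file identification table (assumed entries,
# not the spec's full table).
FILE_IDENTIFICATION_TABLE = {
    ".html": "text/html",
    ".css": "text/css",
    ".png": "image/png",
    ".jpg": "image/jpeg",
}


def is_ascii_letter(ch):
    """True if ch is in U+0041-U+005A or U+0061-U+007A."""
    return "A" <= ch <= "Z" or "a" <= ch <= "z"


def media_type_from_extension(name):
    """Return a media type, or None where the algorithm would go to step 7."""
    # Step 4: a leading full stop with no other full stop means the whole
    # name is a file name (e.g. ".jpg"), not an extension.
    if name.startswith(".") and name.count(".") == 1:
        return None

    # Assumption: the extension is whatever follows the last full stop.
    dot = name.rfind(".")
    if dot == -1:
        return None
    extension = name[dot + 1:]

    # Step 6.A as revised: any character outside the two ASCII letter
    # ranges sends us to step 7. An empty extension is treated the same way.
    if not extension or not all(is_ascii_letter(c) for c in extension):
        return None

    # ASCII-only case folding: shift A-Z to a-z and touch nothing else,
    # so no Unicode case mapping is involved.
    lowered = "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c
                      for c in extension)
    return FILE_IDENTIFICATION_TABLE.get("." + lowered)


print(media_type_from_extension(".jpg"))       # treated as a file name
print(media_type_from_extension("photo.JPG"))  # matched case-insensitively
```

Checking the ranges before folding case keeps the comparison to a plain 26-letter shift, which is the point of restricting the check to ASCII.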
Received on Thursday, 22 October 2009 11:56:51 UTC