Re: How browsers display IRI's with mixed encodings

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Thu, 28 Jul 2011 10:45:35 +0900
Message-ID: <4E30BF3F.6020300@it.aoyama.ac.jp>
To: Leif H Silli <xn--mlform-iua@xn--mlform-iua.no>
CC: chris@lookout.net, public-iri@w3.org

On 2011/07/28 4:21, Leif H Silli wrote:
> Martin J. Dürst 27/7/'11,  13:40

>> <detour>
>> With mod_fileiri, you can have your cake and eat it too, if you get
>> all the settings right. I.e. you can keep the file names locally the
>> way you always have (e.g. D\xFCrst, I'm using 0xHH notation to express
>> that these are real bytes), but accept D%C3%BCrst externally (i.e.
>> pretend you're using UTF-8), and on top of that also accept D%FCrst
>> but externally redirect in to D%C3%BCrst (and then internally back to
>> D%FCrst). But you have to be careful to get the settings right, so it
>> may not be something the average server administrator wants to do.
>> </detour>
>
> Does it handle conversion from normalised UTF-8 to e.g. Mac filesystem
> UTF-8 too?

No, sorry. I used the iconv library for conversion, which is an
(optional?) part of Apache, but it doesn't do that kind of conversion.
Also, a single conversion might not be enough, because in the worst
case, the server might have file names in different normalization
forms (or not normalized at all) lying around. On the Mac, that's not a
problem because the filesystem itself handles this, but on Linux, it can
happen (e.g. if people working on Macs and on other boxes both upload
files).
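
To illustrate the mixed-normalization case, here is a minimal Python sketch. It is not how mod_fileiri works; the function names (normalization_variants, find_on_disk) and the in-memory set standing in for a directory listing are assumptions made up for this example. It decodes a percent-encoded UTF-8 path segment and then tries both the composed (NFC) and decomposed (NFD) spelling, since either might be what is actually stored on disk.

```python
import unicodedata
from typing import List, Optional, Set
from urllib.parse import unquote


def normalization_variants(name: str) -> List[str]:
    """Return the distinct NFC and NFD spellings of a name, NFC first."""
    variants: List[str] = []
    for form in ("NFC", "NFD"):
        candidate = unicodedata.normalize(form, name)
        if candidate not in variants:
            variants.append(candidate)
    return variants


def find_on_disk(requested_segment: str, existing_names: Set[str]) -> Optional[str]:
    """Decode a percent-encoded UTF-8 segment and look it up in either form.

    existing_names is a hypothetical stand-in for a directory listing.
    """
    decoded = unquote(requested_segment, encoding="utf-8", errors="strict")
    for candidate in normalization_variants(decoded):
        if candidate in existing_names:
            return candidate
    return None


# The same visible name in two normalization forms.
nfc_name = "D\u00fcrst"   # precomposed u-umlaut, UTF-8 bytes D \xC3\xBC rst
nfd_name = "Du\u0308rst"  # u followed by combining diaeresis, as a decomposed upload might look

# A request for D%C3%BCrst still finds a file stored in decomposed form.
match = find_on_disk("D%C3%BCrst", {nfd_name})
```

Trying both forms per request avoids renaming files, but it only covers names that are fully in one form or the other; a name that mixes composed and decomposed characters would still be missed.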

Regards,    Martin.
Received on Thursday, 28 July 2011 01:46:59 UTC
