Re: UTF-8 text

>>>>> "JMS" == James M Snell <jasnell@gmail.com> writes:

JMS> This is a more difficult question. In theory, yes, we ought to be
JMS> able to support these, but there's the question of backwards
JMS> compatibility.  We could define that the new :path field (and
JMS> referer, location, link, etc.) contain UTF-8 encoded IRIs, so for
JMS> backwards compatibility with HTTP/1, an implementation would need
JMS> to do the appropriate standard conversion to a URI. Going the other
JMS> direction, an impl could choose to leave it as a URI or convert it
JMS> to its IRI form. I think this makes a lot of sense and has a very
JMS> clear http/2 <--> http/1 translation. So I'm +1 on it.
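For concreteness, a minimal sketch of that two-way translation for the
path component only, assuming the usual RFC 3987 style mapping
(percent-encode the UTF-8 bytes of non-ASCII characters going to HTTP/1,
and decode only valid non-ASCII UTF-8 escapes coming back).  The function
names are hypothetical and not from any draft:

```python
import re
from urllib.parse import quote

# ASCII characters left untouched; '%' is kept so that existing
# escapes are not double-encoded.
_SAFE = "/?#%:@!$&'()*+,;=[]"

def iri_path_to_uri_path(path: str) -> str:
    """HTTP/2 -> HTTP/1: percent-encode the UTF-8 bytes of non-ASCII chars."""
    return quote(path, safe=_SAFE)

# A run of escapes whose byte values are all >= 0x80.
_HIGH_ESCAPES = re.compile(r"(?:%[89A-Fa-f][0-9A-Fa-f])+")

def uri_path_to_iri_path(path: str) -> str:
    """HTTP/1 -> HTTP/2: decode non-ASCII escapes that form valid UTF-8.

    ASCII escapes such as %2F are left alone, since decoding them
    could change the meaning of the path.
    """
    def decode(match: re.Match) -> str:
        raw = bytes.fromhex(match.group(0).replace("%", ""))
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            return match.group(0)  # not UTF-8: keep the escapes as-is
    return _HIGH_ESCAPES.sub(decode, path)
```

Leaving the URI form untouched in the HTTP/1 -> HTTP/2 direction remains
a valid choice; the second function only illustrates the optional
conversion.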

I also like this, but have to ask:  Do any non-10646 IRIs encode
differently depending on language?  I.e., would forcing everything
to 10646/utf8 lose information due to character unification?

Think of the differences between the zht, zhs, jp and ko glyphs
of characters unified by 10646.
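A small sketch of what I mean, assuming Python's stock big5, gb2312 and
shift_jis codecs (Korean is left out only to keep it short) and U+76F4
as an example character whose glyph is usually drawn differently per
locale.  The helper name is made up for illustration:

```python
def unified_code_points(char: str, codecs: list[str]) -> set[int]:
    """Round-trip one character through each legacy codec and collect
    the 10646 code points that come back."""
    points = set()
    for name in codecs:
        legacy_bytes = char.encode(name)       # locale-specific bytes
        restored = legacy_bytes.decode(name)   # back to 10646
        points.add(ord(restored))
    return points

# U+76F4 exists in the traditional Chinese, simplified Chinese and
# Japanese legacy character sets.
points = unified_code_points("\u76f4", ["big5", "gb2312", "shift_jis"])

# One code point, hence one UTF-8 sequence, whatever the source locale.
utf8_bytes = "\u76f4".encode("utf-8")
```

The legacy encoding at least implied a locale; once everything is
utf8, that implication is gone and only an explicit hint could restore
it.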

Perhaps it doesn't matter, even if so?  Or perhaps the utf8 IRI should
be accompanied by an optional language hint?

-JimC
-- 
James Cloos <cloos@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6

Received on Thursday, 18 April 2013 18:12:25 UTC