RE: [hybi] ws: and wss: schemes

From: Phillips, Addison <addison@amazon.com>
Date: Fri, 4 Sep 2009 10:41:26 -0700
To: Julian Reschke <julian.reschke@gmx.de>, Ian Hickson <ian@hixie.ch>
CC: URI <uri@w3.org>, "hybi@ietf.org" <hybi@ietf.org>, "uri-review@ietf.org" <uri-review@ietf.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Message-ID: <4D25F22093241741BC1D0EEBC2DBB1DA01AD6282C2@EX-SEA5-D.ant.amazon.com>
Hello uri@,

[personal note, not representative of i18n wg]

> >>
> >>>    URI scheme syntax.
> >>>       In ABNF terms using the terminals from the IRI
> specifications:
> >>>       [RFC5234] [RFC3987]
> >>>
> >>>            "ws" ":" ihier-part [ "?" iquery ]
> >> That is even worse than before, because it now uses productions
> from the
> >> IRI spec defining *URI* syntax.
> >
> > ws: and wss: URLs are i18n-aware; why would we want to limit them
> to
> > ASCII?
> Because that's how URI and thus URLs are defined.

I agree with Julian. If you are defining a URI syntax, you can't use IRI productions to do so. Section 2.5 of the URI specification (RFC 3986), however, does allow what you mean here, when it says:

   When a new URI scheme defines a component that represents textual
   data consisting of characters from the Universal Character Set [UCS],
   the data should first be encoded as octets according to the UTF-8
   character encoding [STD63]; then only those octets that do not
   correspond to characters in the unreserved set should be percent-
   encoded.  For example, the character A would be represented as "A",
   the character LATIN CAPITAL LETTER A WITH GRAVE would be represented
   as "%C3%80", and the character KATAKANA LETTER A would be represented
   as "%E3%82%A2".
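
As a rough illustration of the rule quoted above, the following Python sketch encodes a piece of text as UTF-8 and percent-encodes every octet that is not an unreserved character. The helper name encode_component is hypothetical and appears in no specification; it assumes Python 3.7 or later, where urllib.parse.quote leaves exactly the unreserved set (letters, digits, "-", ".", "_", "~") untouched when safe is empty.

```python
from urllib.parse import quote


def encode_component(text: str) -> str:
    """Percent-encode text for use in a URI component (hypothetical helper).

    The text is encoded as UTF-8 octets, and every octet that is not an
    unreserved character is percent-encoded. Passing safe="" stops quote()
    from also leaving "/" unescaped, which is its default.
    """
    return quote(text, safe="", encoding="utf-8")


if __name__ == "__main__":
    # The three characters used as examples in Section 2.5:
    # A, LATIN CAPITAL LETTER A WITH GRAVE, KATAKANA LETTER A.
    for ch in ("A", "\u00C0", "\u30A2"):
        print(repr(ch), "->", encode_component(ch))
```

Run as a script, this should reproduce the three representations given in the quoted text: "A", "%C3%80" and "%E3%82%A2".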

If 'ws:' were defined as an IRI scheme, you could then use RFC 3987 to define its mapping to a URI. This is what is done in specs like XLink 1.1. Defining 'ws:' as an IRI scheme would not necessarily be a bad thing, but I've found that people tend to get confused about when an IRI can happily remain an IRI and when it needs to be mapped down to a URI.
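
To make the "mapped down to a URI" step concrete, here is a simplified Python sketch of the core of the RFC 3987 Section 3.1 mapping: each non-ASCII character is converted to UTF-8 and its octets are percent-encoded, while ASCII characters (including any existing percent-escapes) are left alone. The function name iri_to_uri and the example address are hypothetical. The sketch assumes the input is already a valid, normalized IRI, and it ignores the option of converting the host name with IDNA instead of percent-encoding it.

```python
def iri_to_uri(iri: str) -> str:
    """Map an IRI string to a URI string (simplified, hypothetical helper).

    ASCII characters are copied through unchanged. Every other character is
    encoded as UTF-8 and each resulting octet is written as %HH.
    No validation or normalization of the input is attempted.
    """
    out = []
    for ch in iri:
        if ord(ch) < 0x80:
            out.append(ch)
        else:
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical ws: address with non-ASCII characters in path and query.
    print(iri_to_uri("ws://example.com/caf\u00E9?q=\u30A2"))
```

The point of the sketch is that the IRI form and the URI form are different strings related by a defined conversion, which is why a specification needs to say which of the two it means at each point.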

> >
> > I've deferred to RFC3987 to sidestep this issue.
> A URI is not an IRI.
> You can refer to the mapping, but that really needs a few more
> words than "See RFC3987.".

It may not need many more words, but certainly a few more words.

Best Regards,


Addison Phillips
Globalization Architect -- Lab126
Chair -- W3C Internationalization WG

Internationalization is not a feature.
It is an architecture.

Received on Friday, 4 September 2009 17:42:11 UTC
