
about: scheme; Simplified Encoding Considerations

From: Joseph A Holsten <joseph@josephholsten.com>
Date: Mon, 21 Sep 2009 00:52:46 -0500
Message-Id: <9E1BA1C0-F936-4202-885C-CA26FA86DF60@josephholsten.com>
To: uri-review@ietf.org
URI people:

I intend to replace the current about: scheme Encoding Considerations[1]:

    Because many characters are not permitted with this syntax, the
    "segment" and "query" elements may contain characters from the
    Unicode Character Set [UCS] as suggested by URI [RFC3986], by first
    encoding those characters as octets to the UTF-8 character encoding
    [RFC3629]; then only those octets that do not correspond to
    characters in the unreserved set should be percent-encoded.

    By using UTF-8 encoding, there are no known compatibility issues
    mapping Internationalized Resource Identifiers [RFC3987] to about
    URIs.  Since about URIs do not use domain names, "ireg-name"
    conversion is unnecessary.

with the following (adapted from hixie's ws: scheme[2]):

    Characters in the "segment" or "query" parts that are excluded by the
    syntax defined above must be converted from Unicode to ASCII by
    encoding the characters as UTF-8 and then replacing the resulting
    bytes with their percent-encoded form as defined in the URI and IRI
    specifications [RFC3986] [RFC3987].
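
To make the proposed rule concrete, here is a minimal Python sketch of how I read it. The "segment" and "query" syntax is not quoted in this message, so the set of allowed characters is an assumption: it defaults to the RFC 3986 unreserved set and can be widened by the caller. The names `UNRESERVED` and `percent_encode` are hypothetical and do not come from the draft.

```python
# Assumption: characters permitted without encoding. The draft's actual
# "segment"/"query" grammar is not shown here, so this defaults to the
# RFC 3986 unreserved set (ALPHA / DIGIT / "-" / "." / "_" / "~").
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def percent_encode(text, allowed=UNRESERVED):
    """Return text with every character outside `allowed` percent-encoded.

    Each excluded character is first encoded as UTF-8, then every
    resulting byte is written as "%" followed by two uppercase hex digits.
    """
    out = []
    for ch in text:
        if ch in allowed:
            out.append(ch)
        else:
            out.extend("%{:02X}".format(byte) for byte in ch.encode("utf-8"))
    return "".join(out)


print(percent_encode("caf\u00e9"))  # caf%C3%A9
print(percent_encode("a b"))        # a%20b
```

Passing a larger `allowed` set (for example `UNRESERVED | {"/"}`) models a syntax that permits more characters, which is where the proposed wording can differ from the current text that encodes everything outside the unreserved set.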

Any objections or issues?

1: http://tools.ietf.org/html/draft-holsten-about-uri-scheme-02#section-4
2: http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-41#section-8.1

Joseph Holsten
Received on Monday, 21 September 2009 05:53:30 UTC
