Re: Possible erratum

From: Martin Duerst <duerst@w3.org>
Date: Sun, 06 Mar 2005 02:13:40 +0900
Message-Id: <6.0.0.20.2.20050306010047.08977e30@localhost>
To: Chris Lilley <chris@w3.org> (by way of Martin Duerst <duerst@w3.org>), public-iri@w3.org

Hello Chris,

Many thanks for your report. I have listed this issue as
http://www.w3.org/International/iri-edit#theseCharacters-102.

At 04:20 05/03/05, Chris Lilley wrote:

 >Hello public-iri,
 >
 >In section 3.1 of RFC 3987, it says
 >
 > > Systems accepting IRIs MAY also deal with the printable characters in
 > > US-ASCII that are not allowed in URIs, namely "<", ">", '"', space,
 > > "{", "}", "|", "\", "^", and "`", in step 2 above. If these characters
 > > are found but are not converted, then the conversion SHOULD fail.
 > > Please note that the number sign ("#"), the percent sign ("%"), and
 > > the square bracket characters ("[", "]") are not part of the above
 > > list and MUST NOT be converted. Protocols and formats that have used
 > > earlier definitions of IRIs including these characters
 >
 >In the third sentence, 'these characters' is ambiguous. I read it as
 >referring to the most recently mentioned list - # % [ ] - but in fact (I
 >confirmed with Martin) it refers to the first list.

It is very clear, at least to me, that the only thing that makes sense
is indeed that this refers to the first list. There are no protocols
or formats that have used "#", "%", "[", and "]". And escaping any of
them would wreak havoc on the IRI/URI itself.

But of course, the current text is at least misleading.


 >Please reword to make this unambiguous.

Here is one way to reorder the sentences in the paragraph.
Not sure yet that this is the best version, however.

    Systems accepting IRIs MAY also deal with the printable characters in
    US-ASCII that are not allowed in URIs, namely "<", ">", '"', space,
    "{", "}", "|", "\", "^", and "`", in step 2 above. If these
    characters are found but are not converted, then the conversion
    SHOULD fail. Protocols and formats that have used earlier definitions
    of IRIs including these characters MAY require percent-encoding of
    these characters as a preprocessing step to extract the actual
    IRI from a given field. This preprocessing MAY also be used by
    applications allowing the user to enter an IRI. Please note that
    the number sign ("#"), the percent sign ("%"), and the square
    bracket characters ("[", "]") are not part of the above list
    and MUST NOT be converted.
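
To make the intended reading concrete, the preprocessing step in the
paragraph above could be sketched roughly as follows in Python. This is
only an illustration, not text proposed for the RFC: the names
EXTRA_CHARS and preprocess_iri_field are hypothetical, and the sketch
assumes the field has already been decoded to a Unicode string.

```python
# Printable US-ASCII characters that are not allowed in URIs, taken from
# the list in RFC 3987, section 3.1: < > " space { } | \ ^ `
# (EXTRA_CHARS is a hypothetical name used only for this sketch.)
EXTRA_CHARS = '<>" {}|\\^`'


def preprocess_iri_field(text: str) -> str:
    """Percent-encode only the characters listed in EXTRA_CHARS.

    "#", "%", "[" and "]" are deliberately left untouched, because
    converting them would change the meaning of the IRI. Non-ASCII
    characters are also left alone; they are handled by the regular
    IRI-to-URI mapping, not by this preprocessing step.
    """
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in EXTRA_CHARS else ch
        for ch in text
    )
```

For example, a space in a path segment is percent-encoded, while a
fragment separator, an existing percent-escape, and square brackets pass
through unchanged. In particular, an existing "%41" is not turned into
"%2541".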


Regards,    Martin. 
Received on Saturday, 5 March 2005 17:22:19 GMT