Re: Possible erratum

From: Chris Lilley <chris@w3.org> (by way of Martin Duerst)
Date: Sun, 06 Mar 2005 07:24:57 +0900
Message-Id: <6.0.0.20.2.20050306072451.089a9c90@localhost>
To: public-iri@w3.org




On Saturday, March 5, 2005, 12:13:40 PM, Martin wrote:

MD> Hello Chris,

MD> Many thanks for your report. I have listed this issue as
MD> http://www.w3.org/International/iri-edit#theseCharacters-102.

MD> At 04:20 05/03/05, Chris Lilley wrote:

  >>Hello public-iri,
  >>
  >>In section 3.1 of RFC 3987, it says
  >>
  >> > Systems accepting IRIs MAY also deal with the printable characters in
  >> > US-ASCII that are not allowed in URIs, namely "<", ">", '"', space,
  >> > "{", "}", "|", "\", "^", and "`", in step 2 above. If these characters
  >> > are found but are not converted, then the conversion SHOULD fail.
  >> > Please note that the number sign ("#"), the percent sign ("%"), and
  >> > the square bracket characters ("[", "]") are not part of the above
  >> > list and MUST NOT be converted. Protocols and formats that have used
  >> > earlier definitions of IRIs including these characters
  >>
  >>In the third sentence, 'these characters' is ambiguous. I read it as
  >>referring to the most recently mentioned list - # % [ ] - but in fact (I
  >>confirmed with Martin) it refers to the first list.

MD> It is very clear, at least to me, that the only thing that makes sense
MD> is indeed that this refers to the first list. There are no protocols
MD> or formats that have used "#", "%", "[", and "]". And escaping any of
MD> them would wreak havoc on the IRI/URI itself.

MD> But of course, the current text is at least misleading.

I agree that, when the text is fully understood, only one interpretation
makes sense. When the text is read in the process of gaining
understanding, the ambiguity delays comprehension.


  >>Please reword to make this unambiguous.

MD> Here is one way to reorder the sentences in the paragraph.
MD> Not sure yet that this is the best version, however.

I agree this is clearer.

MD>     Systems accepting IRIs MAY also deal with the printable characters in
MD>     US-ASCII that are not allowed in URIs, namely "<", ">", '"', space,
MD>     "{", "}", "|", "\", "^", and "`", in step 2 above. If these
MD>     characters are found but are not converted, then the conversion
MD>     SHOULD fail. Protocols and formats that have used earlier definitions
MD>     of IRIs including these characters MAY require percent-encoding of
MD>     these characters as a preprocessing step to extract the actual
MD>     IRI from a given field. This preprocessing MAY also be used by
MD>     applications allowing the user to enter an IRI. Please note that
MD>     the number sign ("#"), the percent sign ("%"), and the square
MD>     bracket characters ("[", "]") are not part of the above list
MD>     and MUST NOT be converted.
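
For illustration, the preprocessing step in the reworded paragraph could be
sketched roughly as follows. This is only a sketch under assumptions: the
names EXTRA_ASCII and preprocess_iri_field and the convert flag are
hypothetical and appear nowhere in RFC 3987. It percent-encodes the ten
listed characters, leaves "#", "%", "[", and "]" untouched, and fails when
the listed characters are found but not converted.

```python
# Hypothetical helper, not taken from RFC 3987: the ten printable US-ASCII
# characters that are not allowed in URIs, as listed in section 3.1.
EXTRA_ASCII = '<>" {}|\\^`'


def preprocess_iri_field(text: str, convert: bool = True) -> str:
    """Percent-encode the characters in EXTRA_ASCII.

    "#", "%", "[", and "]" are not in the list and are left untouched.
    With convert=False, finding a listed character raises ValueError,
    mirroring "the conversion SHOULD fail".
    """
    if not convert:
        found = sorted({ch for ch in text if ch in EXTRA_ASCII})
        if found:
            raise ValueError(f"characters not allowed in URIs: {found!r}")
        return text
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in EXTRA_ASCII else ch
        for ch in text
    )


print(preprocess_iri_field("http://example.org/a b#frag"))
# prints http://example.org/a%20b#frag
```

Keeping "%" out of the list matters: encoding it would turn an existing
"%20" into "%2520" and change the meaning of the IRI.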


MD> Regards,    Martin.




--
  Chris Lilley                    mailto:chris@w3.org
  Chair, W3C SVG Working Group
  W3C Graphics Activity Lead 
Received on Saturday, 5 March 2005 22:29:03 GMT
