
Re: reviewing draft-weber-iri-guidelines-00

From: Mykyta Yevstifeyev <evnikita2@gmail.com>
Date: Sun, 10 Jul 2011 06:33:02 +0300
Message-ID: <4E191D6E.4040706@gmail.com>
To: "Phillips, Addison" <addison@lab126.com>
CC: "public-iri@w3.org" <public-iri@w3.org>, "chris@lookout.net" <chris@lookout.net>
09.07.2011 22:18, Phillips, Addison wrote:
>> One additional comment.  Section 4, bullet 1.  I propose to mention that the
>> BOM character, if present in the trailing position, should be removed when
>> pre-processing.  This is also in accordance with the Unicode Standard.
>>
> No, that would probably be a bad thing to do. The trailing position in an IRI could be a piece of valid data:
>
>     http://example.com/myCharPicker?char=&#xFEFF;
Sorry, I meant the leading position.  That was my mistake.
>
> Outside its role as an announcer---at the start of a text file---BOM is not that useful [its job of "zero width non-breaking space" is better done by the WORD JOINER character], but it is still a valid code point that might be exchanged. There is no reason to require its removal.
If a BOM in the leading position is not removed, it might be treated as
part of the scheme, so that an otherwise valid absolute IRI would be
handled as a relative reference.  From Section 5.1:

>     If the first character of the string is not an ALPHA then this is not
>     a valid scheme and the pre-processed-reference-string may be handled
>     as a relative reference.
>
>     o  Continue to "Identify the path"
>     o  Abort further scheme processing
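
To illustrate the concern, here is a minimal Python sketch. It assumes that
"ALPHA" means an ASCII letter and that pre-processing would drop a single
leading U+FEFF only. The helper names starts_like_scheme and
strip_leading_bom are hypothetical and do not come from the draft.

```python
import string

BOM = "\ufeff"  # U+FEFF, ZERO WIDTH NO-BREAK SPACE (byte order mark)


def starts_like_scheme(reference: str) -> bool:
    """True if the first character is an ASCII letter (assumed ALPHA rule)."""
    return bool(reference) and reference[0] in string.ascii_letters


def strip_leading_bom(reference: str) -> str:
    """Hypothetical pre-processing step: drop one leading U+FEFF, nothing else."""
    return reference[1:] if reference.startswith(BOM) else reference


# A leading BOM hides the scheme, so the string would fall through to
# relative-reference handling.
with_bom = BOM + "http://example.com/"
scheme_seen_before = starts_like_scheme(with_bom)
scheme_seen_after = starts_like_scheme(strip_leading_bom(with_bom))

# A trailing U+FEFF is left alone, since it may be valid data.
trailing = "http://example.com/myCharPicker?char=" + BOM
trailing_unchanged = strip_leading_bom(trailing) == trailing
```

Under these assumptions the scheme is only recognised once the leading BOM
is gone, while a U+FEFF at the end of the query is untouched.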
Thanks,
Mykyta Yevstifeyev
>
> Cf. http://www.unicode.org/faq/utf_bom.html#bom6
>
> Addison
Received on Sunday, 10 July 2011 03:32:48 UTC
