Re: reviewing draft-weber-iri-guidelines-00

09.07.2011 22:18, Phillips, Addison wrote:
>> One additional comment.  Section 4, bullet 1.  I propose to mention that the
>> BOM character, if present in the trailing position, should be removed when
>> pre-processing.  This is also in accordance with the Unicode Standard.
>>
> No, that would probably be a bad thing to do. The trailing position in an IRI could be a piece of valid data:
>
>     http://example.com/myCharPicker?char=
Sorry, I meant the leading position.  That was my mistake.
>
> Outside its role as an announcer---at the start of a text file---BOM is not that useful [its job of "zero width non-breaking space" is better done by the WORD JOINER character], but it is still a valid code point that might be exchanged. There is no reason to require its removal.
If a BOM in the leading position is not removed, it might be treated as
part of the scheme, so that an otherwise valid absolute IRI would be
handled as a relative reference.  From Section 5.1:

>     If the first character of the string is not an ALPHA then this is not
>     a valid scheme and the pre-processed-reference-string may be handled
>     as a relative reference.
>
>     o  Continue to "Identify the path"
>     o  Abort further scheme processing
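
To illustrate, here is a minimal sketch in Python.  The helper names are
hypothetical and not taken from the draft, and the ALPHA check assumes
the ASCII-letter definition (A-Z / a-z) from RFC 5234; it is not meant
as the draft's actual algorithm.

```python
import string

BOM = "\ufeff"  # U+FEFF, ZERO WIDTH NO-BREAK SPACE (byte order mark)


def starts_with_alpha(reference: str) -> bool:
    """Return True if the first character is an ASCII letter (ALPHA)."""
    return len(reference) > 0 and reference[0] in string.ascii_letters


def strip_leading_bom(reference: str) -> str:
    """Remove a single BOM from the leading position only."""
    return reference[1:] if reference.startswith(BOM) else reference


# A full IRI that arrives with a BOM in front of the scheme.
with_bom = BOM + "http://example.com/"

# The first character is U+FEFF, not an ALPHA, so scheme detection fails
# and the string would fall through to relative-reference handling.
print(starts_with_alpha(with_bom))                     # False

# After removing the leading BOM the scheme is recognised again.
print(starts_with_alpha(strip_leading_bom(with_bom)))  # True
```

Only the leading position is touched here, so a BOM in the trailing
position, as in the myCharPicker example, would be left as data.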
Thanks,
Mykyta Yevstifeyev
>
> Cf. http://www.unicode.org/faq/utf_bom.html#bom6
>
> Addison

Received on Sunday, 10 July 2011 03:32:48 UTC