- From: Mykyta Yevstifeyev <evnikita2@gmail.com>
- Date: Sun, 10 Jul 2011 06:33:02 +0300
- To: "Phillips, Addison" <addison@lab126.com>
- CC: "public-iri@w3.org" <public-iri@w3.org>, "chris@lookout.net" <chris@lookout.net>
09.07.2011 22:18, Phillips, Addison wrote:
>> One additional comment. Section 4, bullet 1. I propose to mention that the
>> BOM character, if present in the trailing position, should be removed when
>> pre-processing. This is also in accordance with the Unicode Standard.
>>
> No, that would probably be a bad thing to do. The trailing position in an IRI could be a piece of valid data:
>
> http://example.com/myCharPicker?char=

Sorry, I did mean the leading position. This was my mistake.

>
> Outside its role as an announcer---at the start of a text file---BOM is not that useful [its job of "zero width non-breaking space" is better done by the WORD JOINER character], but it is still a valid code point that might be exchanged. There is no reason to require its removal.

When in the leading position and not removed, it might be considered to be a part of the scheme, making the valid full IRI a relative one. From Section 5.1:

> If the first character of the string is not an ALPHA then this is not
> a valid scheme and the pre-processed-reference-string may be handled
> as a relative reference.
>
> o Continue to "Identify the path"
> o Abort further scheme processing

Thanks,
Mykyta Yevstifeyev

>
> Cf. http://www.unicode.org/faq/utf_bom.html#bom6
>
> Addison
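
To illustrate the leading-position concern, here is a minimal Python sketch. It is not the draft's algorithm: the helper names `starts_like_scheme` and `strip_leading_bom` are hypothetical, and the check covers only the "first character is ALPHA" test quoted from Section 5.1, assuming ALPHA means an ASCII letter.

```python
import string

BOM = "\ufeff"  # U+FEFF, the byte order mark / ZERO WIDTH NO-BREAK SPACE


def starts_like_scheme(reference: str) -> bool:
    """Hypothetical helper: True if the first character is an ASCII letter (ALPHA)."""
    return bool(reference) and reference[0] in string.ascii_letters


def strip_leading_bom(reference: str) -> str:
    """Hypothetical helper: drop one leading U+FEFF, leave all other occurrences alone."""
    return reference[1:] if reference.startswith(BOM) else reference


# A full IRI preceded by a BOM fails the ALPHA test, so it could be
# handled as a relative reference.
with_leading_bom = BOM + "http://example.com/"
print(starts_like_scheme(with_leading_bom))

# After removing only the leading BOM, the scheme test succeeds.
print(starts_like_scheme(strip_leading_bom(with_leading_bom)))

# A BOM in the trailing position may be valid data and is left untouched.
with_trailing_bom = "http://example.com/myCharPicker?char=" + BOM
print(strip_leading_bom(with_trailing_bom) == with_trailing_bom)
```

The sketch removes U+FEFF only at the very start of the string, so a BOM exchanged as data elsewhere in the IRI survives pre-processing.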
Received on Sunday, 10 July 2011 03:32:48 UTC