Re: I-D Action: draft-ietf-iri-3987bis-07.txt

From: Mykyta Yevstifeyev <evnikita2@gmail.com>
Date: Mon, 15 Aug 2011 06:54:05 +0300
Message-ID: <4E48985D.3000809@gmail.com>
To: public-iri@w3.org
Hello all,

I'd like to provide several comments on the new version of 
draft-ietf-iri-3987bis (I'll enter these issues in the tracker later).

Section 1.3.  I think these terms should be aligned with the work done in 
APPSAWG - RFC3536bis, which was approved as a BCP and is meant to provide 
a normative glossary of i18n terms for use within the IETF.

Section 1.4:

>     In text, characters outside US-ASCII are sometimes referenced by
>     using a prefix of 'U+', followed by four to six hexadecimal digits.
>     To represent characters outside US-ASCII in examples, this document
>     uses 'XML Notation'.
>     XML Notation uses a leading '&#x', a trailing ';', and the
>     hexadecimal number of the character in the UCS in between.  For
>     example, &#x44F; stands for CYRILLIC CAPITAL LETTER YA.  In this
>     notation, an actual '&' is denoted by '&amp;'.

Both of these notations are defined in RFC 5137, so a reference to that 
document is appropriate here.  I'll let the authors decide whether 
these explanations can be imported by reference from RFC 5137 or 
should be left in the document's text.  Ibid:

>     To denote actual octets in examples (as opposed to percent-encoded
>     octets), the two hex digits denoting the octet are enclosed in "<"
>     and ">".  For example, the octet often denoted as 0xc9 is denoted
>     here as <c9>.

I think you should mention that octets in <> are in hex.
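
For illustration, the three notations quoted above from Section 1.4 can be 
sketched in Python.  This is only a rough sketch: the function names are 
made up for this example, and the octet example assumes that 0xc9 comes 
from a Latin-1 encoded "É".

```python
def u_plus(ch: str) -> str:
    """'U+' followed by at least four uppercase hex digits."""
    return f"U+{ord(ch):04X}"


def xml_notation(text: str) -> str:
    """Write '&' as '&amp;' and each non-ASCII character as '&#x...;'."""
    out = []
    for ch in text:
        if ch == "&":
            out.append("&amp;")
        elif ord(ch) > 0x7F:
            out.append(f"&#x{ord(ch):X};")
        else:
            out.append(ch)
    return "".join(out)


def octet_notation(data: bytes) -> str:
    """Write each raw octet as two hex digits enclosed in '<' and '>'."""
    return "".join(f"<{b:02x}>" for b in data)


print(u_plus("\u044f"))                            # U+044F
print(xml_notation("a&\u044f"))                    # a&amp;&#x44F;
print(octet_notation("\u00c9".encode("latin-1")))  # <c9>
```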

Section 2:

>     ipath-empty    = 0<ipchar>

This assumes that <ipchar> is an RFC 5234 <prose-val>, which it isn't.  
Please see the similar Erratum 2033 for RFC 3986 
(http://www.rfc-editor.org/errata_search.php?eid=2033) and Erratum 2846 
for RFC 5092 (http://www.rfc-editor.org/errata_search.php?eid=2846); I 
believe there are more, though.  Please consider changing it to:

>     ipath-empty    = ""


>     path-sep       = "/"

... and the use of <path-sep> in the different <path-...> productions.  This 
is a readability issue; what we have in RFC 3987 is better, IMO.  This 
could be useful if there were two or more possibilities for <path-sep>, 
but not here.  Ibid:

>     iquery         = *( ipchar / iprivate / "/" / "?" )
>     ifragment      = *( ipchar / "/" / "?" )

Having compared these two productions, I'd like to ask whether there is 
some reason for not allowing <iprivate> in <ifragment>?
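
To make the difference concrete, here is a small Python sketch that finds 
private-use characters in a component.  The ranges are those of <iprivate> 
in RFC 3987; I am assuming the draft keeps them unchanged, and the helper 
names are hypothetical.

```python
# <iprivate> ranges as given in RFC 3987 (assumed unchanged in the draft).
IPRIVATE_RANGES = [
    (0xE000, 0xF8FF),
    (0xF0000, 0xFFFFD),
    (0x100000, 0x10FFFD),
]


def is_iprivate(ch: str) -> bool:
    """True if the character falls in one of the <iprivate> ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in IPRIVATE_RANGES)


def first_iprivate(component: str):
    """Index of the first private-use character, or None if there is none."""
    for i, ch in enumerate(component):
        if is_iprivate(ch):
            return i
    return None


# Under the quoted productions, this string would be acceptable as an
# <iquery> but not as an <ifragment>, because of the character at index 4.
print(first_iprivate("key=\ue000"))
print(first_iprivate("key=value"))
```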

Section 3.3:

>     For each character which is not allowed anywhere in a valid URI apply
>     the following steps.

Should it be "IRI" here?  No <ucschar> is allowed in a URI, so "IRI" 
is the better term.  Also related to pct-encoding: shouldn't the document 
specify how non-ASCII chars are going to be pct-encoded (e.g. <iprivate> 
in <ifragment>; but see above)?  Currently the <pct-form> production 
allows only 2 hex digits after the percent sign, whereas up to 6 may be 
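
For reference, a sketch of one possible per-octet approach.  It assumes the 
UTF-8-based conversion of RFC 3987, Section 3.1, where a character is first 
converted to UTF-8 and each resulting octet gets its own percent sign and 
two hex digits.  Whether the draft intends exactly this is the question; 
pct_encode_char is a hypothetical helper name.

```python
def pct_encode_char(ch: str) -> str:
    """Percent-encode one character, octet by octet, via UTF-8.

    Assumption: UTF-8 is the encoding, as in RFC 3987, Section 3.1.
    """
    return "".join(f"%{octet:02X}" for octet in ch.encode("utf-8"))


print(pct_encode_char("\u044f"))      # %D1%8F
print(pct_encode_char("\ue000"))      # %EE%80%80
print(pct_encode_char("\U0010FFFD"))  # %F4%8F%BF%BD
```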

Section 3.4: I may be wrong, but why do we mandate the use of pct-encoding 
with SHOULD when mapping <ireg-name>, while the IDNA procedure is optional?  
If there are reasons for preferring the former, please explain them in 
the document.
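
The two mappings give quite different results for the same host, which is 
why the rationale matters.  A rough Python sketch, using the 
"résumé.example.org" host from the RFC 3987 examples; note that Python's 
built-in "idna" codec implements IDNA 2003, which may not be the procedure 
the draft has in mind, and the function names are made up here.

```python
from urllib.parse import quote


def ireg_name_pct(host: str) -> str:
    """Percent-encode the UTF-8 octets of every non-ASCII character."""
    return quote(host, safe="")


def ireg_name_idna(host: str) -> str:
    """Convert each label to its ASCII form with Python's 'idna' codec.

    Assumption: IDNA 2003 is good enough for illustration.
    """
    return host.encode("idna").decode("ascii")


host = "r\u00e9sum\u00e9.example.org"
print(ireg_name_pct(host))   # r%C3%A9sum%C3%A9.example.org
print(ireg_name_idna(host))  # xn--rsum-bpad.example.org
```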

Section 3.6 specifies that Sections 3.4 and 3.5 define how to map an IRI to 
a URI.  Sections 3.4 and 3.5 should thus go into Section 3.6.

Section 6 (chars not allowed) contains the "Private use codepoints" 
bullet; however, the ABNF allows such chars to be present in IRIs.  Please 
align these two sections.


>        Tags (U+E0000-E0FFF): These characters provide a way to language
>        tag in Unicode plain text.  They are not appropriate for IRIs
>        because language information in identifiers cannot reliably be
>        input, transmitted (e.g. on a visual medium such as paper), or
>        recognized.

You should note here, IMO, that the use of tag chars is now deprecated, 
with a reference to RFC 6082.

Please update the reference to ISO 10646 so that it points to the newest 
version, 10646:2011.

Mykyta Yevstifeyev

14.08.2011 8:08, internet-drafts@ietf.org wrote:
> A New Internet-Draft is available from the on-line Internet-Drafts directories. This draft is a work item of the Internationalized Resource Identifiers Working Group of the IETF.
> 	Title           : Internationalized Resource Identifiers (IRIs)
> 	Author(s)       : Martin Duerst
>                            Michel Suignard
>                            Larry Masinter
> 	Filename        : draft-ietf-iri-3987bis-07.txt
> 	Pages           : 39
> 	Date            : 2011-08-13
Received on Monday, 15 August 2011 03:54:01 UTC
