Re: [Uri-review] Re: msrp and msrps URI scheme review

From: Martin Duerst <duerst@it.aoyama.ac.jp>
Date: Fri, 29 Dec 2006 12:49:16 +0900
Message-Id: <6.0.0.20.2.20061229115846.074dd0c0@localhost>
To: Ted Hardie <hardie@qualcomm.com>, "Hisham Khartabil" <hisham.khartabil@gmail.com>, uri-review@ietf.org, uri@w3.org
Cc: Rohan Mahy <rohan@ekabal.com>, Adam Roach <adam@estacado.net>, Jon Peterson <jon.peterson@neustar.biz>, Ben Campbell <ben@estacado.net>, Robert Sparks <rjsparks@estacado.net>

Hello Ted, Hisham, others,

A few more comments on the msrp and msrps URI schemes, first
interspersed with your comments and then as additional comments.

At 04:59 06/12/20, Ted Hardie wrote:
>Reading through the definition, I am particularly concerned that we get
>comments on the drafts' discussions of escaping.  As it stands now, the
>draft says:
>
>>
>> If a userinfo component exists, it
>>   MUST be constructed only from "unreserved" characters, to avoid a
>>   need for escape processing.  Escaping MUST NOT be used in an MSRP
>>   URI.  Furthermore, a userinfo part MUST NOT contain password
>>   information.
>>
>>      The limitation of userinfo to unreserved characters is an
>>      additional restriction to the userinfo definition in RFC3986.
>>      That version allows reserved characters.  The additional
>>      restriction is to avoid the need for escaping.
>>
>>   The following is an example of a typical MSRP URI:
>>
>>      msrp://host.example.com:8493/asfd34;tcp
>>
>
>RFC 3986 changed most instances of "escaped" to percent-encoded,
>so it is probably better to use that language.

Yes.

>There are also a couple
>of questions which come up.  RFC 3986 allows percent-encoding to be
>used in reg-name,  so that it can be used to represent a host using
>characters outside of the ASCII range:
>
>  host        = IP-literal / IPv4address / reg-name
> reg-name    = *( unreserved / pct-encoded / sub-delims )
>
>Does the MSRP spec intend to forbid this usage of percent encoding, as
>well as that in the userinfo portion?

That would be a bad idea. I don't see the reason for excluding
escaping. Any device may already have some identifier for a user,
which may not be within the URI unreserved characters. In that
case, there are three alternatives:
a) Prohibiting escaping, so that implementers have to invent their
   own escaping/mapping mechanism.
b) Allowing escaping, which leads to straightforward mappings.
c) Using IRIs, which means that for a lot more characters escaping
   isn't needed.

The current draft chooses a), but in my view, either b) or c)
are clearly preferable. Otherwise, implementers will just 'reinvent'
escaping.
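
To illustrate alternative b), here is a minimal Python sketch of the
"straightforward mapping" that percent-encoding gives. The function
names and the sample identifier are made up for illustration; they are
not taken from the draft.

```python
from urllib.parse import quote, unquote


def to_userinfo(device_id: str) -> str:
    """Map a device-local user identifier to a URI-safe userinfo string.

    quote() leaves letters, digits, "-", ".", "_" and "~" (the RFC 3986
    unreserved set) untouched; safe="" makes it percent-encode everything
    else, including "/" and non-ASCII characters (as UTF-8 octets).
    """
    return quote(device_id, safe="")


def from_userinfo(userinfo: str) -> str:
    """Reverse the mapping: decode percent-encoded UTF-8 octets."""
    return unquote(userinfo)


if __name__ == "__main__":
    encoded = to_userinfo("jörg müller")  # hypothetical identifier
    print(encoded)
    print(from_userinfo(encoded))
```

Under alternative a), every implementation would need its own
equivalent of these two functions, with no guarantee of interoperability.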

>Further, the document is limiting the characters such that no characters
>are used which would *require* percent encoding.  It is possible, however, to
>apply percent encoding to unreserved characters.  I believe it may be
>necessary to describe the appropriate behavior when encountering
>a percent encoded character which is within the permitted range.

Yes. Issues like these make it even more preferable to just allow
escaping.
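
As a sketch of the behavior Ted is asking about: RFC 3986 (sections 2.3
and 6.2.2.2) treats a percent-encoded unreserved character as
equivalent to the character itself, so a receiver could normalize
before comparing. The helper below is hypothetical and only shows the
idea; the draft does not define it.

```python
import re

# RFC 3986 "unreserved" characters.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)
_PCT_TRIPLET = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_pct(component: str) -> str:
    """Decode percent-encoded unreserved characters; leave others encoded.

    Triplets that stay encoded get uppercase hex digits, which is the
    case normalization RFC 3986 recommends.
    """
    def replace(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        return "%" + match.group(1).upper()

    return _PCT_TRIPLET.sub(replace, component)


if __name__ == "__main__":
    print(normalize_pct("%61sfd34"))  # "%61" is an encoded "a"
    print(normalize_pct("a%2fb"))     # "/" is reserved, so it stays encoded
```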


>>We will be requesting registration of the msrp and msrps schemes
>>defined in the following Internet Draft, section 15.5:
>>
>>http://www.ietf.org/internet-drafts/draft-ietf-simple-message-sessions-18.txt
>>
>
>>As part of the procedure, we are required to request a review for the
>>schemes. Please review the schemes and send comments back to folk on
>>the CC list no later than January 14th, 2007.


Below are my additional comments [draft text is indented by three spaces]:

   6.  MSRP URIs

   URIs using the "msrp" and "msrps" schema are used to identify a

The plural of 'schema' is 'schemata' (classic) or alternatively 'schemas'.

   session of instant messages at a particular MSRP device.  MSRP URIs
   are ephemeral; an MSRP device will generally use a different MSRP URI
   for each distinct session.  An MSRP URI generally has no meaning
   outside of the associated session.

   An MSRP URI follows a subset of the URI syntax in Appendix A of
   RFC3986 [10], with a scheme of "msrp" or "msrps".  The syntax is
   described in Section 9.

   MSRP URIs are primarily expected to be generated and exchanged
   between systems, and are not intended for "human consumption".
   Therefore, they are encoded entirely in US-ASCII.

   The constructions for "userinfo", and "unreserved" are detailed in
   RFC3986 [10].  In order to allow IPV6 addressing, the construction
   for hostport is that used for SIP in RFC3261.

There is no need to use RFC 3261. RFC 3986 'hostport' includes
IPv6 addresses, so it should be used as a reference.


                                                  URIs designating MSRP
   over TCP MUST include the "tcp" transport parameter.

      Since this document only specifies MSRP over TCP, all MSRP URIs
      herein use the "tcp" transport parameter.  Documents that provide
      bindings on other transports should define respective parameters
      for those transports.

The syntax for 'transport' is
   transport = "tcp" / ALPHANUM
This means "tcp" or a single letter or digit. My guess is that what's
intended is something like
   transport = "tcp" / 1*ALPHANUM
i.e. one or more letters/digits. Please check/fix.
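
The difference between the two rules can be checked with a small
Python sketch. The regular expressions are my own translation of the
ABNF (assuming ALPHANUM means one ASCII letter or digit), and "sctp" is
only a hypothetical future transport name.

```python
import re

# transport = "tcp" / ALPHANUM       (as written in the draft)
# ABNF string literals are case-insensitive, hence re.IGNORECASE.
AS_WRITTEN = re.compile(r"tcp|[A-Za-z0-9]", re.IGNORECASE)

# transport = "tcp" / 1*ALPHANUM     (the guessed intent)
AS_GUESSED = re.compile(r"tcp|[A-Za-z0-9]+", re.IGNORECASE)


def matches(rule, text):
    """Return True if the whole of text is derivable from rule."""
    return rule.fullmatch(text) is not None


if __name__ == "__main__":
    for name in ("tcp", "x", "sctp"):
        print(name, matches(AS_WRITTEN, name), matches(AS_GUESSED, name))
```

With the rule as written, only "tcp" and one-character names are
accepted; any other multi-character transport name is rejected.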


   An MSRP URI hostport field identifies a participant in a particular
   MSRP session.  If the hostport contains a numeric IP address, it MUST
   also contain a port.

What if hostport is not numeric? Is there a default port?

                         The session-id part identifies a particular
   session of the participant.  The absence of the session-id part
   indicates a reference to an MSRP host device, but does not
   specifically refer to a particular session.

This part here contradicts the text at the start of section 6 that
these URIs are all ephemeral.

   A scheme of "msrps" indicates that the underlying connection MUST be
   protected with TLS.

   MSRP has an IANA-registered recommended port defined in Section 15.4.
   This value is not a default, as the URI negotiation process described
   herein will always include explicit port numbers.

So does this also apply to named hosts?

                                                      However, the URIs
   SHOULD be configured so that the recommended port is used whenever
   appropriate.  This makes life easier for network administrators who
   need to manage firewall policy for MSRP.

   The hostport component will typically not contain a userinfo
   component, but MAY do so to indicate a user account for which the
   session is valid.  Note that this is not the same thing as
   identifying the session itself.  If a userinfo component exists, it
   MUST be constructed only from "unreserved" characters, to avoid a
   need for escape processing.  Escaping MUST NOT be used in an MSRP
   URI.

See comments above.

         Furthermore, a userinfo part MUST NOT contain password
   information.

      The limitation of userinfo to unreserved characters is an
      additional restriction to the userinfo definition in RFC3986.
      That version allows reserved characters.  The additional
      restriction is to avoid the need for escaping.

Again, see comments above.

   The following is an example of a typical MSRP URI:

      msrp://host.example.com:8493/asfd34;tcp

   6.1.  MSRP URI Comparison

It is very good to have such a section!

   MSRP URI comparisons MUST be performed according to the following
   rules:

The MUST is too strict. Section 6 in RFC 3986 discusses various
possibilities. What should be done is to change the above to say
"in the context of the MSRP protocol..." or some such.

   1.  The scheme MUST match.  Scheme comparison is case insensitive.

   2.  If the hostpart contains an explicit IP address, and/or port,
       these are compared for address and port equivalence.  Otherwise,
       hostpart is compared as a case insensitive character string.

   3.  If the port exists explicitly in either URI, then it MUST match
       exactly.  A URI with an explicit port is never equivalent to
       another with no port specified.

Oh, so it's okay to not have a port. I'm confused, sorry.

   4.  The session-id part is compared as case sensitive.  A URI without
       a session-id part is never equivalent to one that includes one.

   5.  URIs with different "transport" parameters never match.  Two URIs
       that are identical except for transport are not equivalent.  The
       transport parameter is case-insensitive.

   6.  Userinfo parts are not considered for URI comparison.

   Path normalization is not relevant for MSRP URIs.  Escape
   normalization is not required due to character restrictions in the
   formal syntax.
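
One possible reading of rules 1 to 6, sketched in Python. This is an
interpretation, not the draft's algorithm: it assumes the draft's
character restrictions hold (so ";" cannot occur before the transport
parameter), it compares ports numerically, and the function names are
invented for the example.

```python
import ipaddress
from urllib.parse import urlsplit


def _host_key(host):
    """Rule 2: IP literals compare by address, names case-insensitively."""
    try:
        return ipaddress.ip_address(host)
    except ValueError:
        return host.lower()


def msrp_key(uri):
    """Reduce an MSRP URI to the tuple of parts that take part in comparison."""
    # Everything after the first ";" is the transport and further parameters.
    base, _, params = uri.partition(";")
    parts = urlsplit(base)
    transport = params.split(";")[0].lower()                  # rule 5
    session_id = parts.path[1:] if parts.path else None       # rule 4
    return (
        parts.scheme.lower(),                                 # rule 1
        _host_key(parts.hostname or ""),                      # rule 2
        parts.port,                                           # rule 3 (None if absent)
        session_id,
        transport,
    )                                                         # rule 6: userinfo ignored


def msrp_equivalent(a, b):
    return msrp_key(a) == msrp_key(b)


if __name__ == "__main__":
    ref = "msrp://host.example.com:8493/asfd34;tcp"
    print(msrp_equivalent(ref, "MSRP://alice@HOST.example.com:8493/asfd34;TCP"))
    print(msrp_equivalent(ref, "msrp://host.example.com/asfd34;tcp"))
```

Note that the sketch has to decide what a missing port means (here: it
never matches an explicit one, per rule 3), which is exactly the point
that the surrounding text leaves unclear.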

   6.2.  Resolving MSRP Host Device

   An MSRP host device is identified by the hostport of an MSRP URI.

   If the hostport contains a numeric IP address and port, they MUST be
   used as listed.

   If the hostport contains a host name and a port, the connecting
   device MUST determine a host address by doing an A or AAAA DNS query,
   and use the port as listed.

   If a connection attempt fails, the device SHOULD attempt to connect
   to the addresses returned in any additional A or AAAA records, in the
   order the records were presented.
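
The resolution steps above can be sketched roughly as follows in
Python. The function names are invented for the example, and
socket.getaddrinfo() stands in for the A/AAAA queries; the order it
returns is decided by the system resolver, so "the order the records
were presented" is only approximated.

```python
import socket


def _tcp_connect(family, sockaddr):
    """Open a TCP connection to one resolved address."""
    sock = socket.socket(family, socket.SOCK_STREAM)
    try:
        sock.connect(sockaddr)
    except OSError:
        sock.close()
        raise
    return sock


def connect_in_order(host, port, resolve=socket.getaddrinfo, connect=_tcp_connect):
    """Try each resolved address in turn, using the port exactly as listed.

    resolve and connect are parameters only so the logic can be exercised
    without a network; the defaults use the real resolver and real sockets.
    """
    last_error = None
    for family, _type, _proto, _canon, sockaddr in resolve(
        host, port, type=socket.SOCK_STREAM
    ):
        try:
            return connect(family, sockaddr)
        except OSError as exc:
            last_error = exc  # remember the failure, try the next record
    raise last_error if last_error else OSError("no addresses for " + host)
```

A numeric IP address passes through the same code path, since
getaddrinfo() returns a numeric host unchanged.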

      This process assumes that the connection port is always known
      prior to resolution.  This is always true for the MSRP URI uses
      described in this document, that is, URIs exchanged in the SDP
      offer and answer.  The introduction of relays may create
      situations where this is not the case.  For example, the MSRP URI
      that a user enters into a client

The start of section 6 said that URIs are not for human consumption,
but now the user has to enter one.


                                       to configure it to use a relay
      may be intended to be easily remembered and communicated by
      humans, and therefore is likely to omit the port.  Therefore, the
      relay specification [23] may describe additional steps to resolve
      the port number.

It looks to me as if the relay specification either does or does not
describe additional steps. "may" doesn't make sense to me here.

   MSRP devices MAY use other methods for discovering other such
   devices, when appropriate.  For example, MSRP endpoints may use other
   mechanisms to discover relays, which are beyond the scope of this
   document.


[from section 9]
   MSRP-URI = msrp-scheme "://" [userinfo "@"] hostport
       ["/" session-id] ";" transport *( ";" URI-parameter)
                        ; userinfo as defined in RFC3986, except
                        ; limited to unreserved.
                        ; hostport as defined in RFC3261

This should be changed to
                        ; hostport as defined in RFC3986

   msrp-scheme = "msrp" / "msrps"
   session-id = 1*( unreserved / "+" / "=" / "/" )
                        ; unreserved as defined in RFC3986

Does the "/" have the generally intended meaning (hierarchy
delimiter)? If not, it should not be used here (or should
be escaped).

   transport = "tcp" / ALPHANUM

See my comment above about ALPHANUM being a single letter/digit.

   URI-parameter = token ["=" token]
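
To make the grammar above concrete, here is a rough Python translation
into a regular expression. It is a sketch under assumptions: the host
pattern is a simplification of the real hostport rule, "token" is
assumed to be the RFC 3261 token, and parse_msrp_uri is an invented
name.

```python
import re

UNRESERVED = r"A-Za-z0-9\-._~"             # RFC 3986 unreserved
TOKEN = r"[A-Za-z0-9\-.!%*_+`'~]+"         # assumed: token as in RFC 3261

MSRP_URI = re.compile(
    r"(?P<scheme>msrps?)://"
    rf"(?:(?P<userinfo>[{UNRESERVED}]*)@)?"               # unreserved only
    r"(?P<host>\[[0-9A-Fa-f:.]+\]|[A-Za-z0-9\-.]+)"       # simplified host
    r"(?::(?P<port>[0-9]+))?"
    rf"(?:/(?P<session_id>[{UNRESERVED}+=/]+))?"
    r";(?P<transport>tcp|[A-Za-z0-9])"                    # ALPHANUM as written
    rf"(?P<params>(?:;{TOKEN}(?:={TOKEN})?)*)",
    re.IGNORECASE,
)


def parse_msrp_uri(uri):
    """Return the named parts of an MSRP URI, or None if it does not match."""
    match = MSRP_URI.fullmatch(uri)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_msrp_uri("msrp://host.example.com:8493/asfd34;tcp"))
    # "/" inside session-id is accepted as ordinary data here, although a
    # generic URI parser would see two path segments.
    print(parse_msrp_uri("msrp://host.example.com:8493/a/b;tcp"))
```

Both issues raised above show up directly: "a/b" is a single
session-id for this grammar, and a transport such as the hypothetical
"sctp" does not parse at all.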

Regards,     Martin.



#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     
Received on Friday, 29 December 2006 08:46:26 GMT
