Re: URI Syntax Limitations from Manuel Urueña on 2005-11-15 (uri@w3.org from November 2005)

From: Manuel Urueña <muruenya@it.uc3m.es>
Date: Tue, 15 Nov 2005 13:43:00 +0100
To: "Roy T. Fielding" <fielding@gbiv.com>
Cc: uri@w3.org
Message-Id: <1132058580.6620.74.camel@requiem.it.uc3m.es>
Comments inline


On Wed, 09-11-2005 at 15:50 -0800, Roy T. Fielding wrote:
> On Nov 9, 2005, at 4:38 AM, Manuel Urueña wrote:
> 
> > While studying the deployment of new protocols in Internet, I've found
> > some limitations to current URI syntax:
> >
> > - Although URIs can include a port number, the transport protocol to be
> > used cannot be specified (i.e. UDP or TCP in DNS). Thus, each URI 
> > scheme
> > is bound to a single transport protocol. This limitation could hinder
> > the usage of newer protocols like SCTP in current applications (e.g.
> > HTTP over SCTP).
> 
> It isn't a limitation. Authorities are bound by transport protocol
> as well, so there is no value in treating them as the same space.
> It is simply a choice as to whether the junction is part of the
> authority syntax or the scheme syntax.  Since URIs were deployed
> long before anyone considered that question, the choice has
> narrowed to different schemes.  In other words, just define
> "http.sctp" and "http.udp" schemes, if so desired.

Yes, http.sctp does work. However, it seems harder to read than, say,
sctp.80 in the port section, although this could just be my personal
taste, of course.

Also, as the scheme part is parsed by each application, adding new
"fields" could be more complex (e.g. "http" = "http.tcp" = "http.dns" =
"http.dns.tcp" <> "http.rserpool" = "http.tcp.rserpool", ...).

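To illustrate the bookkeeping involved, here is a minimal sketch in C of
how an application might normalize such dotted schemes. Everything in it
is an assumption made for illustration: the suffix convention, the
default values ("tcp" and "dns"), and the names parse_scheme() and
same_scheme() are hypothetical and not part of RFC 3986 or any registry.

```c
#include <string.h>

/* HYPOTHETICAL decomposition of a dotted scheme such as "http.dns.tcp".
 * The suffix convention and the defaults are assumptions for this sketch. */
struct scheme_parts {
    char base[16];       /* e.g. "http" */
    char transport[16];  /* "tcp" unless a transport suffix is given */
    char resolver[16];   /* "dns" unless a resolver suffix is given */
};

static int is_transport(const char *s) {
    return !strcmp(s, "tcp") || !strcmp(s, "udp") || !strcmp(s, "sctp");
}

static int is_resolver(const char *s) {
    return !strcmp(s, "dns") || !strcmp(s, "rserpool");
}

/* Returns 0 on success, -1 on an unknown suffix or overlong input. */
int parse_scheme(const char *scheme, struct scheme_parts *out) {
    char buf[64];
    if (strlen(scheme) >= sizeof buf) return -1;
    strcpy(buf, scheme);

    char *tok = strtok(buf, ".");
    if (tok == NULL || strlen(tok) >= sizeof out->base) return -1;
    strcpy(out->base, tok);
    strcpy(out->transport, "tcp");   /* assumed default */
    strcpy(out->resolver, "dns");    /* assumed default */

    /* Remaining suffixes may appear in any order. */
    while ((tok = strtok(NULL, ".")) != NULL) {
        if (is_transport(tok)) strcpy(out->transport, tok);
        else if (is_resolver(tok)) strcpy(out->resolver, tok);
        else return -1;
    }
    return 0;
}

/* Two schemes are equivalent when all three normalized parts match. */
int same_scheme(const char *a, const char *b) {
    struct scheme_parts pa, pb;
    if (parse_scheme(a, &pa) != 0 || parse_scheme(b, &pb) != 0) return 0;
    return !strcmp(pa.base, pb.base) &&
           !strcmp(pa.transport, pb.transport) &&
           !strcmp(pa.resolver, pb.resolver);
}
```

Under these assumptions, same_scheme("http", "http.dns.tcp") is true and
same_scheme("http", "http.rserpool") is false. Every application that
dispatches on the scheme would need equivalent logic, and the tables of
known suffixes would have to grow with each new field.
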
> The technical advantage of them being different schemes is because
> applications will need a different protocol module to resolve them.

That's true in most cases, although in some migration scenarios the new
protocols try to emulate the previous API, thus the application can
maintain the same system calls.

For example, for one-to-one communication the SCTP socket API is exactly
the same as the TCP one. It is only necessary to replace:

int fd = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);

by

int fd = socket(PF_INET, SOCK_STREAM, IPPROTO_SCTP);
                                      ^^^^^^^^^^^^

> Since well-designed URI applications use a dispatch-on-scheme
> table for handlers, it is technically superior to place that info
> in the scheme name instead of inside the authority component.

Well, if the URI syntax defined an (optional) Transport ID part inside
the authority section, URI parser libraries could return that token to
the application, which could then employ a dispatch-on-transport
mechanism. Alternatively, the library could just return the appropriate
IPPROTO_* value, so the application could easily switch between TCP and
SCTP on demand.
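
As a rough sketch of what such a parser helper could look like in C,
assuming the purely hypothetical "sctp.80" notation in the port section
mentioned earlier (it is not valid RFC 3986 syntax), and with
parse_transport_port() and socket_type_for() being names invented for
this example:

```c
#include <netinet/in.h>   /* IPPROTO_TCP, IPPROTO_UDP, IPPROTO_SCTP */
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>   /* SOCK_STREAM, SOCK_DGRAM */

#ifndef IPPROTO_SCTP
#define IPPROTO_SCTP 132  /* IANA protocol number, for older headers */
#endif

/* Parses a HYPOTHETICAL port section such as "sctp.80" or plain "80".
 * Stores the IPPROTO_* value and the port number.
 * Returns 0 on success, -1 on an unknown transport or invalid port. */
int parse_transport_port(const char *s, int *proto, int *port) {
    const char *num = s;
    const char *dot = strchr(s, '.');

    *proto = IPPROTO_TCP;  /* assumed default when no transport id is given */
    if (dot != NULL) {
        size_t n = (size_t)(dot - s);
        if (n == 3 && strncmp(s, "tcp", n) == 0) *proto = IPPROTO_TCP;
        else if (n == 3 && strncmp(s, "udp", n) == 0) *proto = IPPROTO_UDP;
        else if (n == 4 && strncmp(s, "sctp", n) == 0) *proto = IPPROTO_SCTP;
        else return -1;
        num = dot + 1;
    }

    if (*num < '0' || *num > '9') return -1;  /* reject sign, space, empty */
    char *end;
    long v = strtol(num, &end, 10);
    if (*end != '\0' || v < 1 || v > 65535) return -1;
    *port = (int)v;
    return 0;
}

/* Socket type matching the transport: datagram for UDP, stream otherwise
 * (one-to-one SCTP sockets use SOCK_STREAM, like TCP). */
int socket_type_for(int proto) {
    return proto == IPPROTO_UDP ? SOCK_DGRAM : SOCK_STREAM;
}
```

The application would then call socket(PF_INET, socket_type_for(proto),
proto) and keep the rest of its code unchanged.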

> > - Host identification is limited to plain IP addesses (no IPV6 scope 
> > id)
> > or DNS-like hostnames. Therefore, although each scheme could define an
> > alternative resolution mechanism for the "host" part, this limitation
> > could also hinder the deployment of newer Service Discovery (e.g. SLP)
> > or Load Balancing (e.g. Rserpool) protocols, that offer some kind of
> > alternative name-resolution mechanism.
> 
> No such limitation exists, so my guess is that you are assuming
> something that isn't said in STD 66.

Sorry, my mistake: RFC 3986 only says DNS is the most "common" registry
mechanism. I incorrectly assumed it was mandatory because it is so
widespread.

However, as each scheme is bound to a single registry mechanism, this
has some drawbacks, even if using an extended scheme like "http.dns".
Please let me elaborate a little more.

A possible problem comes from collapsing an entire name-resolution system
into a single token. For example, http.dns://www.example.com could mean
"ask for the AAAA or A Resource Record of www.example.com". But then DNS
service discovery cannot be employed in URIs, because
"http.dns://_www._tcp.example.com" would not query the SRV RR unless a
new "http.dns_srv" scheme is added, and there would be too many schemes
to cover all the possible variations.

Moreover, an advantage of including a name-resolution id in the registry
part is that it could be processed transparently by the OS. Applications
should not need to care which name-resolution mechanism is employed, as
long as they get a valid IP address and the target port number.

On the other hand, with a simple scheme-handler mechanism, each
application would have to deal with different service discovery
libraries and be modified to support each new service discovery
mechanism. If the registry id were resolved by the OS, it would only be
necessary to add a plugin or patch it, and all applications could
benefit from the newest service discovery protocol.
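
The existing getaddrinfo() call already works along these lines for
ordinary host names: the application asks for an address and the system
resolver decides how to obtain it. The following minimal C sketch shows
only that existing behaviour. It does not implement any registry id, and
resolve_first() is a helper name made up for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose getaddrinfo() in strict modes */
#endif
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolves host/service through the system resolver and copies the first
 * result into *out. Returns 0 on success or a getaddrinfo() error code.
 * The application never learns which lookup mechanism was used. */
int resolve_first(const char *host, const char *service,
                  struct sockaddr_storage *out, socklen_t *outlen) {
    struct addrinfo hints, *res = NULL;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    int rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0) return rc;

    memcpy(out, res->ai_addr, res->ai_addrlen);
    *outlen = res->ai_addrlen;
    freeaddrinfo(res);
    return 0;
}
```

A call such as resolve_first("www.example.com", "80", &addr, &len) yields
an address and port ready for connect(), regardless of whether the
answer came from a hosts file, DNS, or another source configured on the
system.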


> > Browsing the mailing list archive I've seen that some newer protocols
> > define some arguments in the query part (like "transport=SCTP") in 
> > their
> > URI formats in order to cope with the first issue.
> >
> > However IMHO this mechanism could not be applied to all URI schemes
> > already defined, as the "query" part is optional, thus many schemes do
> > not allow any arguments.
> 
> I don't know of any worthwhile schemes that have done something
> as goofy as placing a transport protocol in the query part.
> All such proposals in the past have been told to use the scheme
> name, as in http/https. The only reason you don't see http.sctp
> in the registry is because nobody has asked for it yet.

Best regards,
--Manuel

-- 
Manuel Uruen~a - Universidad Carlos III de Madrid
GPG FP: C20B 7F07 09E3 FB95 7AD9  D03A DA93 AA09 4EE2 675B
http://www.it.uc3m.es/netcom
Received on Tuesday, 15 November 2005 12:43:11 UTC