Re: Browsers and .onion names from Matthew Kerwin on 2015-11-30 (ietf-http-wg@w3.org from October to December 2015)

From: Matthew Kerwin <matthew@kerwin.net.au>
Date: Mon, 30 Nov 2015 19:52:09 +1000
To: Eliot Lear <lear@cisco.com>
Cc: Alex Rousskov <rousskov@measurement-factory.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CACweHNCKk_M2cEW3to-JstMyO5v7HVTfWYtr3pU1Gub=9xjB_g@mail.gmail.com>

On 30 November 2015 at 17:55, Eliot Lear <lear@cisco.com> wrote:

>
> And so to this point:
>
> > Personally, I'd **really** prefer the Web not to be locked into one
> > address resolution protocol (especially when you look at how problematic
> > our current solution can be).
>
>
> While "locked into" might not be what I want either, the benefit of using
> a single address resolution protocol is that there is inherent
> consistency.  Those publishing a name do so in one way for which it is
> assumed that the client understands.  That is quite powerful.  The
> limitation is that it is *incredibly* hard to evolve that mechanism due to
> its ossification.  They are yin and yang.  If that sounds like I'm divided
> on this issue, then you've understood my meaning.  I would suggest
> it is not black and white and that there are tradeoffs.
>
> So... pragmatically, what is appropriate?
>
Looking at this from the outside, my question is: how do you know how to
resolve a "host"? We've said this:

   If the host identifier is provided as an IP address, the origin
   server is the listener (if any) on the indicated TCP port at that IP
   address.  If host is a registered name, the registered name is an
   indirect identifier for use with a name resolution service, such as
   DNS, to find an address for that origin server.

As an application author, I read that as using the same sort of words, and
giving the same level of detail, as getaddrinfo(3), whose man page on my
system says:

   ... "node" specifies either a numerical network
   address (for IPv4, numbers-and-dots notation as supported by
   inet_aton(3); for IPv6, hexadecimal string format as supported by
   inet_pton(3)), or a network hostname, whose network addresses are
   looked up and resolved.

It never says *how* they're looked up and resolved, just as RFC 7230 never
says how we're supposed to resolve a registered name. Perfect! Obviously, if
I were writing a Tor thing I'd probably go out of my way to detect .onion
addresses and do special magic; but I'm just writing a boring old HTTP
library. I've never even heard of tor. That's something to do with Frodo and
Gollum, isn't it?

If I had faith in my subsystems, I'd assume that getaddrinfo knows
the special doodads for which to watch when given a non-IP-address "node",
and thus knows how to resolve it using the appropriate whatchermacallits.
(Similarly to how it already knows that "1.2.3.4" and "1::2" are IP
addresses.) So, should we be filing bug reports against the C library
(glibc, on Linux), to update its getaddrinfo implementation? (It's a
library function, not a syscall, but the question stands.)
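
To make the "boring old HTTP library" concrete, here is a minimal sketch in Python, whose socket.getaddrinfo wraps the C function. The helper names host_kind and resolve are hypothetical, not from any real library. The only decision the library makes itself is "IP literal or registered name"; how a registered name gets resolved is left entirely to the resolver underneath.

```python
import ipaddress
import socket


def _strip_brackets(host: str) -> str:
    # URI syntax wraps IPv6 literals in brackets; getaddrinfo wants them bare.
    if host.startswith("[") and host.endswith("]"):
        return host[1:-1]
    return host


def host_kind(host: str) -> str:
    """Classify a URI host as 'ip-literal' or 'reg-name' (hypothetical helper)."""
    try:
        ipaddress.ip_address(_strip_brackets(host))
        return "ip-literal"
    except ValueError:
        return "reg-name"


def resolve(host: str, port: int):
    """Return socket addresses for host:port, trusting getaddrinfo to know
    how to look the name up (hosts file, DNS, or anything else configured)."""
    infos = socket.getaddrinfo(_strip_brackets(host), port,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr).
    return [info[4] for info in infos]
```

Note that nothing here knows or cares about .onion: a name such as "example.onion" is just another "reg-name" handed to the resolver.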

Aside: I still find it kind of funny that we have an RFC that basically
says: If you implement this spec and the Tor protocol, do this; however if
you implement this spec but not the Tor protocol, do that. (And the
implied: If you don't implement this spec, ¿lol?)

Not wrong, per se, just kind of funny.
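
For illustration, the "do this / do that" split might look like the sketch below. It assumes the RFC in question is RFC 7686, which asks applications that do not implement Tor to generate an error for .onion names rather than attempt a DNS lookup. The names is_onion_name, route_host and OnionNotSupported are hypothetical.

```python
class OnionNotSupported(Exception):
    """Raised when a .onion name is used without Tor support (hypothetical)."""


def is_onion_name(host: str) -> bool:
    # Compare the rightmost label case-insensitively, ignoring a trailing dot.
    labels = host.rstrip(".").lower().split(".")
    return labels[-1] == "onion"


def route_host(host: str, tor_supported: bool = False) -> str:
    """Decide which resolution path a host should take."""
    if is_onion_name(host):
        if tor_supported:
            return "tor"  # hand the name to the Tor implementation
        # No Tor here: fail instead of leaking the name into DNS.
        raise OnionNotSupported(host)
    return "resolver"  # everything else goes to the ordinary resolver
```

The check itself is tiny; the funny part is that every HTTP library has to carry it, whether or not it has ever heard of Tor.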

Cheers
-- 
  Matthew Kerwin
  http://matthew.kerwin.net.au/

Received on Monday, 30 November 2015 09:52:40 UTC