Re: Adding user@ to HTTP[S] URIs from Amos Jeffries on 2020-01-25 (ietf-http-wg@w3.org from January to March 2020)

From: Amos Jeffries <squid3@treenet.co.nz>
Date: Sun, 26 Jan 2020 01:38:55 +1300
To: ietf-http-wg@w3.org
Message-ID: <0bb7f153-57ea-7cb4-59e2-26ee2e41d928@treenet.co.nz>
On 26/01/20 12:02 am, Rick van Rein wrote:
> Hi Michael,
> 
> Thanks for your positive response.
> 
>>> Most protocols support users under domain names, but HTTP does not.
>>

This is incorrect. userinfo@ is part of the generic URI format, so it
must still be supported by http:// URL parsers. The difference is that
the client is expected to map the section into data sent in another way.

A client is forbidden from *sending* the userinfo@ on-wire.
Differences between remote parser implementations are where the
problems start occurring.
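
As a rough illustration of that mapping, here is a minimal Python sketch
that separates the userinfo@ section from the URL a client would put
on-wire. The function name split_userinfo is made up for this example,
and it assumes Python's standard urllib.parse is an acceptable stand-in
for a real client's URL parser.

```python
from urllib.parse import urlsplit, urlunsplit


def split_userinfo(url):
    """Return (userinfo, on_wire_url) for an absolute URL.

    Hypothetical helper for illustration only: it shows the parsing
    step, not how any particular client handles the userinfo value.
    """
    parts = urlsplit(url)
    # netloc looks like "userinfo@host:port"; the last "@" is the delimiter.
    userinfo, sep, hostport = parts.netloc.rpartition("@")
    # Rebuild the URL without userinfo. The fragment is dropped as well,
    # since fragments are not sent on-wire either.
    on_wire = urlunsplit((parts.scheme, hostport, parts.path, parts.query, ""))
    return (userinfo if sep else None), on_wire


if __name__ == "__main__":
    print(split_userinfo("http://example.com@localhost.local/"))
```

What the client then does with the extracted value (discard it, use it
for authentication, etc.) is a separate decision from the parsing step.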


>> Well, it *does* support users within the "path" part of the URL.  For instance, here's a page I just made for you, that's scoped to my user account:
>>
>>     https://invisible.college/@toomim/hello-rick
> 
> These patterns are common, examples below, and that's why I believe that
> we should support mapping users into the HTTP space.  It is useful if
> the pattern can be consistent among servers, and in comparison with
> other protocols, I think.  HTTP is missing that part of URL syntax.
> 
> Having a place to specify user name syntax and semantics is a good
> example.  This can help to squash numerous attacks that may be tried
> with the generic path-based format that you are showing.  We can then
> restrict the grammar to that of a utf8-username in RFC 7542 and thus
> exclude spaces, ":" and "@" and other junk and have it enforced (!) at
> the HTTP level instead of in scripted applications of varying quality.
> 

Those rules already exist for the initial userinfo@ section. The core of
all these issues is that they are not being obeyed by software in reality.

Also please be aware that HTTP versions 2 and later do not send their
URL as a single text string. It is broken down into a set of fields sent
in headers. So the parsing-related issues are much less of a concern
there. However, that also means that any new URI part must be sent as a
header anyway - which is no different in HTTP than pulling the username
from a (Proxy-)Authorization header.
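
To make the field breakdown concrete, here is a small Python sketch of
how a URL could be turned into the HTTP/2 request pseudo-header fields
(:method, :scheme, :authority, :path). The function name
h2_pseudo_headers is invented for this example and no real HTTP/2
library is involved; it only shows that the userinfo@ section has no
field of its own and is left out of :authority.

```python
from urllib.parse import urlsplit


def h2_pseudo_headers(method, url):
    """Sketch: map an http(s) URL onto HTTP/2 request pseudo-headers.

    Illustrative only; a real client also handles CONNECT, "*" request
    targets, header ordering rules, and so on.
    """
    parts = urlsplit(url)
    # :authority carries host[:port] only; userinfo is not included.
    authority = parts.netloc.rpartition("@")[2]
    # An empty path is sent as "/".
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return [
        (":method", method),
        (":scheme", parts.scheme),
        (":authority", authority),
        (":path", path),
    ]


if __name__ == "__main__":
    for name, value in h2_pseudo_headers("GET", "https://rick@vanrein.org"):
        print(name, value)
```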



>>> Usage patterns in the wild do suggest a desire to have this facility.
>>
>> I didn't see any example usage patterns in the internet draft.  Can you provide some of them, so that we know what we are working with?
> 
> There are many examples of the URL-mapped form like you proposed, and
> they seem to be telling that people (or groups) want to represent their
> online identity in an HTTP URL.  They cannot be interpreted as user
> names, and code to access it ends up with in-situ coding.
> 
> Conventionally structured mapping,
>  https://www.cabrillo.edu/~rnolthenius/
> 
> Site-specific structure,
>  https://nlnet.nl/people/leenaars.html
>  https://people.utwente.nl/m.vankeulen
>  https://www.facebook.com/dssvtartaros/
> 
> Unstructured mappings,
>  http://catb.org/esr/
>  http://rick.vanrein.org
> 
> These could be consistently represented as
>  https://rnothenius@www.cabrillo.edu
>  https://leenaars@nlnet.nl
>  https://m.vankeulen@people.utwente.nl
>  https://dssvtartaros@www.facebook.com
>  http://esr@catb.org/esr
>  http://rick@vanrein.org
> 

These are valid URLs today - all have the same value:

1) the typically seen absolute URL:

   http://example.com@localhost.local/

2) path delimiter is optional for empty paths

   http://example.com@localhost.local

3) a relative hostname is valid in some DNS domains

   http://example.com@localhost

4) a dotless ccTLD is also a valid hostname

   http://example.com@local

5) on-wire URLs:

   http://local
   http://localhost
   http://localhost.local




> I pioneered this idea with a crude hack based on Basic authentication,
> which is highly inconsistent across browsers because Basic and Digest
> have always misinterpreted the URL userinfo as authentication names,

That is not a misinterpretation. The content is site-specific. If a site
has URLs with userinfo@ *and* challenges for authentication credentials,
then it is reasonable for the field to be interpreted as the answer to
the challenge.
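
As a sketch of what that interpretation can look like on the client
side, the Python below builds a Basic Authorization header value from
the userinfo@ section of a URL. The helper name basic_auth_from_userinfo
is hypothetical, and this is only one possible behaviour, assumed for
illustration; it is not a description of what any specific browser does.

```python
import base64
from urllib.parse import urlsplit, unquote


def basic_auth_from_userinfo(url):
    """Sketch: derive a Basic credentials string from URL userinfo.

    Returns None when the URL has no userinfo section. A missing
    password is treated as empty, which is an assumption of this
    example rather than a requirement.
    """
    parts = urlsplit(url)
    if parts.username is None:
        return None
    # userinfo may be percent-encoded in the URL; decode before use.
    user = unquote(parts.username)
    password = unquote(parts.password or "")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


if __name__ == "__main__":
    # Value a client could send in Authorization after a 401 challenge.
    print(basic_auth_from_userinfo("http://rick@vanrein.org"))
```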


Amos
Received on Saturday, 25 January 2020 12:40:32 UTC