Re: [url] Requests for Feedback (was Feedback from TPAC)

On 12/24/2014 11:47 AM, Roy T. Fielding wrote:
> On Dec 23, 2014, at 11:47 AM, Sam Ruby <rubys@intertwingly.net>
> wrote:
>>
>>> On 12/23/2014 02:07 PM, Mark Nottingham wrote:
>>>
>>> At first glance, it appears like a lot of the valid URI/invalid
>>> URL outcomes are because url LS is doing scheme-specific
>>> processing; is that the case? (Currently working with limited net
>>> access + heavy jet lag)
>>
>> That certainly explains a number of differences.  Additionally:
>>
>> 1) There are cases that ABNF can't capture.  I tend to agree with
>> Julian[1] that the ABNF should be treated as rough syntax only, and
>> that additional constraints should be specified in prose.  That's
>> effectively how the webplatform URL draft is structured[2].
>>
>> 2) The URL LS is IDNA and Unicode more aware than RFC 3986 is.
>> Clearly, this is by design, but I will suggest that there is an
>> important lesson to be learned by the effort to split out RFC 3987
>> into a separate RFC: I think that unintentionally had the effect of
>> "ghettoizing" IRIs.  I might be misreading Martin, but perhaps
>> that's why he suggested RFC 3986 errata as the way to handle
>> bidi?[3]
>
> No, it has to be understood that RFC3986 defines the set of addresses
> that are universally interoperable. IDNA is not INTEROPERABLE except
> in its punycode form.

It is not clear to me what you are saying 'No' to.  Even ASCII-only URIs
have never been universally INTEROPERABLE (just curious: why are we
shouting here?).  I have data that demonstrates considerable IDNA
interoperability, though clearly not universal interoperability.
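
To make "punycode form" concrete, here is a minimal sketch. It assumes
Python's built-in "idna" codec, which implements the older IDNA 2003
rules, so other implementations may map some names differently. The host
name is made up for illustration.

```python
# Hypothetical host name, chosen only for illustration.
host = "b\u00fccher.example"

# Convert the Unicode host name to its ASCII-compatible (punycode) form.
ace = host.encode("idna").decode("ascii")
print(ace)

# Convert the ASCII form back to Unicode to confirm it round-trips.
roundtrip = ace.encode("ascii").decode("idna")
print(roundtrip == host)
```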

I'm personally willing to settle for "rough consensus and running code".

> This is an entirely different problem than parsing arbitrary
> references so that they can be transformed into a URL, just as it is
> an entirely different problem to define the URL DOM API. Neither of
> those would have made IETF Standard because there was no single
> agreement on what to do. The best we could do was an appendix.

I'm inclined to believe that the amount of consensus and running code 
may be different in 2015 than it was in 2005.

> The problem with RFC3987 was that it tried to define a new addressing
> format instead of simply defining an arbitrary reference and how to
> get from there to an interoperable URI. It did not work because it
> wasn't written to handle arbitrary input and could not keep up with
> changes in IDNA.
>
> As I said when this ruckus started years ago, all that HTML needs is
> a specification for how to parse references and another for how to
> fill the URL DOM. Those are HTML concerns. The notion that 3986 had
> to be replaced is nothing more than ignorance combined with the
> arrogant way that HTML5 has been allowed to piss all over the rest of
> Web standards.

It looks to me that you are allowing your emotions to cloud your 
judgment here.

I have data[1] that shows that ASCII-only URIs that are valid per RFC 3986
are not fully interoperable today.  I am working on conformance rules and
new parsing rules that better match implementations.  I am looking not just
at browsers, but at a variety of libraries.  I welcome contributions of
scripts and programs that explore even more libraries.
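
A minimal sketch of the kind of script meant here, assuming Python's
standard urllib.parse as the library under test. The sample inputs and
the probe() helper are hypothetical and are not taken from the test
results; a real contribution would feed in the shared test data instead.

```python
from urllib.parse import urlsplit

# Hypothetical sample inputs; not taken from the interop test data.
SAMPLES = [
    "http://example.com/a/b?q=1#frag",
    "HTTP://EXAMPLE.COM:80/",
    "http://user@example.com:8080/path",
]


def probe(url):
    """Return the components urllib.parse reports, or the error it raises."""
    try:
        parts = urlsplit(url)
        return {
            "scheme": parts.scheme,
            "host": parts.hostname,
            "port": parts.port,  # may raise ValueError for a bad port
            "path": parts.path,
            "query": parts.query,
            "fragment": parts.fragment,
        }
    except ValueError as exc:
        return {"error": str(exc)}


# Collect one record per input so results can be compared across libraries.
results = {url: probe(url) for url in SAMPLES}
for url, result in results.items():
    print(url, "->", result)
```

Running the same inputs through other libraries and comparing the records
field by field is one way to surface the differences discussed above.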

It might be possible for us to split that effort into two parts.  One
part would be either an erratum for RFC 3986 or an RFC3986bis.  The other
would be layered on top of that.

If it turns out that this doesn't happen for whatever reason (technical or
political, it matters not), then the URL standard will simply be a more
up-to-date and better description of how things actually work.

> ....Roy

- Sam Ruby

[1] https://url.spec.whatwg.org/interop/test-results/?filter=valid

Received on Wednesday, 24 December 2014 17:06:28 UTC