Re: [url] Requests for Feedback (was Feedback from TPAC)

On 2014-12-22 14:43, Sam Ruby wrote:
> On 12/22/2014 08:20 AM, Julian Reschke wrote:
>> On 2014-12-22 14:04, Sam Ruby wrote:
>>> On 12/22/2014 03:29 AM, Julian Reschke wrote:
>>>> On 2014-12-21 22:10, Sam Ruby wrote:
>>>>> ...
>>>>> I'll simply make the observation that, unless there is some movement
>>>>> at the IETF, the risks will only increase over time.
>>>>>
>>>>> This is NOT an ultimatum.  There isn't a point in time where a
>>>>> go/no-go decision needs to be made.  But given the lack of
>>>>> demonstrable
>>>>> progress in the last 90 or so days, I would suggest that there be a
>>>>> cause for concern.
>>>>> ...
>>>>
>>>> Sam, if you want to see something happen inside the IETF, the right
>>>> thing to do is to start that work inside the IETF. And if you believe
>>>> that something is incorrect in RFC 3986, the best way to make progress
>>>> is to actually state what's wrong. And again, what's mostly interesting
>>>> is not what RFC 3986 does *not* say (such as handling broken references
>>>> from markup etc), but what it *does* say and gets wrong.
>>>
>>> What's interesting to different people varies.
>>
>> It's interesting for everybody who has to decide whether it's time to
>> update RFC 3986 or not.
>>
>>> A concrete example of a problem with RFC 3986 is the lack of addressing
>>> IDNA processing.
>>
>> How is IDNA processing relevant to URIs (being restricted to US-ASCII)?
>>
>>> It is entirely possible that handling of broken references can be
>>> handled outside of RFC 3986.  Other changes (IDNA, UTF-8, interop issues
>>> on valid URIs) are best handled either as updates to RFC 3986 or in a
>>> spec that replaces it.
>>
>> Of these points, only one is relevant to RFC 3986 (interop issues on
>> valid URIs).
>>
>>> I encourage you to go to this page:
>>>
>>> https://url.spec.whatwg.org/interop/test-results/
>>>
>>> Select the option to show only valid inputs, and then propose specific
>>> changes.  Note: that input could very well be to mark a number of these
>>> inputs as invalid.
>>
>> I looked at test 0, it's labelled valid while it's invalid.
>>
>> Same for tests 3, 4, 5, 6, 12, and likely many more.
>>
>> Validity according to RFC 3986 can be mechanically checked; why do we
>> need to "mark" something here?
>
> If there is a program I can use to mechanically check for RFC 3986
> compliance and shows how a given URI is to be interpreted (scheme, host,
> path, query, fragment, etc.), I'll gladly update my results.

RFC 3986 has a regexp that's expected to parse valid URIs consistent 
with the ABNF; see 
<http://greenbytes.de/tech/webdav/rfc3986.html#rfc.section.B>.

To change that to a validity checker, we probably just need to restrict 
the character classes so that non-ASCII characters never match.
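
As a rough illustration of that idea, here is a minimal Python sketch. The regular expression is the one given in RFC 3986, Appendix B; the function names (split_uri, is_ascii_only) are hypothetical. The ASCII test is only an approximation of "restricting the character classes": it rejects non-ASCII input up front, and it is still looser than the full ABNF (for example, a space would still pass).

```python
import re

# Regular expression from RFC 3986, Appendix B. Every part is optional,
# so it matches any string and only splits it into components.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(reference):
    """Split a URI reference into the five components named in Appendix B."""
    m = URI_REFERENCE.match(reference)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


def is_ascii_only(reference):
    """Assumed first validity step: reject any non-US-ASCII character.

    This is NOT a complete RFC 3986 validity check; it only covers the
    non-ASCII restriction discussed above.
    """
    return all(ord(ch) < 128 for ch in reference)


if __name__ == "__main__":
    # Example URI taken from RFC 3986, Appendix B.
    print(split_uri("http://www.ics.uci.edu/pub/ietf/uri/#Related"))
    print(is_ascii_only("http://www.ics.uci.edu/pub/ietf/uri/#Related"))
```

A complete checker would additionally have to validate each component against the ABNF productions (scheme, authority, path, and so on), which this sketch does not attempt.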

(I can give this a try over the next week.)

Best regards, Julian

Received on Monday, 22 December 2014 13:51:19 UTC