- From: Mark Nottingham <mnot@mnot.net>
- Date: Thu, 25 Dec 2014 05:00:41 -0500
- To: Sam Ruby <rubys@intertwingly.net>
- Cc: "public-ietf-w3c@w3.org" <public-ietf-w3c@w3.org>, "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, "Julian F. Reschke" <julian.reschke@gmx.de>
I’ve added http_URI and https_URI, sourced from RFC7230, to filter out some of these false positives.

https://gist.github.com/mnot/138549

> On 23 Dec 2014, at 2:47 pm, Sam Ruby <rubys@intertwingly.net> wrote:
>
> On 12/23/2014 02:07 PM, Mark Nottingham wrote:
>>
>> At first glance, it appears like a lot of the valid URI/invalid URL
>> outcomes are because url LS is doing scheme-specific processing; is
>> that the case? (Currently working with limited net access + heavy jet
>> lag)
>
> That certainly explains a number of differences. Additionally:
>
> 1) There are cases that ABNF can't capture. I tend to agree with Julian[1] that the ABNF should be treated as rough syntax only, and that additional constraints should be specified in prose. That's effectively how the webplatform URL draft is structured[2].
>
> 2) The URL LS is IDNA and Unicode more aware than RFC 3986 is. Clearly, this is by design, but I will suggest that there is an important lesson to be learned by the effort to split out RFC 3987 into a separate RFC: I think that unintentionally had the effect of "ghettoizing" IRIs. I might be misreading Martin, but perhaps that's why he suggested RFC 3986 errata as the way to handle bidi?[3]
>
> - Sam Ruby
>
> [1] http://lists.w3.org/Archives/Public/public-ietf-w3c/2014Dec/0079.html
> [2] https://specs.webplatform.org/url/webspecs/develop/#parsing-rules
> [3] http://www.ietf.org/mail-archive/web/apps-discuss/current/msg13516.html

--
Mark Nottingham   http://www.mnot.net/
Received on Thursday, 25 December 2014 10:01:09 UTC