- From: Mark Nottingham <mnot@mnot.net>
- Date: Tue, 23 Dec 2014 14:07:52 -0500
- To: Sam Ruby <rubys@intertwingly.net>
- Cc: Bjoern Hoehrmann <derhoermi@gmx.net>, "public-ietf-w3c@w3.org" <public-ietf-w3c@w3.org>
Interesting. Happy to update where there are mismatches to the abnf, of course.

At first glance, it appears like a lot of the valid URI/invalid URL outcomes are because url LS is doing scheme-specific processing; is that the case?

(Currently working with limited net access + heavy jet lag)

Sent from my iPhone

> On 23 Dec 2014, at 1:48 pm, Sam Ruby <rubys@intertwingly.net> wrote:
>
>> On 12/23/2014 09:42 AM, Bjoern Hoehrmann wrote:
>> * Sam Ruby wrote:
>>>> On 12/22/2014 06:36 PM, Mark Nottingham wrote:
>>>> See also
>>>> https://gist.github.com/mnot/138549
>>>
>>> Thanks!
>>>
>>> Here's a result I didn't expect:
>>>
>>>> python uri_validate.py "http://user:password@example.com/"
>>>> testing: "http://user:password@example.com/"
>>>> URI: no
>>>> URI reference: no
>>>> Absolute URI: no
>>>
>>> Am I doing something wrong?
>>
>> Given
>>
>>   # reg-name = *( unreserved / pct-encoded / sub-delims )
>>   reg_name = r"(?: %(unreserved)s | %(pct_encoded)s | %(sub_delims)s )*" % locals()
>>
>>   # userinfo = *( unreserved / pct-encoded / sub-delims / ":" )
>>   userinfo = r"(?: %(unreserved)s | %(pct_encoded)s | %(sub_delims)s | : )" % locals()
>>
>> I would say there is a missing `*` at the end of `userinfo`.
>
> Thanks!
>
> I believe I found a second problem with the following:
>
>> # IPv6address =                            6( h16 ":" ) ls32
>> #             /                       "::" 5( h16 ":" ) ls32
>> #             / [               h16 ] "::" 4( h16 ":" ) ls32
>> #             / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
>> #             / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
>> #             / [ *3( h16 ":" ) h16 ] "::"    h16 ":"   ls32
>> #             / [ *4( h16 ":" ) h16 ] "::"              ls32
>> #             / [ *5( h16 ":" ) h16 ] "::"              h16
>> #             / [ *6( h16 ":" ) h16 ] "::"
>> IPv6address = r"""(?:                                (?: %(h16)s : ){6} %(ls32)s |
>>                                                   :: (?: %(h16)s : ){5} %(ls32)s |
>>                                          %(h16)s :: (?: %(h16)s : ){4} %(ls32)s |
>>                       (?: %(h16)s : )    %(h16)s :: (?: %(h16)s : ){3} %(ls32)s |
>>                       (?: %(h16)s : ){2} %(h16)s :: (?: %(h16)s : ){2} %(ls32)s |
>>                       (?: %(h16)s : ){3} %(h16)s ::     %(h16)s :      %(ls32)s |
>>                       (?: %(h16)s : ){4} %(h16)s ::                    %(ls32)s |
>>                       (?: %(h16)s : ){5} %(h16)s ::                    %(h16)s  |
>>                       (?: %(h16)s : ){6} %(h16)s ::
>>                   )
>> """ % locals()
>
> I believe the *n notation in ABNF corresponds to {0,n} in regular expressions. A corrected version:
>
>> IPv6address = r"""(?:                                  (?: %(h16)s : ){6} %(ls32)s |
>>                                                     :: (?: %(h16)s : ){5} %(ls32)s |
>>                                            %(h16)s :: (?: %(h16)s : ){4} %(ls32)s |
>>                       (?: %(h16)s : ){0,1} %(h16)s :: (?: %(h16)s : ){3} %(ls32)s |
>>                       (?: %(h16)s : ){0,2} %(h16)s :: (?: %(h16)s : ){2} %(ls32)s |
>>                       (?: %(h16)s : ){0,3} %(h16)s ::     %(h16)s :      %(ls32)s |
>>                       (?: %(h16)s : ){0,4} %(h16)s ::                    %(ls32)s |
>>                       (?: %(h16)s : ){0,5} %(h16)s ::                    %(h16)s  |
>>                       (?: %(h16)s : ){0,6} %(h16)s ::
>>                   )
>> """ % locals()
>
> I've made these changes and updated the results:
>
> http://intertwingly.net/tmp/urlvsuri.html
>
> Looking at the first section, I believe that the URL Standard should change so that the conformance criteria for URLs becomes a strict subset of the URI validity criteria. Accordingly, I've opened the following bug report:
>
> https://www.w3.org/Bugs/Public/show_bug.cgi?id=27687
>
> - Sam Ruby
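
As an illustration of the ABNF-to-regex translation discussed in this thread, here is a minimal Python sketch that builds the IPv6address pattern from the RFC 3986 grammar. It is not the code from the gist: the names `prefix`, `IPV6_RE`, and `is_ipv6` are hypothetical. Besides mapping `*n` to `{0,n}`, it assumes that an ABNF group in square brackets is optional as a whole, so each `[ *n( h16 ":" ) h16 ]` prefix is wrapped in `(?: ... )?`.

```python
import re

# Building blocks, following the ABNF in RFC 3986 section 3.2.2.
dec_octet = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])"
ipv4 = r"(?:%s\.%s\.%s\.%s)" % ((dec_octet,) * 4)
h16 = r"(?:[0-9A-Fa-f]{1,4})"
ls32 = r"(?:%s:%s|%s)" % (h16, h16, ipv4)


def prefix(n):
    """Hypothetical helper: regex for the ABNF group [ *n( h16 ":" ) h16 ].

    *n becomes {0,n}, and the square brackets make the whole group optional.
    """
    if n == 0:
        return r"(?:%s)?" % h16
    return r"(?:(?:%s:){0,%d}%s)?" % (h16, n, h16)


# The first two alternatives have no optional prefix.
alternatives = [
    r"(?:%s:){6}%s" % (h16, ls32),
    r"::(?:%s:){5}%s" % (h16, ls32),
]
# [ *n( h16 ":" ) h16 ] "::" k( h16 ":" ) ls32   for n = 0..4, k = 4..0
for n, k in enumerate([4, 3, 2, 1, 0]):
    alternatives.append(prefix(n) + "::" + (h16 + ":") * k + ls32)
# The last two alternatives end in a single h16 or in nothing.
alternatives.append(prefix(5) + "::" + h16)
alternatives.append(prefix(6) + "::")

IPV6_RE = re.compile("(?:%s)" % "|".join(alternatives))


def is_ipv6(text):
    """Return True if the whole string matches the IPv6address production."""
    return IPV6_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["::1", "::", "2001:db8::1", "1:2:3:4:5:6:7:8",
                      "::ffff:192.0.2.1", "1::2::3", "12345::"]:
        print(candidate, is_ipv6(candidate))
```

Under that assumption `::1` and `::` both match. A pattern that keeps the `h16` before `::` mandatory would, as far as I can tell, reject both.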
Received on Tuesday, 23 December 2014 19:08:25 UTC