- From: Sam Ruby <rubys@intertwingly.net>
- Date: Tue, 23 Dec 2014 13:48:18 -0500
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- CC: Mark Nottingham <mnot@mnot.net>, "public-ietf-w3c@w3.org" <public-ietf-w3c@w3.org>
On 12/23/2014 09:42 AM, Bjoern Hoehrmann wrote:
> * Sam Ruby wrote:
>> On 12/22/2014 06:36 PM, Mark Nottingham wrote:
>>> See also
>>> https://gist.github.com/mnot/138549
>>
>> Thanks!
>>
>> Here's a result I didn't expect:
>>
>>> python uri_validate.py "http://user:password@example.com/"
>>> testing: "http://user:password@example.com/"
>>> URI: no
>>> URI reference: no
>>> Absolute URI: no
>>
>> Am I doing something wrong?
>
> Given
>
> # reg-name = *( unreserved / pct-encoded / sub-delims )
> reg_name = r"(?: %(unreserved)s | %(pct_encoded)s | %(sub_delims)s )*" % locals()
>
> # userinfo = *( unreserved / pct-encoded / sub-delims / ":" )
> userinfo = r"(?: %(unreserved)s | %(pct_encoded)s | %(sub_delims)s | : )" % locals()
>
> I would say there is a missing `*` at the end of `userinfo`.
Thanks!
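For anyone following along, here is a minimal sketch of why the missing `*` makes the validator reject the URL above. The character classes are simplified stand-ins written for this illustration, not the definitions from the gist, and `full_match` is a hypothetical helper; an explicit `parts` dict is used in place of `locals()`.

```python
import re

# Simplified character classes, assumed for illustration (RFC 3986 section 2).
parts = {
    "unreserved": r"[A-Za-z0-9\-._~]",
    "pct_encoded": r"%[0-9A-Fa-f]{2}",
    "sub_delims": r"[!$&'()*+,;=]",
}

# Without the trailing "*", the group matches exactly one character.
userinfo_buggy = r"(?: %(unreserved)s | %(pct_encoded)s | %(sub_delims)s | : )" % parts

# With the "*", it matches zero or more characters, as the ABNF requires.
userinfo_fixed = r"(?: %(unreserved)s | %(pct_encoded)s | %(sub_delims)s | : )*" % parts


def full_match(pattern, text):
    """Hypothetical helper: True if the verbose-mode pattern matches all of text."""
    return re.match(r"^(?:%s)$" % pattern, text, re.VERBOSE) is not None


print(full_match(userinfo_buggy, "user:password"))
print(full_match(userinfo_fixed, "user:password"))
```

With the single-character version, any userinfo longer than one character fails, which is consistent with the "no" results shown earlier.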
I believe I found a second problem with the following:
> # IPv6address = 6( h16 ":" ) ls32
> # / "::" 5( h16 ":" ) ls32
> # / [ h16 ] "::" 4( h16 ":" ) ls32
> # / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
> # / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
> # / [ *3( h16 ":" ) h16 ] "::" h16 ":" ls32
> # / [ *4( h16 ":" ) h16 ] "::" ls32
> # / [ *5( h16 ":" ) h16 ] "::" h16
> # / [ *6( h16 ":" ) h16 ] "::"
> IPv6address = r"""(?: (?: %(h16)s : ){6} %(ls32)s |
> :: (?: %(h16)s : ){5} %(ls32)s |
> %(h16)s :: (?: %(h16)s : ){4} %(ls32)s |
> (?: %(h16)s : ) %(h16)s :: (?: %(h16)s : ){3} %(ls32)s |
> (?: %(h16)s : ){2} %(h16)s :: (?: %(h16)s : ){2} %(ls32)s |
> (?: %(h16)s : ){3} %(h16)s :: %(h16)s : %(ls32)s |
> (?: %(h16)s : ){4} %(h16)s :: %(ls32)s |
> (?: %(h16)s : ){5} %(h16)s :: %(h16)s |
> (?: %(h16)s : ){6} %(h16)s ::
> )
> """ % locals()
I believe the *n notation in ABNF corresponds to {0,n} in regular
expressions, and the square brackets make the enclosed prefix optional,
which corresponds to wrapping it in (?: ... )?. A corrected version:
> IPv6address = r"""(?: (?: %(h16)s : ){6} %(ls32)s |
> :: (?: %(h16)s : ){5} %(ls32)s |
> (?: %(h16)s )? :: (?: %(h16)s : ){4} %(ls32)s |
> (?: (?: %(h16)s : ){0,1} %(h16)s )? :: (?: %(h16)s : ){3} %(ls32)s |
> (?: (?: %(h16)s : ){0,2} %(h16)s )? :: (?: %(h16)s : ){2} %(ls32)s |
> (?: (?: %(h16)s : ){0,3} %(h16)s )? :: %(h16)s : %(ls32)s |
> (?: (?: %(h16)s : ){0,4} %(h16)s )? :: %(ls32)s |
> (?: (?: %(h16)s : ){0,5} %(h16)s )? :: %(h16)s |
> (?: (?: %(h16)s : ){0,6} %(h16)s )? ::
> )
> """ % locals()
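To check what each piece of the ABNF translation contributes, here is a rough sketch that tests a single alternative, `[ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32`, in three forms: a literal single repetition, a `{0,1}` bound, and a `{0,1}` bound inside an optional group (ABNF square brackets mark an optional element). The `h16` and `ls32` definitions are simplified assumptions (the IPv4 form of `ls32` is left out), and `accepts` is a hypothetical helper.

```python
import re

# Simplified stand-ins, assumed for this sketch only.
h16 = r"[0-9A-Fa-f]{1,4}"
ls32 = r"(?: %s : %s )" % (h16, h16)  # IPv4 form of ls32 omitted
parts = {"h16": h16, "ls32": ls32}

# Shared right-hand side: "::" 3( h16 ":" ) ls32
tail = r":: (?: %(h16)s : ){3} %(ls32)s" % parts

# Exactly one "h16 :" before the final h16 (two groups required before "::").
literal = r"(?: %(h16)s : ) %(h16)s " % parts + tail

# *1( h16 ":" ) as {0,1}: one or two groups before "::".
bounded = r"(?: %(h16)s : ){0,1} %(h16)s " % parts + tail

# Same, with the [ ... ] prefix optional: zero, one, or two groups before "::".
optional = r"(?: (?: %(h16)s : ){0,1} %(h16)s )? " % parts + tail


def accepts(pattern, text):
    """Hypothetical helper: True if the verbose-mode pattern matches all of text."""
    return re.match(r"^(?:%s)$" % pattern, text, re.VERBOSE) is not None


for candidate in ("1:2::3:4:5:6:7", "1::2:3:4:5:6", "::2:3:4:5:6"):
    print(candidate, accepts(literal, candidate),
          accepts(bounded, candidate), accepts(optional, candidate))
```

Only the third form accepts an address that starts with `::`, so the optional group matters for inputs such as `::1` in the full grammar.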
I've made these changes and updated the results:
http://intertwingly.net/tmp/urlvsuri.html
Looking at the first section, I believe that the URL Standard should
change so that the conformance criteria for URLs become a strict subset
of the URI validity criteria. Accordingly, I've opened the following
bug report:
https://www.w3.org/Bugs/Public/show_bug.cgi?id=27687
- Sam Ruby
Received on Tuesday, 23 December 2014 18:48:46 UTC