Re: [url] Requests for Feedback (was Feedback from TPAC)

On 12/23/2014 09:42 AM, Bjoern Hoehrmann wrote:
> * Sam Ruby wrote:
>> On 12/22/2014 06:36 PM, Mark Nottingham wrote:
>>> See also
>>>     https://gist.github.com/mnot/138549
>>
>> Thanks!
>>
>> Here's a result I didn't expect:
>>
>>> python uri_validate.py "http://user:password@example.com/"
>>> testing: "http://user:password@example.com/"
>>> URI: no
>>> URI reference: no
>>> Absolute URI: no
>>
>> Am I doing something wrong?
>
> Given
>
>    #   reg-name      = *( unreserved / pct-encoded / sub-delims )
>    reg_name = r"(?: %(unreserved)s | %(pct_encoded)s | %(sub_delims)s )*" % locals()
>
>    #   userinfo      = *( unreserved / pct-encoded / sub-delims / ":" )
>    userinfo = r"(?: %(unreserved)s | %(pct_encoded)s | %(sub_delims)s | : )" % locals()
>
> I would say there is a missing `*` at the end of `userinfo`.

Thanks!
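
A minimal sketch of the effect of that missing `*`, assuming simplified stand-ins for the RFC 3986 character classes (they are not copied from the gist, and the `matches` helper is hypothetical). Without the `*`, the pattern accepts exactly one character or one percent-encoded triplet, so a userinfo such as "user:password" cannot match:

```python
import re

# Simplified stand-ins for the RFC 3986 character classes (assumed here,
# not copied from the gist).
unreserved = r"[A-Za-z0-9._~-]"
pct_encoded = r"%[0-9A-Fa-f]{2}"
sub_delims = r"[!$&'()*+,;=]"

# As quoted above: no trailing '*', so only a single item is accepted.
userinfo_buggy = r"(?: %(unreserved)s | %(pct_encoded)s | %(sub_delims)s | : )" % locals()

# With the '*' restored, mirroring the ABNF "*( ... )".
userinfo_fixed = r"(?: %(unreserved)s | %(pct_encoded)s | %(sub_delims)s | : )*" % locals()


def matches(pattern, text):
    """Hypothetical helper: True if the whole text matches the verbose pattern."""
    return re.fullmatch(pattern, text, re.VERBOSE) is not None


print(matches(userinfo_buggy, "user:password"))  # False
print(matches(userinfo_fixed, "user:password"))  # True
```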

I believe I found a second problem with the following:

> #   IPv6address   =                            6( h16 ":" ) ls32
> #                 /                       "::" 5( h16 ":" ) ls32
> #                 / [               h16 ] "::" 4( h16 ":" ) ls32
> #                 / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
> #                 / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
> #                 / [ *3( h16 ":" ) h16 ] "::"    h16 ":"   ls32
> #                 / [ *4( h16 ":" ) h16 ] "::"              ls32
> #                 / [ *5( h16 ":" ) h16 ] "::"              h16
> #                 / [ *6( h16 ":" ) h16 ] "::"
> IPv6address = r"""(?:                                  (?: %(h16)s : ){6} %(ls32)s |
>                                                     :: (?: %(h16)s : ){5} %(ls32)s |
>                                             %(h16)s :: (?: %(h16)s : ){4} %(ls32)s |
>                          (?: %(h16)s : )    %(h16)s :: (?: %(h16)s : ){3} %(ls32)s |
>                          (?: %(h16)s : ){2} %(h16)s :: (?: %(h16)s : ){2} %(ls32)s |
>                          (?: %(h16)s : ){3} %(h16)s ::     %(h16)s :      %(ls32)s |
>                          (?: %(h16)s : ){4} %(h16)s ::                    %(ls32)s |
>                          (?: %(h16)s : ){5} %(h16)s ::                    %(h16)s  |
>                          (?: %(h16)s : ){6} %(h16)s ::
>                   )
> """ % locals()

I believe the *n notation in ABNF corresponds to {0,n} in regular 
expressions, whereas the code above uses {n}, which requires exactly n 
repetitions.  A corrected version:

> IPv6address = r"""(?:                                    (?: %(h16)s : ){6} %(ls32)s |
>                                                       :: (?: %(h16)s : ){5} %(ls32)s |
>                                               %(h16)s :: (?: %(h16)s : ){4} %(ls32)s |
>                          (?: %(h16)s : ){0,1} %(h16)s :: (?: %(h16)s : ){3} %(ls32)s |
>                          (?: %(h16)s : ){0,2} %(h16)s :: (?: %(h16)s : ){2} %(ls32)s |
>                          (?: %(h16)s : ){0,3} %(h16)s ::     %(h16)s :      %(ls32)s |
>                          (?: %(h16)s : ){0,4} %(h16)s ::                    %(ls32)s |
>                          (?: %(h16)s : ){0,5} %(h16)s ::                    %(h16)s  |
>                          (?: %(h16)s : ){0,6} %(h16)s ::
>                   )
> """ % locals()
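
The difference between {n} and {0,n} can be sketched on a single row, `[ *2( h16 ":" ) h16 ] "::"`, with the tail after "::" left out for brevity. This is an illustration under stated assumptions, not code from the gist: `h16` is a simplified stand-in, the `matches` helper is hypothetical, and the third variant reflects a further reading of the ABNF (square brackets mark the whole prefix as optional) that goes beyond the correction quoted above.

```python
import re

# Simplified stand-in for the RFC 3986 h16 rule (1 to 4 hex digits).
h16 = r"[0-9A-Fa-f]{1,4}"

# {2}: exactly two "h16:" groups before the final h16.
exact = r"(?: %(h16)s : ){2} %(h16)s ::" % locals()

# {0,2}: zero to two "h16:" groups, as in the corrected version.
upto = r"(?: %(h16)s : ){0,2} %(h16)s ::" % locals()

# Assumption: the enclosing [ ... ] also makes the whole prefix optional.
optional = r"(?: (?: %(h16)s : ){0,2} %(h16)s )? ::" % locals()


def matches(pattern, text):
    """Hypothetical helper: True if the whole text matches the verbose pattern."""
    return re.fullmatch(pattern, text, re.VERBOSE) is not None


for text in ("1:2:3::", "1::", "::"):
    print(text, matches(exact, text), matches(upto, text), matches(optional, text))
# 1:2:3:: True True True
# 1:: False True True
# :: False False True
```

If that reading of the brackets is right, an address whose "::" has nothing before it would still be rejected by the rows that require a leading h16.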

I've made these changes and updated the results:

http://intertwingly.net/tmp/urlvsuri.html

Looking at the first section, I believe that the URL Standard should 
change so that the conformance criteria for URLs become a strict subset 
of the URI validity criteria.  Accordingly, I've opened the following 
bug report:

https://www.w3.org/Bugs/Public/show_bug.cgi?id=27687

- Sam Ruby

Received on Tuesday, 23 December 2014 18:48:46 UTC