Re: [url] Requests for Feedback (was Feedback from TPAC)

Interesting. Happy to update where there are mismatches to the ABNF, of course.

At first glance, it looks like a lot of the valid URI/invalid URL outcomes are because the URL LS is doing scheme-specific processing; is that the case? (Currently working with limited net access + heavy jet lag)

> On 23 Dec 2014, at 1:48 pm, Sam Ruby <rubys@intertwingly.net> wrote:
> 
> 
> 
>> On 12/23/2014 09:42 AM, Bjoern Hoehrmann wrote:
>> * Sam Ruby wrote:
>>>> On 12/22/2014 06:36 PM, Mark Nottingham wrote:
>>>> See also
>>>>    https://gist.github.com/mnot/138549
>>> 
>>> Thanks!
>>> 
>>> Here's a result I didn't expect:
>>> 
>>>> python uri_validate.py "http://user:password@example.com/"
>>>> testing: "http://user:password@example.com/"
>>>> URI: no
>>>> URI reference: no
>>>> Absolute URI: no
>>> 
>>> Am I doing something wrong?
>> 
>> Given
>> 
>>   #   reg-name      = *( unreserved / pct-encoded / sub-delims )
>>   reg_name = r"(?: %(unreserved)s | %(pct_encoded)s | %(sub_delims)s )*" % locals()
>> 
>>   #   userinfo      = *( unreserved / pct-encoded / sub-delims / ":" )
>>   userinfo = r"(?: %(unreserved)s | %(pct_encoded)s | %(sub_delims)s | : )" % locals()
>> 
>> I would say there is a missing `*` at the end of `userinfo`.
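
For illustration, the effect of the missing `*` can be reproduced in isolation. This is a minimal sketch, not the gist's code: the character classes are re-derived from RFC 3986, and the names `parts`, `userinfo_as_quoted`, `userinfo_repeated` and `matches` are made up for the example. It assumes the patterns are compiled with re.VERBOSE, as the spacing in the quoted expressions suggests.

```python
import re

# Building blocks per RFC 3986 (re-derived here; not copied from the gist).
parts = {
    "unreserved": r"[A-Za-z0-9\-._~]",
    "pct_encoded": r"%[0-9A-Fa-f]{2}",
    "sub_delims": r"[!$&'()*+,;=]",
}

# As quoted above: no trailing "*", so the group matches exactly one element.
userinfo_as_quoted = (
    r"(?: %(unreserved)s | %(pct_encoded)s | %(sub_delims)s | : )" % parts
)

# With the "*" that the ABNF rule calls for: zero or more elements.
userinfo_repeated = userinfo_as_quoted + "*"


def matches(pattern, text):
    """Return True if the whole of text matches the verbose-mode pattern."""
    return re.fullmatch(pattern, text, re.VERBOSE) is not None


print(matches(userinfo_as_quoted, "user:password"))  # False
print(matches(userinfo_repeated, "user:password"))   # True
print(matches(userinfo_repeated, ""))                # True (userinfo may be empty)
```

With the one-element version, any authority whose userinfo is longer than a single character fails to match, which would be consistent with the "no" results reported for "http://user:password@example.com/".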
> 
> Thanks!
> 
> I believe I found a second problem with the following:
> 
>> #   IPv6address   =                            6( h16 ":" ) ls32
>> #                 /                       "::" 5( h16 ":" ) ls32
>> #                 / [               h16 ] "::" 4( h16 ":" ) ls32
>> #                 / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
>> #                 / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
>> #                 / [ *3( h16 ":" ) h16 ] "::"    h16 ":"   ls32
>> #                 / [ *4( h16 ":" ) h16 ] "::"              ls32
>> #                 / [ *5( h16 ":" ) h16 ] "::"              h16
>> #                 / [ *6( h16 ":" ) h16 ] "::"
>> IPv6address = r"""(?:                                  (?: %(h16)s : ){6} %(ls32)s |
>>                                                    :: (?: %(h16)s : ){5} %(ls32)s |
>>                                            %(h16)s :: (?: %(h16)s : ){4} %(ls32)s |
>>                         (?: %(h16)s : )    %(h16)s :: (?: %(h16)s : ){3} %(ls32)s |
>>                         (?: %(h16)s : ){2} %(h16)s :: (?: %(h16)s : ){2} %(ls32)s |
>>                         (?: %(h16)s : ){3} %(h16)s ::     %(h16)s :      %(ls32)s |
>>                         (?: %(h16)s : ){4} %(h16)s ::                    %(ls32)s |
>>                         (?: %(h16)s : ){5} %(h16)s ::                    %(h16)s  |
>>                         (?: %(h16)s : ){6} %(h16)s ::
>>                  )
>> """ % locals()
> 
> I believe the *n notation in ABNF corresponds to {0,n} in regular expressions.  A corrected version:
> 
>> IPv6address = r"""(?:                                    (?: %(h16)s : ){6} %(ls32)s |
>>                                                      :: (?: %(h16)s : ){5} %(ls32)s |
>>                                              %(h16)s :: (?: %(h16)s : ){4} %(ls32)s |
>>                         (?: %(h16)s : ){0,1} %(h16)s :: (?: %(h16)s : ){3} %(ls32)s |
>>                         (?: %(h16)s : ){0,2} %(h16)s :: (?: %(h16)s : ){2} %(ls32)s |
>>                         (?: %(h16)s : ){0,3} %(h16)s ::     %(h16)s :      %(ls32)s |
>>                         (?: %(h16)s : ){0,4} %(h16)s ::                    %(ls32)s |
>>                         (?: %(h16)s : ){0,5} %(h16)s ::                    %(h16)s  |
>>                         (?: %(h16)s : ){0,6} %(h16)s ::
>>                  )
>> """ % locals()
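
As a rough check of the `*n` reading, here is a sketch using a single alternative of the rule, `[ *5( h16 ":" ) h16 ] "::" h16`, rather than the full expression. The names `exact`, `ranged`, `optional` and `full` are hypothetical, and re.VERBOSE is assumed as above. The third pattern goes one step beyond the correction quoted above: in ABNF (RFC 5234) square brackets mark the enclosed sequence as optional, so it may be that the whole prefix before "::" should also be wrapped in `(?: ... )?`.

```python
import re

names = {"h16": r"[0-9A-Fa-f]{1,4}"}

# ABNF alternative under test:  [ *5( h16 ":" ) h16 ] "::" h16

# {5}: exactly five "h16:" groups, as in the first version quoted above.
exact = r"(?: %(h16)s : ){5} %(h16)s :: %(h16)s" % names

# {0,5}: zero to five "h16:" groups, matching the ABNF "*5" repetition.
ranged = r"(?: %(h16)s : ){0,5} %(h16)s :: %(h16)s" % names

# Additionally treats the bracketed prefix as optional (an assumption that
# goes beyond the correction discussed in this thread).
optional = r"(?: (?: %(h16)s : ){0,5} %(h16)s )? :: %(h16)s" % names


def full(pattern, text):
    """Return True if the whole of text matches the verbose-mode pattern."""
    return re.fullmatch(pattern, text, re.VERBOSE) is not None


for candidate in ("1:2:3:4:5:6::7", "1:2::7", "::7"):
    print(candidate, full(exact, candidate), full(ranged, candidate),
          full(optional, candidate))
# 1:2:3:4:5:6::7 True True True
# 1:2::7 False True True
# ::7 False False True
```

Only the last form accepts an address that begins with "::", so if the results page still shows such addresses as invalid URIs, the bracket handling may be worth a look.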
> 
> I've made these changes and updated the results:
> 
> http://intertwingly.net/tmp/urlvsuri.html
> 
> Looking at the first section, I believe that the URL Standard should change so that the conformance criteria for URLs become a strict subset of the URI validity criteria.  Accordingly, I've opened the following bug report:
> 
> https://www.w3.org/Bugs/Public/show_bug.cgi?id=27687
> 
> - Sam Ruby

Received on Tuesday, 23 December 2014 19:08:25 UTC