Re: Randomized testing of URI processing behavior

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Tue, 02 Aug 2011 17:35:49 +0200
To: Julian Reschke <julian.reschke@gmx.de>
Cc: www-archive@w3.org
Message-ID: <gu4g37lp0a7mpkjimafpsidjfrhjvbe0sb@hive.bjoern.hoehrmann.de>

* Julian Reschke wrote:
>> I also noticed that RFC 3986's transformation algorithm is buggy, there
>> is a `merge` function that's invoked with two paths, but then the merge
>> function actually needs to know whether there was an authority with one
>> of the references, which is not passed to the function.

Note that there is an error in my merge function; it should be:

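  sub merge {
    my ($base, $ref, $base_has_authority) = @_;
    # Empty base path with an authority: the result is rooted at "/".
    return "/$ref" if $base eq "" and $base_has_authority;
    # Otherwise drop everything after the last "/" of the base path.
    return "$1$ref" if $base =~ m|^(.*?)([^/]*)$|s;
  }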
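
As a quick sanity check, the corrected function can be exercised on a
few inputs. This is only a sketch: the sample paths and flag values are
made up for illustration, and the sub is repeated so that the snippet
runs on its own.

```perl
use strict;
use warnings;

# Same merge as above, repeated so this snippet is self-contained.
sub merge {
  my ($base, $ref, $base_has_authority) = @_;
  return "/$ref" if $base eq "" and $base_has_authority;
  return "$1$ref" if $base =~ m|^(.*?)([^/]*)$|s;
}

# Non-empty base path: everything after the last "/" is replaced.
print merge("/a/b/c", "d", 1), "\n";   # /a/b/d
# Empty base path, base has an authority (hypothetical flag value 1).
print merge("", "3", 1), "\n";         # /3
# Empty base path, base has no authority, as with a base of "example:".
print merge("", "3", 0), "\n";         # 3
```

Only the second call takes the first `return`; the other two fall
through to the pattern match.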

>Indeed.
>
>So, in 
><http://greenbytes.de/tech/webdav/rfc3986.html#rfc.section.5.2.2>, it 
>would need to say...:
>
>	T.path = merge(Base.authority, Base.path, R.path);
>
>instead of
>
>	T.path = merge(Base.path, R.path);

I would probably prefer `merge(Base, R)`, since passing a parameter and
then only checking whether it is defined strikes me as poor form, but
yes, that seems to be the idea. It's also possible that this is not
meant to depend on there being an authority. For instance,

  % perl -MURI -e "print URI->new_abs('3', 'example:')"
  example:/3

That is what you get if you ignore the "has a defined authority
component" condition (it's possible RFC 3986 changed this; URI.pm is
based more on RFC 2396). With the revised merge function, and by my
reading, the correct result is "example:3", which seems to make more
sense. Another case would be

  % perl -MURI -e "print URI->new_abs('//example.com', 'example:')"
  example://example.com

So you get an authority by merging a scheme-only base with a
scheme-less network-path reference. That's not terribly intuitive, but
then you should never get a base lacking an authority for a scheme that
uses authority this way, just as you wouldn't have a base of "http:".
And there is

  % perl -MURI -e "print URI->new_abs('3', 'javascript://')"
  javascript:///3

where adding the third slash seems a bit odd, but is the correct result
and probably not very avoidable if you want this to work without
scheme-specific knowledge. So changing it to `merge(Base, R)` seems best.
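
A rough sketch of how a `merge(Base, R)` variant could slot into the
Section 5.2.2 algorithm follows. It is an illustration under
assumptions, not erratum text: the helper names `parse`,
`remove_dot_segments` and `resolve` are hypothetical, parsing uses the
regular expression from RFC 3986 Appendix B, and only the strict-parser
branch is implemented.

```perl
use strict;
use warnings;

# Split a reference into components (RFC 3986, Appendix B regex).
# Absent components stay undef, so "example:" (no authority) and
# "example://" (empty authority) can be told apart.
sub parse {
  my ($uri) = @_;
  $uri =~ m{^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?}s;
  return { scheme => $1, authority => $2, path => $3,
           query => $4, fragment => $5 };
}

# RFC 3986, Section 5.2.4, steps A to E.
sub remove_dot_segments {
  my ($in) = @_;
  my $out = "";
  while (length $in) {
    if    ($in =~ s{^\.\.?/}{})         { }                        # A
    elsif ($in =~ s{^/\.(?:/|\z)}{/})   { }                        # B
    elsif ($in =~ s{^/\.\.(?:/|\z)}{/}) { $out =~ s{/?[^/]*\z}{} } # C
    elsif ($in =~ s{^\.\.?\z}{})        { }                        # D
    elsif ($in =~ s{^(/?[^/]*)}{})      { $out .= $1 }             # E
  }
  return $out;
}

# merge() receives both parsed references, so it can test for the
# base authority itself instead of taking an extra flag.
sub merge {
  my ($base, $ref) = @_;
  return "/$ref->{path}"
    if defined $base->{authority} and $base->{path} eq "";
  (my $dir = $base->{path}) =~ s{[^/]*\z}{};  # strip the last segment
  return $dir . $ref->{path};
}

# Strict reference resolution (Section 5.2.2) plus recomposition
# (Section 5.3). Argument order follows URI->new_abs($ref, $base).
sub resolve {
  my $r    = parse($_[0]);
  my $base = parse($_[1]);
  my %t;
  if (defined $r->{scheme}) {
    @t{qw(scheme authority query)} = @{$r}{qw(scheme authority query)};
    $t{path} = remove_dot_segments($r->{path});
  } else {
    if (defined $r->{authority}) {
      $t{authority} = $r->{authority};
      $t{path}      = remove_dot_segments($r->{path});
      $t{query}     = $r->{query};
    } else {
      if ($r->{path} eq "") {
        $t{path}  = $base->{path};
        $t{query} = defined $r->{query} ? $r->{query} : $base->{query};
      } else {
        $t{path} = $r->{path} =~ m{^/}
          ? remove_dot_segments($r->{path})
          : remove_dot_segments(merge($base, $r));
        $t{query} = $r->{query};
      }
      $t{authority} = $base->{authority};
    }
    $t{scheme} = $base->{scheme};
  }
  $t{fragment} = $r->{fragment};

  my $out = "";
  $out .= "$t{scheme}:"     if defined $t{scheme};
  $out .= "//$t{authority}" if defined $t{authority};
  $out .= $t{path};
  $out .= "?$t{query}"      if defined $t{query};
  $out .= "#$t{fragment}"   if defined $t{fragment};
  return $out;
}

print resolve('3', 'example:'), "\n";              # example:3
print resolve('//example.com', 'example:'), "\n";  # example://example.com
print resolve('3', 'javascript://'), "\n";         # javascript:///3
```

Under this reading the first case yields "example:3" rather than the
"example:/3" that URI.pm prints, while the other two results match the
URI.pm output shown above.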

>Maybe this should be entered as erratum.

I'd think so; I guess I'll give the errata tool a try if I don't hear
any objections soon.
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
Received on Tuesday, 2 August 2011 15:36:16 UTC
