Re: respecting IETF customs?

On 12/06/2014 01:40 PM, Sam Ruby wrote:
>
> If you take a survey of implementations, you will find that in addition
> to the outliers, there are two families of implementations.  One
> collects around RFC 3986 and is precise (in that its members tend to
> produce the same results) but not necessarily accurate in the face of
> IDNA and Unicode considerations.  The other collects around browser
> results.  The latter is less precise (in that there are variations), but
> tends overall to be more accurate with respect to other applicable
> standards.

I've added Perl to my test results using the following program:

https://github.com/webspecs/url/blob/develop/evaluate/testuri.pl

It has been a while since I've programmed in Perl.  If there are things 
I missed, bugs in general, or even simply better ways of doing things, 
please let me know.

   - - -

I then took a look at the results, and I believe the idea of two
families of implementations is more conventional wisdom than fact;
reality isn't quite so clean.

Here's an example:

https://url.spec.whatwg.org/interop/urltest-results/683ac9869d

From these results, neither addressable nor Rust appears to do IDNA
processing.  Rust at least fesses up to this. :-)

Node.js and Perl do fewer IDNA processing steps than other
implementations.  In particular, they skip step 1 but do steps 2 and 3
of the ToASCII algorithm described on the following page:

http://www.unicode.org/reports/tr46/#ToASCII

Everybody else does all 3 steps.  Note: this isn't necessarily because
Node.js and Perl skipped a step; it may very well be that they implement
an entirely different version of IDNA than everybody else does[1].
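
To make the effect of skipping step 1 concrete, here is a rough Python
sketch.  It is not a conforming UTS #46 implementation: step 1 is
approximated with NFKC normalization plus lowercasing (my assumption,
standing in for the real mapping table), and the function name and the
sample inputs are hypothetical rather than taken from the test results.

```python
import unicodedata


def to_ascii_sketch(host, apply_step1=True):
    """Approximate the three ToASCII steps discussed above (sketch only)."""
    if apply_step1:
        # Step 1 (approximated): NFKC + lowercase stand in for the
        # UTS #46 mapping table. This is an assumption, not the real table.
        host = unicodedata.normalize("NFKC", host).lower()
    # Step 2: break the result into labels at U+002E FULL STOP.
    labels = host.split(".")
    # Step 3: convert each non-ASCII label to Punycode with an xn-- prefix.
    converted = []
    for label in labels:
        if label.isascii():
            converted.append(label)
        else:
            converted.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(converted)


# Hypothetical host written with fullwidth characters.
fullwidth_host = "\uff10\uff38\uff43\uff10\uff0e\uff10\uff12\uff15\uff10\uff0e\uff10\uff11"
with_step1 = to_ascii_sketch(fullwidth_host)
without_step1 = to_ascii_sketch(fullwidth_host, apply_step1=False)
```

With step 1 applied, the fullwidth characters are mapped to ASCII before
the host is split into labels.  Without it, the fullwidth full stop is
never seen as a label separator, so the whole host is Punycode-encoded
as a single label.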

Chrome goes an extra step, and recognizes that the result is an IPv4
address, albeit one expressed in an uncommon way, and canonicalizes it.
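
As an illustration of what such canonicalization can look like, here is
a minimal Python sketch.  It is not Chrome's code; it assumes the
common convention that each dot-separated part may be decimal, octal
(leading 0), or hexadecimal (leading 0x), and that the last part fills
whatever bytes remain.  The function names are hypothetical.

```python
import string


def parse_part(part):
    """Parse one dot-separated part as hex, octal, or decimal; None if invalid."""
    if part[:2].lower() == "0x":
        digits, base, allowed = part[2:], 16, string.hexdigits
    elif len(part) > 1 and part[0] == "0":
        digits, base, allowed = part[1:], 8, string.octdigits
    else:
        digits, base, allowed = part, 10, string.digits
    if any(ch not in allowed for ch in digits):
        return None
    return int(digits, base) if digits else 0


def canonicalize_ipv4(host):
    """Return dotted-decimal text if host reads as an IPv4 address, else None."""
    parts = host.split(".")
    if len(parts) > 1 and parts[-1] == "":
        parts.pop()  # tolerate a single trailing dot
    if not 1 <= len(parts) <= 4 or "" in parts:
        return None
    numbers = [parse_part(p) for p in parts]
    if None in numbers:
        return None
    # Every part except the last must fit in one byte.
    if any(n > 255 for n in numbers[:-1]):
        return None
    # The last part fills all of the remaining bytes.
    remaining_bytes = 4 - (len(numbers) - 1)
    if numbers[-1] >= 256 ** remaining_bytes:
        return None
    value = numbers[-1]
    for index, n in enumerate(numbers[:-1]):
        value += n * 256 ** (3 - index)
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))
```

For example, canonicalize_ipv4("0xc0.0250.01") yields "192.168.0.1",
while a host such as "example.com" yields None and would be treated as
a domain name.  For simplicity, this sketch also returns None for
out-of-range numbers, which a real parser would more likely report as
an error.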

On the theory that canonical URIs should round-trip, the current draft
of the WebPlatform URL Specification aligns with Chrome on this, even
though Chrome is the only browser that exhibits this behavior.

   - - -

This is an example of the type of issue I'd like to explore with those 
interested in the topic of interoperable parsing behavior.

- Sam Ruby

[1] https://annevankesteren.nl/2012/11/idna-hell

Received on Sunday, 7 December 2014 04:10:03 UTC