Re: Soliciting feedback on draft-abarth-url

We're getting side-tracked on minutiae about the test suite that I
happen to be using to reverse-engineer the behavior of browsers.  I'm
happy to continue discussing that, but I'd prefer to focus on the actual
technical questions of URL processing.

On Tue, Apr 19, 2011 at 6:08 PM, "Martin J. Dürst"
<duerst@it.aoyama.ac.jp> wrote:
> On 2011/04/20 2:37, Adam Barth wrote:
>> On Tue, Apr 19, 2011 at 10:26 AM, Julian Reschke <julian.reschke@gmx.de>
>> wrote:
>>> On 18.04.2011 07:51, Adam Barth wrote:
>>>> Greetings public-iri,
>>>>
>>>> I'm starting to actively edit draft-abarth-url again.  If you have any
>>>> technical feedback on the document, please let me know.  Particularly
>>>> useful is feedback in the form of a test case that can be added to
>>>> this test suite:
>>>>
>>>> http://trac.webkit.org/browser/trunk/LayoutTests/fast/url/
>>>>
>>>> As an example, the following test suite shows how a number of
>>>> different sequences of characters are segmented into components:
>>>>
>>>> http://trac.webkit.org/export/HEAD/trunk/LayoutTests/fast/url/segments.html
>>>>
>>>> Test cases are especially helpful because they allow us to compare the
>>>> behavior of different user agents and will ensure that the net result
>>>> of this process is interoperable behavior.
>>>>
>>>> You can always find the latest version of the draft at this location:
>>>>
>>>> https://github.com/abarth/url-spec/blob/master/drafts/url.xml
>>>>
>>>> I'm not soliciting editorial or presentational feedback at this time.
>>>> If you have editorial or presentational feedback, I'd ask that you
>>>> wait until we've fleshed out the test suite and core technical content
>>>> of the document.
>>>
>>> Here's an observation: FF4 fails most of these tests, IE9 fails all of
>>> them. So whatever these tests test is not relevant in practice.
>>
>> I should have mentioned that the PASS/FAIL judgements of these tests
>> are set somewhat arbitrarily.  The tests exist to probe behavior.  It's
>> our job to look at the behavior and reach judgements about them.
>
> It's good to mention this when announcing tests.

Yep, I should have mentioned it.

> It's even better to say so
> clearly in the test material (e.g. in the result page). There is much other
> information that would be extremely helpful on the result page, such as what
> exactly is being tested (not just "Test URL segmentation", but URL
> segmentation using the browser DOM), who created the tests, what date the
> last change was made, where to send results, and so on.
>
> As for the test results, the best thing to do in this case is not to use
> words such as PASS or FAIL at all, and also not to use green and red as
> colors. Maybe just using FOO and BAR? Or using some descriptive words, such
> as DEFAULT-PORT-IS-ZERO or DEFAULT-PORT-IS-EMPTY? I tried to come up with a
> good example, but found another problem: The tests all test all six parts of
> the decomposition (scheme, host, port, path, query part, fragment). That
> means that a single issue (e.g. the fact that Opera (11.01, just in case it
> matters) uses the empty string for a default port, rather than the "0" that
> the tests expect (who came up with that?)) essentially makes the whole set of
> test results useless. Opera only "passes" the first test and the last test (that
> one, as far as I understand, being a metatest indicating that the test run
> as such was successful).
>
> The test results are essentially useless both in terms of evaluating an
> implementation (if I assume for a moment that PASS would indeed be the
> 'right' thing to do, then Opera would be judged as a complete failure) and
> in terms of discussing issues for a spec (because any potential
> correspondence between issues and tests, if it exists, is hidden).
>
> So I strongly suggest moving towards testing individual items in the
> parsing results, and labeling the test results with descriptive terms rather
> than things that look like value judgements.

If you desire those things, you should feel free to create your own
URL parsing test suite.  This test suite happens to be one that I've
ported from a bunch of GURL unit tests so that they run in web
browsers.  I've also cross-referenced and researched all the behaviors
tested in that suite.

As for the colors and other minutiae, it's just using the normal
WebKit test harness, but the tests should be easy to port to whatever
test harness you desire.
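
To illustrate, the core of each segmentation test boils down to something
like the following, independent of any particular harness (a sketch; the
input URL is illustrative):

  // Assign the string to an anchor's href, which runs the browser's URL
  // parser, then read back the six decomposition attributes the tests
  // compare across browsers.
  function segment(input) {
    var a = document.createElement('a');
    a.href = input;
    return [a.protocol, a.hostname, a.port,
            a.pathname, a.search, a.hash];
  }

  // e.g. ["http:", "example.com", "8080", "/path", "?q", "#frag"]
  segment('http://example.com:8080/path?q#frag');

Any harness that can compare that array against an expected value can run
the same probes.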

>>> As far as I understand, these tests use the URL decomposition features of
>>> the HTML anchor elements. Last time I tested those (when I looked at the
>>> HTML spec), I noticed that browsers vary in things like
>>>
>>> - how to default the port number
>>> - whether the returned query part starts with a question mark
>>> - empty string vs exception
>>>
>>> I can see why it would be attractive to reduce the differences, but this
>>> *really* is an HTML API problem that is only indirectly related to
>>> parsing references.
>>
>> There are two layers here.  The first is the underlying processing of
>> the URL and the second is the APIs we have to probe that underlying
>> processing.  The two belong in separate specifications.  However, in
>> order to test the former, we need to use the latter.  We need to use
>> our human judgement to understand which aspects of the behavior we
>> observe are a result of the former and which are the result of the
>> latter.
>
> There may be other ways to test decomposition. One way is to use
> relative-to-absolute resolution. Please have a look at
> http://www.w3.org/2004/04/uri-rel-test.html. This page also contains some
> tests for the resolution of reg-names with percent-encoding. Links to more
> tests and some discussion can be found at
> http://www.w3.org/2004/04/uri-rel-test.html. Suggestions welcome.

Those tests are extremely basic, and I believe they're already covered by
the test suite above.
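
For comparison, a DOM-driven version of that style of resolution test is
easy to express in the same black-box setup. A minimal sketch (the base
and relative reference are the RFC 3986 Section 5.4 example, which
resolves to "http://a/b/g"):

  <base href="http://a/b/c/d;p?q">
  <a id="t" href="../g">test</a>
  <script>
  // Reading .href yields the absolute URL the browser resolved, so a
  // test just compares it against the expected resolution.
  document.writeln(document.getElementById('t').href);
  </script>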

>>> Questions:
>>>
>>> 1) For the sake of testing *parsing*, can we simplify the tests so that
>>> they don't "fail" for cases like port number defaulting? Is this difference
>>> even relevant in any practical way? Does any JS code out there depend on
>>> this?
>>
>> The PASS/FAIL judgement isn't relevant to our work.
>
> Is it relevant for something else? If yes, please say so on the test page.
> If not, let's fix it.

Once we're done writing the spec (or as the spec becomes more stable),
I'll change the PASS/FAIL notations to match the spec.  At the moment,
there is no spec, so no particular behavior is distinguishable from
others as "correct."
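
In the meantime, anyone bothered by the labels can wrap the probe so that
it records observed values instead of rendering a judgement. A minimal
sketch (the reporting format here is made up):

  // Report what the browser under test actually produces, with no
  // PASS/FAIL attached; differences then read as data, not verdicts.
  function reportPort(input) {
    var a = document.createElement('a');
    a.href = input;
    document.writeln('port(' + input + ') = "' + a.port + '"');
  }

  // Per Martin's observation above, Opera 11.01 yields "" here while
  // the expectations inherited from the GURL tests say "0".
  reportPort('http://example.com/');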

>>> 2) Is there another way we can test browsers in a more relevant way? Just
>>> because it's easy to test a specific API doesn't mean that it will tell us
>>> anything of significance.
>>
>> I'm open to suggestions.  The main requirement is that we can perform
>> the test in a black-box manner so that we can test both open- and
>> closed-source implementations.
>
> See above for suggestions.

Thanks for the suggestions.

Adam
