Re: On the way to CR for webrtc-pc from Philip Jägenstedt on 2017-05-05 (public-webrtc@w3.org from May 2017)

From: Philip Jägenstedt <foolip@google.com>
Date: Fri, 05 May 2017 13:08:46 +0000
To: Alexandre GOUAILLARD <agouaillard@gmail.com>
Cc: Stefan Håkansson LK <stefan.lk.hakansson@ericsson.com>, "public-webrtc@w3.org" <public-webrtc@w3.org>
Message-ID: <CAARdPYe6V5oMcDvkw=Fb9N82PaYyNDnODsfSm3NFqw0sU0p0+w@mail.gmail.com>

On Tue, Apr 25, 2017 at 9:06 AM Alexandre GOUAILLARD <agouaillard@gmail.com>
wrote:

>>> Dom, Harald and I have been working on the test suite for media
>>> stream and webrtc(-pc). I have so far been in charge of reporting to the WG
>>> during meetings, and especially TPAC. MediaStream received more love than
>>> webrtc so far. After discussion with Dom, Harald, Stefan and Dan Burnett,
>>> among others, the consensus was that there is no tool to compute a
>>> "coverage" of the spec, so for TPAC in Lisbon last year I computed a
>>> "manual" coverage by counting all the normative lines in the specs, and
>>> checking from the tests which were tested. While media stream was above 50%
>>> (still not a feat), webrtc was below 25%.  The granularity is the line,
>>> which is less than ideal, but it was a good start, and we had absolutely no
>>> visibility before that.
>>>
>>
>> Yeah, I think that there's no good answer for determining coverage of a
>> spec using the spec and the tests alone. The closest I've seen was here:
>>
>> https://groups.google.com/a/chromium.org/d/msg/blink-dev/qpY1dj_ND-Q/L8TuDBx6DAAJ
>>
>>
> I was thinking about using the tag used to link to the spec, with a line
> number at the end to do that, but it's kind of a side project, and it would
> make the test files much larger.
>
>
>> That required a reference implementation though, and for WebRTC that
>> would be a "somewhat" larger undertaking.
>>
>> 25% is a larger number than I would have guessed, given that webrtc/ has
>> only 13 test files in total, but it depends on the granularity of course.
>> What is a line?
>>
>>
> As in a line of the printed specification that corresponds to a normative
> part.
> We did it again yesterday, here is the process we followed:
>
> - You do a first pass that eliminates the non-normative subsections
> (examples, intro, changelog, ...)
> - You do a second pass that eliminates the non-normative paragraphs in
> the remaining sections
>   -- remove MAY and SHOULD
>

There are 76 "may" and 21 "should" in the spec. Was every line containing
either of those words removed?
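
For anyone who wants to redo this kind of count, here is a minimal sketch,
assuming the spec has already been saved as plain text. The function names
are hypothetical, and matching whole words case-insensitively is my
assumption about what should count; it will not necessarily reproduce the
numbers above.

```python
import re
from collections import Counter

# Keywords to look for; matched as whole words, ignoring case (an assumption).
KEYWORDS = ("may", "should", "must")


def count_keywords(text: str) -> Counter:
    """Count whole-word, case-insensitive occurrences of each keyword."""
    counts = Counter()
    for word in re.findall(r"[A-Za-z]+", text):
        lowered = word.lower()
        if lowered in KEYWORDS:
            counts[lowered] += 1
    return counts


def lines_with_keywords(text: str, keywords=("may", "should")) -> list:
    """Return the lines that contain at least one of the given keywords."""
    pattern = re.compile(r"\b(?:" + "|".join(keywords) + r")\b", re.IGNORECASE)
    return [line for line in text.splitlines() if pattern.search(line)]


if __name__ == "__main__":
    sample = (
        "The user agent MAY do X.\n"
        "It should do Y and may do Z.\n"
        "It MUST do W."
    )
    print(count_keywords(sample))
    print(lines_with_keywords(sample))
```

Listing the matching lines, rather than only counting them, makes it easy to
see which candidates for removal also carry a MUST-level requirement.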


>   -- remove algorithm descriptions, ...
>

Do you mean non-normative algorithm descriptions? Unfortunately the
normative and non-normative are often not well separated in this spec,
otherwise just removing (with CSS) things based on class="informative" and
similar should suffice.
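
The same filtering could also be done offline instead of with CSS. A rough
sketch, assuming the informative parts really are marked with
class="informative" and that the markup uses explicit end tags (neither is
guaranteed for this spec); the class and function names are made up for
illustration:

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not affect nesting depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class NormativeText(HTMLParser):
    """Collect text that is not inside an element with a skipped class."""

    def __init__(self, skip_classes=("informative",)):
        super().__init__(convert_charrefs=True)
        self.skip_classes = set(skip_classes)
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.skip_depth or self.skip_classes.intersection(classes):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def normative_text(html: str) -> str:
    """Return the text of `html` with the skipped elements removed."""
    parser = NormativeText()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)
```

This only works as well as the markup is annotated, which is exactly the
problem described above: unmarked non-normative text stays in.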

As an example, was the description of the RTCIceCandidate constructor in
https://w3c.github.io/webrtc-pc/#rtcicecandidate-interface removed in this
method? That's the only thing that actually says what the constructor
should do.


> whatever is left represents your 100%.
>
> Then you take the existing tests and you look up the corresponding lines
> in the specification document (Editor's Draft, March 2017). That gives you an
> approximate coverage.
>
> Whatever is left is to be implemented. We are implementing tests for
> section 10 (media stream / media stream track) and 4 (legacy peerconnection)
> right now.
>

Big picture, I think this is a reasonable way to estimate coverage.

Because it's a manual process that has to be repeated each time, here's
what I would do if I wanted to ensure coverage of a big and complicated
thing with poor existing coverage:

   - Start with the source of the spec.
   - Remove everything non-normative.
   - Pick any API, any algorithm, anything normative.
   - For each line of an algorithm or similar granularity of normative
   requirement, look for a test matching it, and if found remove that line.
   - If there are references to other specs (like JSEP), make sure they are
   themselves tested by this or another test suite.
   - Commit and repeat until the document is empty, or only non-tested
   things remain.
      - Final check: Pick a random API from the IDL and read all of the
      related tests. If it's not actually well tested, maybe the spec
      doesn't say enough normative things about it.
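
The bookkeeping behind the steps above can be sketched roughly as follows.
This assumes each normative requirement has been given some identifier and
that a person has already decided which identifiers have a matching test;
the identifiers, requirement texts and function names here are all
hypothetical.

```python
def remove_tested(requirements, tested_ids):
    """Return the requirements with no matching test, preserving order.

    `requirements` is a list of (identifier, text) pairs and `tested_ids`
    is the set of identifiers for which a matching test was found.
    """
    return [(rid, text) for rid, text in requirements if rid not in tested_ids]


def coverage(requirements, tested_ids):
    """Fraction of requirements that have a matching test."""
    if not requirements:
        return 1.0  # nothing normative left to cover
    covered = sum(1 for rid, _ in requirements if rid in tested_ids)
    return covered / len(requirements)


if __name__ == "__main__":
    # Hypothetical requirements, for illustration only.
    requirements = [
        ("req-1", "First step of some algorithm."),
        ("req-2", "Second step of some algorithm."),
        ("req-3", "A requirement on some attribute."),
        ("req-4", "A requirement on some constructor."),
    ]
    tested = {"req-2"}
    print(coverage(requirements, tested))
    for rid, text in remove_tested(requirements, tested):
        print(rid, text)
```

The finding of matching tests stays manual; the point is only that what
remains after each pass is the list of things still to be tested.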

Received on Friday, 5 May 2017 13:09:32 UTC