Re: On the way to CR for webrtc-pc

>> Dom, Harald, and I have been working on the test suite for media
>> stream and webrtc(-pc). I have so far been in charge of reporting to the WG
>> during meetings, and especially at TPAC. MediaStream has received more love
>> than webrtc so far. After discussion with Dom, Harald, Stefan and Dan
>> Burnett, among others, the consensus was that there is no tool to compute a
>> "coverage" of the spec, so for TPAC in Lisbon last year I computed a
>> "manual" coverage by counting all the normative lines in the specs and
>> checking from the tests which of those lines were covered. While media
>> stream was above 50% (still not a feat), webrtc was below 25%. The
>> granularity is the line, which is less than ideal, but it was a good start,
>> and we had absolutely no visibility before that.
>>
>
> Yeah, I think that there's no good answer for determining coverage of a spec
> using the spec and the tests alone. The closest I've seen was here:
> https://groups.google.com/a/chromium.org/d/msg/blink-dev/qpY1dj_ND-Q/L8TuDBx6DAAJ
>
>
I was thinking about using the tag that links each test to the spec, with a
line number appended at the end, to do that, but it's kind of a side project,
and it would make the test files much larger.
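
For context, something like this in a wpt test file is what I had in mind.
The <link rel="help"> part is the existing wpt convention; the line locator
at the end would be the addition (the exact spec fragment and line number
below are illustrative only):

  <!doctype html>
  <title>RTCPeerConnection.createDataChannel</title>
  <!-- existing wpt convention: point at the relevant spec section -->
  <link rel="help" href="https://w3c.github.io/webrtc-pc/#rtcpeerconnection-interface">
  <!-- hypothetical addition: append a line locator so a tool could compute coverage -->
  <link rel="help" href="https://w3c.github.io/webrtc-pc/#rtcpeerconnection-interface;l=1234">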


> That required a reference implementation though, and for WebRTC that would
> be a "somewhat" larger undertaking.
>
> 25% is a larger number than I would have guessed, given that webrtc/ has
> only 13 test files in total, but it depends on the granularity of course.
> What is a line?
>
>
As in, a line of the printed specification that corresponds to a normative
part.
We did it again yesterday; here is the process we followed:

- You do a first pass that eliminates the non-normative subsections
(examples, intro, changelog, ...)
- You do a second pass that eliminates the non-normative paragraphs in the
remaining sections
  -- remove MAY and SHOULD
  -- remove algorithm descriptions, ...

Whatever is left represents your 100%.

Then you take the existing tests and look up the corresponding lines in
the specification document (editor's draft, March 2017). That gives you an
approximate coverage.
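
To make the arithmetic concrete (numbers purely illustrative, not the actual
counts): if roughly 2,000 normative lines survive the two passes and the
existing tests can be mapped back to about 500 of them, the approximate
coverage is 500 / 2000 = 25%.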

Whatever is left is still to be implemented. We are implementing tests for
sections 10 (media stream / media stream track) and 4 (legacy peerconnection)
right now.
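
To give an idea of the shape of those tests, here is a minimal sketch in the
wpt / testharness.js style for the legacy callback-based createOffer (the
test name and assertion are illustrative, not one of the actual files we are
landing):

  <!doctype html>
  <meta charset=utf-8>
  <title>Legacy callback-based createOffer</title>
  <script src="/resources/testharness.js"></script>
  <script src="/resources/testharnessreport.js"></script>
  <script>
  // Sketch only: exercises the legacy overload of createOffer
  // (success/failure callbacks) from the legacy section of the spec.
  promise_test(t => {
    const pc = new RTCPeerConnection();
    t.add_cleanup(() => pc.close());
    pc.createDataChannel('probe'); // make sure there is something to negotiate
    return new Promise((resolve, reject) => pc.createOffer(resolve, reject))
      .then(offer => {
        assert_equals(offer.type, 'offer', 'success callback receives an offer');
      });
  }, 'createOffer() legacy overload invokes the success callback');
  </script>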


>> Some discussions took place about parts of the specs (the algorithm
>> descriptions) being normative or informative, and other details. Discussion
>> about which version should be the reference (editor draft vs latest draft)
>> also took place.
>>
>> Some discussions took place about parts of the specs not being testable
>> without "interoperability" testing, and I presented a possible framework
>> for it.
>>
>
> I'm curious about the specifics here. Are there things that cannot be
> tested in a single-browser setting by connecting to a test-controlled
> server that behaves as another browser?
>

That would mean that your test-controlled server also implements all of
HTML5 (or at least everything needed by webrtc). That is a huge amount of
work, which has already been done in the browser.


>>> Automation aside, how will you determine the coverage, i.e. how do you
>>> make sure that the sctp attribute is tested and that a lack of
>>> implementations would show up as test failures?
>>>
>>
>> So far, manually :-(  I'm open to suggestions!
>>
>
> Starting from a list of APIs in the spec, one could grep to see which
> don't appear in the tests at all, e.g. `git grep -w sctp` doesn't find
> anything.
>
> Better still would be a custom Firefox or Chromium build that logs at the
> entry point of each API, running the tests, and checking what was never
> logged. (I might do it using Blink's UseCounter system somehow, because it
> already adds code at the right places in generated bindings code.)
>
>
A weaponized browser is a nice idea. You would have to do it at the
bindings level, but that would be the ultimate weapon to see which APIs are
getting hit or not. It would only provide a binary result though: if APIs
are supposed to be used in a certain order, and/or have several behaviors
depending on their arguments, I'm not sure what you propose would see the
difference. It's also far beyond the capacity of common mortals, while
writing tests for wpt is doable for a JS dev.

> Using either of these methods ought to reveal what things are completely
> untested.
>
> To actually know the depth of testing for any API is much harder. Rather
> than trying to solve it directly, one could assume that code that has
> landed in Gecko and Chromium has good enough tests, and instead focus on
> upstreaming such tests. But that would probably require solving some
> automation problems first.
>

On it :-)


-- 
Alex. Gouaillard, PhD, PhD, MBA
------------------------------------------------------------------------------------
President - CoSMo Software Consulting, Singapore
------------------------------------------------------------------------------------
sg.linkedin.com/agouaillard


Received on Tuesday, 25 April 2017 07:07:25 UTC