Re: On the way to CR for webrtc-pc from Philip Jägenstedt on 2017-04-24 (public-webrtc@w3.org from April 2017)

From: Philip Jägenstedt <foolip@google.com>
Date: Mon, 24 Apr 2017 17:12:04 +0000
To: Alexandre GOUAILLARD <agouaillard@gmail.com>
Cc: Stefan Håkansson LK <stefan.lk.hakansson@ericsson.com>, "public-webrtc@w3.org" <public-webrtc@w3.org>
Message-ID: <CAARdPYeQ8_f__+xb+ZZSGPALFN66-d3mn6Ec-dZ9nNv1cG6f-Q@mail.gmail.com>
On Tue, Apr 18, 2017 at 1:51 AM Alexandre GOUAILLARD <agouaillard@gmail.com>
wrote:

>> I'm not super familiar with what transitioning to CR entails, but can you
>> say something about what goals you have for interoperability, in both the
>> "implementations pass the same tests" and the "two implementations can
>> communicate" sense?
>>
>>
> Hi philip,
>
> Dom, harald and myself have been working on the test suite for media
> stream and webrtc(-pc). I have been so far in charge of reporting to the WG
> during meetings, and especially TPAC. MediaStream received more love than
> webrtc so far. After discussion with dom, harald, stefan and dan burnett,
> among others, the consensus was that there is no tool to compute a
> "coverage" of the spec, so for TPAC in Lisbon last year I computed a
> "manual" coverage by counting all the normative lines in the specs, and
> checking against the tests which were covered. While media stream was above 50%
> (still not a feat), webrtc was below 25%.  The granularity is the line,
> which is less than ideal, but it was a good start, and we had absolutely no
> visibility before that.
>

Yeah, I think that there's no good way to determine coverage of a spec
using the spec and the tests alone. The closest I've seen was here:
https://groups.google.com/a/chromium.org/d/msg/blink-dev/qpY1dj_ND-Q/L8TuDBx6DAAJ

That required a reference implementation though, and for WebRTC that would
be a "somewhat" larger undertaking.

25% is a larger number than I would have guessed, given that webrtc/ has
only 13 test files in total, but it depends on the granularity of course.
What is a line?

> Some discussions took place about parts of the specs (the algorithms
> description) being normative or informative and other details. Discussion
> about which version should be the reference (editor draft vs latest draft)
> also took place.
>
> Some discussions took place about parts of the specs not being testable
> without "interoperability" testing, and I presented a possible framework
> for it.
>

I'm curious about the specifics here. Are there things that cannot be
tested in a single-browser setting by connecting to a test-controlled
server that behaves as another browser?

> Some discussions took place about the problem of testing (even manually)
> parts of the specs that require interaction with the security prompt. While
> automating the interaction with the security prompt is a challenge in
> itself, workarounds have been implemented by at least FF and Cr using
> command line arguments and profiles. However, this assumes the setting
> remains the same during the entire test, while testing the spec would
> require, e.g., cancelling an already provided approval for hardware access
> before the promise returns with the corresponding media stream. Fake sources
> can also be provided the same way. The Safari way, which allows for
> programmatic modification of the settings in the tests, and could test those
> cases, was presented.
>
> Some discussion took place about having both FF and Cr using libwebrtc
> under the hood, and whether they were different enough to uphold the
> 2 browsers rule. The answer from Dom / W3C was that, as far as W3C was
> concerned, the implementations were different enough (network stack, ice
> stack, ...).
>
>
>
>
>> In my work to bring Blink's Web IDL files closer into alignment with the
>> specs that we link to, I'm mostly looking for non-standard things that need
>> attention, but I also notice things that are in specs but not in Blink.
>>
>> On RTCPeerConnection, there's at least currentLocalDescription,
>> pendingLocalDescription, currentRemoteDescription, pendingRemoteDescription
>> and canTrickleIceCandidates. There are also things from partial interfaces
>> that are harder to spot, like the sctp attribute. Some of these are not in
>> Gecko either.
>>
>> Would a list of things that seem to be implemented in <2 browsers be
>> helpful to the WG, or what criteria are you using?
>>
>> Regarding test coverage,
>> https://github.com/w3c/web-platform-tests/tree/master/webrtc is fairly
>> limited and improving automated testing seems blocked on things like
>> https://github.com/w3c/web-platform-tests/issues/5563. Automation aside,
>> how will you determine the coverage, i.e. how do you make sure that the
>> sctp attribute is tested and that a lack of implementations would show up
>> as test failures?
>>
>
> So far, manually :-(  I'm open to suggestions!
>

Starting from a list of APIs in the spec, one could grep to see which don't
appear in the tests at all, e.g. `git grep -w sctp` doesn't find anything.
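
A minimal sketch of that grep idea, in Python, in case it helps. The
function name `untested_names` and the way the API list is supplied are
assumptions for illustration; the list would have to be extracted from the
spec's IDL by some other means.

```python
import re
from pathlib import Path


def untested_names(api_names, test_dir):
    """Return the API names that never appear as a whole word under test_dir."""
    # Read every regular file once; undecodable bytes are ignored.
    texts = [
        p.read_text(encoding="utf-8", errors="ignore")
        for p in Path(test_dir).rglob("*")
        if p.is_file()
    ]
    missing = []
    for name in api_names:
        # \b gives whole-word matching, roughly like `git grep -w`.
        pattern = re.compile(r"\b" + re.escape(name) + r"\b")
        if not any(pattern.search(text) for text in texts):
            missing.append(name)
    return missing
```

Calling something like `untested_names(["sctp", "canTrickleIceCandidates"],
"webrtc")` from a web-platform-tests checkout would list the names with no
textual match. A match only shows that the name is mentioned somewhere, not
that the API is meaningfully tested.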

Better still would be a custom Firefox or Chromium build that logs at the
entry point of each API, running the tests, and checking what was never
logged. (I might do it using Blink's UseCounter system somehow, because it
already adds code at the right places in generated bindings code.)
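
The "checking what was never logged" step could look roughly like the
Python sketch below. The log format is purely hypothetical: it assumes the
instrumented build prints one line per API entry of the form
`API_ENTRY RTCPeerConnection.sctp`. Neither that prefix nor the function
name `never_logged` comes from Gecko or Chromium.

```python
def never_logged(api_names, log_lines, prefix="API_ENTRY "):
    """Return API names with no entry-point log line, in input order."""
    seen = set()
    for line in log_lines:
        line = line.strip()
        # Hypothetical format: "<prefix><Interface.member>"; other lines are skipped.
        if line.startswith(prefix):
            seen.add(line[len(prefix):])
    return [name for name in api_names if name not in seen]
```

For example, passing the full list of spec members and the lines of the log
collected over a test run would return the members whose entry points were
never hit.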

Using either of these methods ought to reveal what things are completely
untested.

To actually know the depth of testing for any API is much harder. Rather
than trying to solve it directly, one could assume that code that has
landed in Gecko and Chromium has good enough tests, and instead focus on
upstreaming such tests. But that would probably require solving some
automation problems first.
Received on Monday, 24 April 2017 17:12:48 UTC