Review of WebRC PC Spec from Eric Rescorla on 2017-05-30 (public-webrtc@w3.org from May 2017)

From: Eric Rescorla <ekr@rtfm.com>
Date: Tue, 30 May 2017 12:22:11 -0700
To: "public-webrtc@w3.org" <public-webrtc@w3.org>
Message-ID: <CABcZeBMHjJOiwU_Exp6Grd8DZ1k1Ub=9xL_XikXriP6gM9HcuA@mail.gmail.com>
OVERALL
I have not studied all the algorithms for how to handle asynchronous
task queuing, but we seem kind of inconsistent on exactly when:

- We report errors synchronously
- When the task becomes due, if conditions X apply, don't run
  it
- When the task comes due, run the algorithm and then check
  to see if condition X applies and then don't report.

More generally, there is an enormous amount of boilerplate for
handling deferred operations of various kinds. It would be
far better if there were a single generic algorithm for handling
all the deferred/queued operations and then we just defined
the inner loop of those.

Generally, I tend to think that more things should
be async... That would make reasoning about this a lot
easier.


There seems to be inconsistency about whether these tables
of enums are normative or non-normative. Compare S 4.2.6
to S 4.2.5. Both of these seem to be defined in JSEP.


This would all be a lot easier to read if the state definitions
came before the enormously long block describing how to process
every method.


A similar comment about pulling stuff out is with the boilerplate
about events not bubbling or being cancelable.



TECHNICAL
S 4.2.2.
Generally, we try to avoid sharing the same initial keying material
with two hashes, which seems to be explicitly what you propose
here. With that said:

- HMAC already has its own shortening algorithm (you hash)
- You should use HKDF to generate fresh subkeys for each
  hash algorithm.


   Note that the OAuth response "key" parameter is a JSON Web Key (JWK)
   or a JWK encrypted with a JWE format. Also note that this is the only
   OAuth parameter whose value is not used directly, but must be
   extracted from the "k" parameter value from the JWK, which contains
   the needed base64-encoded "mac_key".

It's been a while since I looked at this piece of OAuth, but what
key is this encrypted with?

Why are you requiring that the access token be self-encrypted rather
than (say) a table lookup. This is a reasonable procedure, but it's
not the only one, and not really the standard's business.


S 4.2.6.
   Gather ICE candidates for each media type in use (audio, video, and
   data). If the remote endpoint is not bundle-aware, negotiate only
   one audio and video track on separate transports.

The second sentence here (and below) are just information, they're
not instructions to the implementation, so they should be removed
or rewritten.


S 4.3.1
(operations queue)
This algorith seems to imply that operation is run synchronously
if the queue depth is one (step 6) whereas if it's not, it's run
asynchronously. This definitely isn't invisible in the callback
version (reentrancy) and isn't actually invisible in the promises
version because these operations have side effects. Is it really
the expectation that we should (for instance) start sending
media before the promise resolves? That seems odd.


(setDescription)
The says:
    To set an RTCSessionDescription description on an RTCPeerConnection
    object connection, enqueue the following steps:

    1. Let p be a new promise.

I may be confused, but I don't think that the promise creation and
resolution is in fact queued.

    2. If elements of the SDP were modified, then reject p with a newly
created
    InvalidModificationError and abort these steps.

This only applies to local decriptions.


This algorithm is extremely prescriptive about which error checks
you need to perform and in which order. This is externally observable
because I throw different errors. For instance, why is it the
spec's business whether I check for being in the right state before
or after I check for the SDP being modified.

    5. If the content of description is invalid, then reject p with a
    newly created InvalidAccessError and abort these steps.

What would invalid be here? Anything else?


    2. If description is applied successfully, the user agent must
    queue a task that ru ns the following steps:

Didn't we already queue above?


S 4.2.3 (createOffer)

   4. Return the result of enqueuing the following steps:
      1. Let p be a new promise.

This seems like another situation where a generic algorithm would
help.

(createAnswer)
   setLocalDescription without causing an error as long as
   setLocalDescription is called reasonably soon.

I have no idea what "reasonably soon" means


(setLocalDescription)
This algorithm leaves the state of the PC changed in case of
bogus inputs. However, JSEP requires that the state be unchanged
(see JSEP S 5.8)


(addIceCandidate)

   3. If both sdpMid and sdpMLineIndex are null, return a promise
   rejected with a newly created TypeError.

See above about consistency. Why is the promise rejected here
immediately but otherwise these are enqueued? How does that help?


(setConfiguration)

    3. For each url in server.urls parse url and obtain scheme
    name. If the scheme name is not implemented by the browser, or if
    parsing based on the syntax defined in [ RFC7064] and [RFC7065]
    fails, throw a SyntaxError.

This seems unwise as it makes it very hard to introduce new schemes.
Why aren't they just ignored?


S 4.3.3.
    Supporting the methods in this section is optional. However, if
    these methods are supported it is mandatory to implement according
    to what is specified here.

This doesn't match my memory of what we agreed when we added promises.
When was this decision made?


S 4.8.1.
Seems like we are missing:

- component-id
- extension-att-name
- extension-att-value

Arguably, we don't need the last two but we do need component-id

Also, you need to cite the relevant parts of 5245 here.

S 4.8.3.

    If no host candidate can reach the server, errorCode will be set
    to the value 701 which is out- side the STUN error code
    range. This error is only fired once per server URL while in the
    RTCIceGatheringState of "gathering".

This is kinda buried..



S 5.1.

   RID values passed into init.sendEncodings must be composed only of
   case-sensitive al- phanumeric characters (a-z, A-Z, 0-9) up to a
   maximum of 16 characters.

Characters aren't case-sensitive. I assume you mean that rids that
differ only in case are distinct.


S 5.2.
The advice here (center, scale, then crop) differs from a MUST
in JSEP S 3.6.

   If the original resolution exceeds the size limits in the attribute,
   the sender SHOULD apply downscaling to the output of the
   MediaStreamTrack in order to satisfy the limits.  Downscaling MUST
   NOT change the track aspect ratio.

S 5.2.
   An unique identifier for the last set of parameters
   applied. Ensures that setParameters can only be called based on a
   previous getParameters, and that there are no intervening changes.

Are there any restrictions on how this is generated.


Shouldn't the definition RTCRtpEncodingParameters mark a bunch of stuff
readonly?


S 5.3.
    Initialize track.label to the result of concatenating the string
"remote " with kind.

Does this mean I will have a lot of tracks labelled "remote video"? That
seems
unfortunate.


S 5.5.1

    One of the the hash function algorithms defined in the 'Hash function
    Textual Names' registry, initially specified in [ RFC4572] Section
    8. As noted in [JSEP] Section 5.2.1, the digest algorithm used for the
    fingerprint matches that used in the certificate signature.

I see that this was relaxed in 8122. I wonder if we wnt to relax it here
and in JSEP.


S 6.1.

    6. If negotiated is false and label is longer than 65535 bytes long,
throw a TypeError.
    7. If negotiated is false and protocol is longer than 65535 bytes long,
throw a TypeError.

Why not just restrict this to 65k in general.


S 6.2.
    5. Attempt to send data on channel's underlying data transport; if
    the data cannot be sent, e.g. because it would need to be buffered
    but the buffer is full, the user agent must abruptly close
    channel's underly- ing data transport with an error.

This seems like a very unsafe API. Why not instead have it fail?

S 6.4.
You seem to be saying that I can't garbage collect the data channel
as long as the SCTP connection is up even if the PC is closed? that
seems surprising.


S 7.2
If I am reading this correctly, the intertone gap can be infinite,
though tones can only be 6 s long?


S 9.1.
It's not quite clear to me when the IdP is ready? I guess when it has
called register?


S 9.3.
   LOGINDONE

I wonder if we should use WEBRTC-LOGINDONE or something



S 9.4.
It seems like some text about how to handle peerIdentity mismatches
would be helpful here, although I recognize the normative text is
elsewhere. I can supply some if people would like.


S 9.6.
Is there any valid reason to allow the protocol to be a DOMString as
opposed to ASCII or LDH? It's about to be shoved into a URL...


S 10.4.1.
Do you think we should have have the target peerIdentity, not just
isolated.


S 11.
Has someone tested these examples? I gave them a quick skim and saw
some stuff that seems like it might be concerning, but maybe it's already
been tested...

If nothing else, it would be good to:

- Refactor out the common code
- Explicitly state what's different

S 11.6.
has anyone checked this diagram recently?

-Ekr
Received on Tuesday, 30 May 2017 19:23:26 UTC