RE : Raw sockets feedback from Mozilla Network team from Ke-Fong Lin on 2013-08-30 (public-sysapps@w3.org from August 2013)

From: Ke-Fong Lin <ke-fong.lin@4d.com>
Date: Fri, 30 Aug 2013 16:43:55 +0200
To: "Nilsson, Claes1" <Claes1.Nilsson@sonymobile.com>, Jonas Sicking <jonas@sicking.cc>, "public-sysapps@w3.org" <public-sysapps@w3.org>, Patrick McManus <mcmanus@ducksong.com>
Message-ID: <73C0C0720B9162459C8AF58ED19AD5B65E1C72D1A0@4d-xn1-exch>
Hi everyone,

Ok, I've put some comments on the issues and added one for the creation of a non-normative "usage note".
In fact, we should maybe have an internal document explaining why we made some design choices.

I'll have more to say regarding issue 46, later today or tomorrow. This is really an important and interesting issue.

As for issue 45, we can be much more precise regarding the "backlog", especially the wording: does "backlog" mean "pending connection handshakes" or "handshake completed, hence connected and ready"?
listen() is not very precise, but there are other syscalls which allow more control.
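
To make the second reading concrete, here is a minimal Python sketch (the helper name connect_before_accept is made up for illustration). It assumes a loopback interface and typical BSD-socket behavior, where the kernel completes the TCP handshake on behalf of a listening socket and keeps the connection queued until the application calls accept(). How the backlog value is interpreted exactly differs between operating systems, so this only shows that such a "connected, ready" queue exists.

```python
import socket


def connect_before_accept(backlog: int = 4) -> bool:
    """Return True if a client can finish connecting before accept() runs."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        server.bind(("127.0.0.1", 0))   # port 0: let the system pick a port
        server.listen(backlog)          # backlog is a hint for the pending queue
        server.settimeout(5)
        port = server.getsockname()[1]

        # connect() returns once the kernel has completed the handshake,
        # even though the application has not called accept() yet.
        client = socket.create_connection(("127.0.0.1", port), timeout=5)
        try:
            conn, _peer = server.accept()   # dequeue the already-connected socket
            conn.close()
            return True
        finally:
            client.close()
    finally:
        server.close()
```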

Best regards,

________________________________________
From: Nilsson, Claes1 [Claes1.Nilsson@sonymobile.com]
Sent: Tuesday, 27 August 2013 20:39
To: Ke-Fong Lin; Jonas Sicking; public-sysapps@w3.org; Patrick McManus
Subject: RE: Raw sockets feedback from Mozilla Network team

In order to facilitate further work and tracking on the issues from Patrick I have created separate issues at the SysApps Raw Socket Github repository:

https://github.com/sysapps/raw-sockets/issues/36
https://github.com/sysapps/raw-sockets/issues/37
https://github.com/sysapps/raw-sockets/issues/38
https://github.com/sysapps/raw-sockets/issues/39
https://github.com/sysapps/raw-sockets/issues/40
https://github.com/sysapps/raw-sockets/issues/41
https://github.com/sysapps/raw-sockets/issues/42
https://github.com/sysapps/raw-sockets/issues/43
https://github.com/sysapps/raw-sockets/issues/44
https://github.com/sysapps/raw-sockets/issues/45
https://github.com/sysapps/raw-sockets/issues/46

Patrick's comment on security was added as a comment in https://github.com/sysapps/raw-sockets/issues/10.

I suggest that further discussion on these issues take place at the Github issue list.

Best regards
 Claes

> -----Original Message-----
> From: Nilsson, Claes1 [mailto:Claes1.Nilsson@sonymobile.com]
> Sent: 26 August 2013 18:16
> To: 'Ke-Fong Lin'; Jonas Sicking; public-sysapps@w3.org; Patrick
> McManus
> Subject: RE: Raw sockets feedback from Mozilla Network team
>
> Hi,
>
> Thanks a lot Patrick for reviewing this specification and Jonas and Ke-
> Fong for commenting. Your feedback is very valuable.
>
> Patrick, are you located in Toronto and able to participate in the F2F
> meeting? We are discussing this API at 11 tomorrow, Aug 27, and probably
> also in a "task force session" on Wednesday 28th, 14-18.
>
> See my replies inline below.
>
> BR
>   Claes
>
> > -----Original Message-----
> > From: Ke-Fong Lin [mailto:ke-fong.lin@4d.com]
> > Sent: 26 August 2013 14:17
> > To: Jonas Sicking; public-sysapps@w3.org; Patrick McManus
> > Subject: RE : Raw sockets feedback from Mozilla Network team
> >
> > Hi everyone,
> >
> > See my comments inside the text.
> >
> >
> >
> > Regards,
> >
> >
> >
> >
> >
> > Ke-Fong Lin
> > Senior Developer
> >
> > 4D SAS
> > 60, rue d'Alsace
> > 92110 Clichy
> > France
> >
> > Standard :
> > Email :    Ke-Fong.Lin@4d.com
> > Web :      www.4D.com
> >
> >
> > ________________________________________
> > From: Jonas Sicking [jonas@sicking.cc]
> > Sent: Monday, 26 August 2013 01:26
> > To: public-sysapps@w3.org; Patrick McManus
> > Subject: Raw sockets feedback from Mozilla Network team
> >
> > Hi All,
> >
> > I asked Patrick McManus from the Mozilla Network team to have a look
> > over the Raw Sockets draft. Here's his feedback (please keep Patrick
> > on cc for the replies since he's not subscribed to this list):
> >
> > * the concept of an isolated "default local interface" (used in a few
> > different places) doesn't really align with networking.. generally
> > when a local interface isn't specified for a socket the one it is
> > assigned is derived from looking up the remote address in the routing
> > table and taking the address of the interface with the most preferred
> > route to the remote address.. This is equally true of TCP and UDP.
> >
> > think about a case where you've got 3 interfaces defined on your
> > machine - 192.168.16.1 which is a natted address used to connect to
> > the internet, 130.215.21.5 which is an address assigned to you while
> > you're connected to your university's VPN, and 127.0.0.1 (localhost).
> >
> > Without additional context - none of those qualify as the default
> > local interface. What generally happens is that when you ask to
> > connect to 8.8.8.8 your local address is assigned to be 192.168.16.1
> > because your Internet route will be used for 8.8.8.8.. but if you
> > ask to connect to 130.215.21.1 your local address is assigned to be
> > 130.215.21.5.. and if you want to connect to 127.0.0.1 your local
> > address is also 127.0.0.1.
> > So the remote address and the routing table matter - there really
> > isn't a default local address outside of that context.
> >
> > so in general whenever you want a local interface (and you did not
> > explicitly provide one) it can only be determined after you provide
> > the remote address and a system call is made to consult the routing
> > table.
> >
> > you specifically asked about
> > https://github.com/sysapps/raw-sockets/issues/24 .. I'm not concerned
> > about blocking IO here.. the address lookup will require a system
> > call but it's just another kernel service with a quick response.. no
> > different than gettimeofday() or something really. To me the issue is
> > really just that the concept of assigning a local address is
> > nonsensical until you have assigned the remote one.
> >
> > >>>
> >
> > [Ke-Fong]: When you bind with INADDR_ANY, that is indeed the
> > behavior: the network interface is chosen according to routing.
> > We're a bit higher level than this; having a specified default
> > network interface makes sense and would be needed.
> > Suppose my smartphone has 3G, wifi, and bluetooth; they can all act
> > as network interfaces.
> > Yet I may not want to use 3G because it's usually expensive, and
> > would use wifi instead.
>
> [Claes] So this means that for TCP a better description of the process
> for allocating a local address is needed in the case when a specific
> local address is not defined by the web application, but I don't think
> that the API itself has to change?
>
> For the case when a local interface needs to be stated by the web
> application I refer to the latest comments in
> https://github.com/sysapps/raw-sockets/issues/24, i.e. that the phase 2
> SysApps API, Network Interface API, should be used to list the
> available networks so that a specific interface can be selected by the
> application.
>
> However, your description above on how the routing table is used to set
> the local address based on the remote address to connect to works for
> TCP. What applies then to UDP? With the constructor's options argument
> we can set an optional "default remote address" for subsequent send()
> calls. So if this option is used I assume that the routing table can be
> used in the same manner as for TCP. However, what is the procedure if
> we don't set this option? Does that mean that the local address can't
> be determined until we issue a send() call?
>
> Regarding setting the localAddress attribute I assume that Patrick
> means that for TCP the localAddress attribute immediately could be set
> to the actual value during the execution of the constructor, not to
> null, even if it is not explicitly specified in the constructor.
> Correct?
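
As a rough illustration of the routing-based selection discussed above, the following Python sketch asks the operating system which local address it would pick for a given remote address. The function names are invented for this example. It relies on the fact that connect() on a UDP socket sends no packet and only fixes the peer, which makes the kernel consult its routing table. The second function shows the UDP case without a default remote address: the socket then reports only the wildcard address, because no route has been chosen yet.

```python
import socket


def local_address_for(remote_host: str, remote_port: int = 9) -> str:
    """Return the local IPv4 address the routing table selects for a remote."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # For UDP, connect() only records the peer; nothing is transmitted.
        sock.connect((remote_host, remote_port))
        return sock.getsockname()[0]


def unconnected_local_address() -> str:
    """Return the local address of a UDP socket bound without any remote."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("", 0))              # wildcard address, system-chosen port
        return sock.getsockname()[0]    # still the wildcard, no route chosen
```

On a machine like the one in Patrick's example, local_address_for("8.8.8.8") would be expected to return the address of the Internet-facing interface, while the loopback remote always maps to the loopback address.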
>
> >
> > >>>
> >
> > * "bind the socket to any available randomly selected local port" -
> > it's not clear you want to say randomly here. Sometimes local ports
> > are assigned sequentially according to availability.
> >
> > >>>
> >
> > [Ke-Fong]: The intended meaning: use the standard socket behavior,
> > i.e. bind to one of the ephemeral ports the way bind() does when
> > given port zero.
> > "Random" was rather meant as "let the system decide".
>
> [Claes] Yes, this is basically what I mean so I will improve the
> wording here.
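
The "let the system decide" behavior can be sketched in a few lines of Python (the helper name bind_ephemeral is hypothetical). Binding to port 0 asks the operating system for any free ephemeral port; whether successive ports look random or sequential is up to the system.

```python
import socket


def bind_ephemeral() -> int:
    """Bind a TCP socket to port 0 and return the port the system assigned."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))     # port 0 means "system, pick one"
        return sock.getsockname()[1]
```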
> >
> > >>>
> >
> > * I don't really understand the loopback attribute. What does it mean
> > to set it to true but connect to 8.8.8.8? What does it mean to set it
> > to false but connect to 15.15.15.15 which your OS has bound to the
> > localhost interface? What purpose does it serve at all?
> >
> > >>>
> >
> > [Ke-Fong]: Loopback is only relevant for UDP multicast. A note
> > should be added about that.
>
> [Claes] Yes, the use case for loopback is multicast. By setting the
> attribute the developer can define whether a sent multicast message
> should also be delivered back to the sending host or not. I will
> clarify that this attribute is only applicable for multicast.
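
At the BSD socket level, the closest counterpart to such a loopback attribute is probably the IP_MULTICAST_LOOP option. The Python sketch below (function name invented for illustration, IPv4 only) toggles that option on a fresh UDP socket and reads it back. It does not join a group or send anything.

```python
import socket


def multicast_loopback_after_setting(enabled: bool) -> bool:
    """Set IP_MULTICAST_LOOP on a new UDP socket and return the stored value."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP,
                        1 if enabled else 0)
        # A non-zero value means outgoing multicast datagrams are looped
        # back to local sockets that joined the group.
        return bool(sock.getsockopt(socket.IPPROTO_IP,
                                    socket.IP_MULTICAST_LOOP))
```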
> >
> > >>>
> >
> > * I don't understand the onHalfClose event.. how do you know if the
> > server called half close or if it hung up completely? (they look the
> > same on the wire)
> >
> > >>>
> >
> > [Ke-Fong]: To "really" close a TCP connection, both peers have to
> > exchange FIN packets along with ACK packets.
> > It is indeed impossible to know if the peer has called shutdown() or
> > close(), as both send a FIN packet (which is part of the four-step
> > closing handshake).
> > But in the case of shutdown(), the peer can still receive data but
> > not send, whereas close() kills its socket descriptor.
> > Just imagine a client sending a single request to the server and then
> > doing a shutdown(); it can still read the answer to its request.
> > This is a case allowed by the TCP protocol (and probably widely used);
> > it ought to be supported.
> >
> > Indeed, a host that never calls close() or halfclose() will always
> > receive an "onhalfclosed" event instead of "closed". This can be
> > confusing.
> > A "usage note" section and a state diagram of the possible states
> > along with transitions would be helpful for understanding.
>
> [Claes] I am wondering whether we really need the onHalfClose event.
> Use cases:
> 1. A web app (Client) sends a request and expects a reply. Then the
> connection should be closed. So this means send() + halfclose(). The
> server ACKs the FIN and then sends the reply to the request that is
> ACKed by the Client. Then the Server sends FIN that is ACKed by the
> Client and the close event is issued to the web application.
> 2. A server sends FIN but the Client has more data in the send buffer.
> The Client ACKs the FIN, sends the data + FIN and issues the close
> event to the web application.
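
Use case 1 can be sketched with plain sockets in Python. Here shutdown(SHUT_WR) stands in for the proposed halfclose(), and the function name, the "echo:" reply format and the loopback server are all made up for this example. The client sends its request, half-closes, and can still read the full reply.

```python
import socket
import threading


def request_with_half_close(request: bytes) -> bytes:
    """Send a request, half-close the sending side, then read the reply."""
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]

    def serve() -> None:
        conn, _peer = listener.accept()
        with conn:
            chunks = []
            while True:
                data = conn.recv(4096)
                if not data:            # empty read: the client's FIN arrived
                    break
                chunks.append(data)
            # The client can no longer send, but it can still receive.
            conn.sendall(b"echo:" + b"".join(chunks))

    worker = threading.Thread(target=serve)
    worker.start()
    reply = b""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(request)
        client.shutdown(socket.SHUT_WR)     # half close: send FIN, keep reading
        while True:
            data = client.recv(4096)
            if not data:                    # server closed its side too
                break
            reply += data
    worker.join()
    listener.close()
    return reply
```

From the server's point of view the empty read looks the same whether the client called shutdown() or close(), which is the ambiguity Patrick points out.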
>
> >
> > >>>
> >
> > * there doesn't seem to be any discussion of the nagle algorithm or a
> > mapping to TCP_NODELAY anywhere in here and it's an important topic
> > for TCP applications. I would suggest that you provide a TCP attribute
> > called sendCoalescing which defaults to false. Have the documentation
> > point out that this corresponds to the nagle algorithm, which in most
> > TCP APIs defaults to true/on, but because it is often the source of
> > performance problems we have changed the traditional default.
> > Applications that do a lot of small sends that aren't expecting
> > replies to each one (e.g. an ssh application) should enable nagle for
> > networking performance but most applications will not want to. A bit
> > more radically you could just disable nagle all the time without an
> > attribute, but if you do that the API document should really mention
> > it and the ssh client is an example of somewhere where such a config
> > is not optimal.
> >
> > >>>
> >
> > [Ke-Fong]: Ok, that needs to be addressed.
>
> [Claes] So would it be enough to have an additional Boolean field in
> the TCPOptions dictionary (type of the TCPSocket constructor's options
> argument) stating nagle true/false with default set to false?
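
One possible mapping of such a boolean option onto the socket layer is sketched below in Python. The name send_coalescing follows Patrick's suggested sendCoalescing attribute, and the helper function is hypothetical. The only real API used is the TCP_NODELAY socket option: setting it disables the Nagle algorithm, so the option is the inverse of "coalescing enabled".

```python
import socket


def coalescing_after_setting(send_coalescing: bool) -> bool:
    """Apply a sendCoalescing-style flag and report whether Nagle is on."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # TCP_NODELAY = 1 turns Nagle off, so invert the flag.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY,
                        0 if send_coalescing else 1)
        nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
        return nodelay == 0
```

With the proposed default of false, an implementation along these lines would set TCP_NODELAY on every new socket unless the application opts in to coalescing.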
>
> >
> > >>>
> >
> > * the TCP onMessage event should be called onData or something. A
> > message, at least in network parlance, is data with a preserved
> > length.. UDP is like that - if you send a 500 byte message the
> > receiver either gets 500 bytes or nothing.. but TCP is all about data
> > streams.. so if you send 500 bytes in one call the receiver could end
> > up with anywhere from the first 1 to 500 bytes in its first read and
> > TCP doesn't provide any way to tell if it is just a partial down
> > payment.. folks used to TCP APIs will be used to that - it's just the
> > term "message" is confusing.
> >
> > >>>
> >
> > [Ke-Fong]: I agree with you regarding the significance of "message"
> > in network parlance.
> > NodeJS has the "data" event instead.
> >
> > Web Worker or Web Socket specs use the onmessage callback to signify
> > that data has been received for reading.
> > Choice for using the "onmessage" name was to conform to the same kind
> > of naming.
> >
> > It can indeed be argued that in the case of websocket it is indeed a
> > message according to the websocket protocol.
> > Same for webworker, as onmessage is the result of a postMessage().
> >
> > In the case of UDPSocket, our spec is very specific that "onmessage"
> > is when a datagram (a "message") is received, so that can work. But
> > for TCPSocket, yes it doesn't make much sense and "data" would be
> > better.
> >
>
> [Claes] Ok, if there is no objection I will change the name to onData
> for TCP.
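
The practical consequence of the stream semantics is that an application that wants messages over TCP has to add its own framing. The Python sketch below shows one common convention, a 4-byte big-endian length prefix. This is an assumption made for illustration only, not something the draft proposes, and the names FrameDecoder and encode_frame are invented here.

```python
import struct
from typing import List


def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(payload)) + payload


class FrameDecoder:
    """Rebuild whole messages from arbitrarily split chunks of a byte stream."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add one received chunk and return every message now complete."""
        self._buffer += chunk
        messages = []
        while len(self._buffer) >= 4:
            (length,) = struct.unpack_from("!I", self._buffer, 0)
            if len(self._buffer) < 4 + length:
                break                       # message body not fully received
            messages.append(bytes(self._buffer[4:4 + length]))
            del self._buffer[:4 + length]
        return messages
```

Each chunk delivered by a data event would be passed to feed(), which may return zero, one or several complete messages regardless of how the stream was split in transit.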
>
> > >>>
> >
> > * While we're talking nomenclature please don't use the term "raw"
> > anywhere in this document. That is a well known networking term and
> > it doesn't mean access to TCP and UDP interfaces - this was brought
> > up to me by several folks at the IETF meeting who were confused about
> > the applicability of this spec because of the use of the term raw
> > sockets. (raw sockets generally give access to ethernet level framing
> > in normal networking parlance) These are "transport level socket
> > interfaces" or "tcp/udp sockets" or so on..
> >
> > >>>
> >
> > [Ke-Fong]: I agree with you. "Transport level socket" would be the
> > most appropriate term.
>
> [Claes] Ok, yes I can see that "Transport level socket API" would
> probably be a better name but I guess that there will be many options
> on the appropriate name for this API :-).
>
> >
> > >>>
> >
> > * for the server socket API it should be called "onAccept" instead of
> > "onConnect" to match the commonly understood sockets API - accept()
> > is the system call you use to take an incoming connection. There
> > doesn't seem to be a compelling reason to invent new lingo for well
> > understood operations.
> >
> > >>>
> >
> > [Ke-Fong]: I'd rather think "onconnect" is good enough. It can match
> > other W3C's APIs and it is the correct term (contrary to "onmessage").
> > onaccept would definitely ring a bell for seasoned socket programmers,
> > but I'd rather think we should keep onconnect.
>
> [Claes] I don't have a strong opinion here but listening to Jonas and
> Ke-Fong "connect" seems to have slightly more support than "accept" so
> I propose to stay with "connect".
>
> >
> > >>>
> >
> > * the server socket API doesn't need an onOpen event.. there is
> > nothing that happens in between the constructor and onOpen that could
> > block
> >
> > >>>
> >
> > [Ke-Fong]: socket() + bind() + listen() have "immediate" non blocking
> > effect indeed.
>
> [Claes] The reason for the open event for TCPServerSocket and
> UDPSocket was the concern that Jonas expressed in
> https://github.com/sysapps/raw-sockets/issues/24 but if there are no
> time-consuming actions, e.g. allocating a local interface, we should
> remove the open event.
>
> >
> > >>>
> >
> > * some folks will question why the server socket API doesn't contain
> > a backlog attribute that corresponds to the listen() system call that
> > is traditionally part of the socket API
> >
> > >>>
> >
> > [Ke-Fong]: Yes, it should be added.
>
> [Claes] If I remember correctly it was said that the maximum number of
> connection requests that can be queued before the system starts
> rejecting incoming requests should not be controllable by web
> applications. But this might be a mistake so the attribute can easily
> be added if required.
> >
> > >>>
> >
> > * on security - we need to think about this a little harder. What
> > does it mean to be priv'd enough to use this API? Simply being an
> > installed app or being an audited/signed one? The security
> > implications are pretty staggering here and I'm pretty sure the
> > answer needs to be more than "unprivd js off a webpage can't do
> > this". Our users' privacy is pretty much undermined by allowing
> > this.. I know this is desired as a backwards looking bridge, but the
> > truth is it brings new functionality to the mobile platform and that
> > platform ought to at least be dealing only in TLS and DTLS as table
> > stakes.. While I think TLS and DTLS ought to be mandatory - at the
> > very least they ought to be possible and it doesn't really look like
> > that use case has been fully baked into the API yet.
> >
> > >>>
> >
> > [Ke-Fong]: I'm supposed to help write the proposal in that regard.
> > The problem is the TLS and DTLS specs are rather complicated.
> > And there are quite a few questions about what to include or not.
> >
> > Having basic client capabilities (to allow secure connection to SMTPS
> > or HTTPS, etc) will be pretty easy.
> > The problem is mostly the server side.
>
> [Claes] Patrick, please see the discussion at
> https://github.com/sysapps/raw-sockets/issues/10 .
> >
> > >>>
> >
> > * I guess I'm also concerned about TCPSocket.send().. the definition
> > of it says that if it exceeds an internal buffer of unknowable size
> > it must close the socket and throw an error. How can an application
> > use that safely if it doesn't know what value will overrun the socket
> > and trigger the exception and a close?
> >
> > Rather than the true/false semantic being used as a return value here
> > (which requires the whole send be buffered) it would be traditional
> > to let the send accept 0->N bytes of the N bytes being sent and have
> > that (0->N value) be the return code. Partial sends are part and
> > parcel of stream APIs. That way if I have 4MB to send but you've only
> > got 1MB of buffers I don't have to magically guess that - I do a 4MB
> > write, 1MB gets buffered - 1MB is returned and I come back later to
> > try and write the next 3MB. (either immediately which probably
> > returns 0 or after an ondrain event).
> >
> > >>>[Ke-Fong]:
> >
> > Closing on buffer overflow is too much, and should be changed to
> > a (transient) error instead.
> > At the previous F2F meeting, this matter was discussed and it was
> > agreed that the buffer should have an implementation-defined size.
> > I agree with that, except that the spec should probably specify a
> > minimum.
> >
> > The goal was to make the API easier to use. Partial sends (and reads)
> > are indeed everyday life with stream APIs. Yet, this is a JavaScript
> > API, not low-level C.
> > The kernel is buffering the send anyway; send() means "queue data to
> > the socket's kernel-side write buffer" rather than "send and return
> > control to me when you've done it".
> > So it's better to let the API do the buffering, especially when
> > you'll have to handle it with ArrayBuffer otherwise. JavaScript is
> > not that good at dealing with raw binary data.
> > Also, JavaScript functions must not block.
> >
> > NodeJS's network API works in a somewhat similar way, and it has a
> > proven track record.
> >
> > For all these reasons, I believe letting the API do the buffering +
> > ondrain when the buffer is flushed is the right way to proceed.
> > The current specification is perhaps a little vague, and the ondrain
> > part can be enhanced.
>
> [Claes] I don't have much to add to what Jonas and Ke-Fong say, but as
> editor I welcome tangible proposals for wording to clarify the vague
> parts.
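
The two positions can be related with a small model, written in Python and not tied to any real socket. The class name BufferedSender and its methods are invented for this sketch. The raw_send callable behaves like the traditional partial send Patrick describes (it returns how many of the offered bytes it accepted), while the wrapper provides the buffering plus drain notification that Ke-Fong argues for. The boolean returned by send() is one possible reading of the true/false semantic, not the draft's definition.

```python
from typing import Callable, Optional


class BufferedSender:
    """Queue unsent bytes on top of a partial-send primitive."""

    def __init__(self, raw_send: Callable[[bytes], int],
                 on_drain: Optional[Callable[[], None]] = None) -> None:
        self._raw_send = raw_send       # accepts 0..N of the N bytes offered
        self._on_drain = on_drain
        self._pending = bytearray()

    def send(self, data: bytes) -> bool:
        """Queue data; return True only if nothing is left buffered."""
        self._pending += data
        self._flush()
        return not self._pending

    def on_writable(self) -> None:
        """Call when the transport can take more data; may fire on_drain."""
        had_pending = bool(self._pending)
        self._flush()
        if had_pending and not self._pending and self._on_drain:
            self._on_drain()

    def _flush(self) -> None:
        while self._pending:
            accepted = self._raw_send(bytes(self._pending))
            if accepted == 0:           # transport is full for now
                break
            del self._pending[:accepted]
```

An application would stop producing data when send() returns False and resume in the drain callback. A real implementation would also need an upper bound on the pending buffer, which is where the question of a specified minimum size comes in.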
> >
> > >>>
Received on Friday, 30 August 2013 14:52:55 UTC