RE: XHR LC comments from Sunava Dutta on 2008-05-05 (public-webapi@w3.org from May 2008)

From: Sunava Dutta <sunavad@windows.microsoft.com>
Date: Mon, 5 May 2008 14:58:35 -0700
To: "public-webapi@w3.org" <public-webapi@w3.org>
CC: IE8 Core AJAX SWAT Team <ieajax@microsoft.com>
Message-ID: <083D18C6B9B71F4CBCA7B76D97B7483102C85D8DFE@NA-EXMSG-W601.wingroup.windeploy.ntd>
Julian Reschke wrote:
a) I'm confused about the approach to this document. On the one hand,
we're being told that it can't define anything not already in use (and
that new stuff belongs into XHR2), on the other hand it relies on HTML5,
which is a moving target. It's good that this stuff is being written
down, but if it relies on HTML5, I'd propose to consider other
publication options.

>>+1. I had suggested something along the lines of not linking to other specifications that are moving targets but other publication options if we do decide to go this route are fine.

"Ensure all new entities like constants, flags etc are versioned or in a new object.
The draft frequently cross references specifications in the W3C.For example, the spec makes references to the DOM 3 events and core, namespaces in XML, Window Object 1.0 etc (Some of which are drafts themselves). We fail to see the value in implicitly embedding other large specs. Simplicity and standing on its own would be good."

Julian Reschke wrote:

b) Algorithms: the spec uses a method to describe algorithms that IMHO
is extremely hard to read (see for instance send() method). This may be
good for implementors, but seems to be bad for everybody else.
Minimally, the lists should be structured for better readability.

>>+1

Julian Reschke wrote:
c)
"- TRACK??? There's probably a rational for that. If there is, please
include it in the spec."

>>TRACK is unsafe and should be removed. I remember reading about this awhile back. Here's something I found on the web: http://www.aqtronix.com/Advisories/AQ-2003-02.txt

Julian Reschke wrote:
d)
""If the user argument was not omitted and is not null let stored user be
user  encoded using the encoding specified in the relevant
authentication scheme or UTF-8 if the scheme fails to specify an encoding."

Why is XHR talking about the encoding here? Is "stored user" a string or
a byte array?

(same for password)"


>>+1. I'm not quite sure what this means and the relevancy.

Julian Reschke wrote:
e)
"For security reasons, these steps should be terminated if the header
argument case-insensitively matches one of the following headers:

     * Accept-Charset
     * Accept-Encoding
     * Connection
     * Content-Length
     * Content-Transfer-Encoding
     * Date
     * Expect
     * Host
     * Keep-Alive
     * Referer
     * TE
     * Trailer
     * Transfer-Encoding
     * Upgrade
     * Via "

It's unclear why there's a security reason not to allow things like
"Accept-Charset" or "Accept-Encoding". Please explain."

>>+1. I've given this feedback before but haven't heard back anything. We should mention why these are bad and also consider what is currently allowed today by UA's.
http://lists.w3.org/Archives/Public/public-webapi/2008Apr/0191.html




-----Original Message-----
From: public-webapi-request@w3.org [mailto:public-webapi-request@w3.org] On Behalf Of Julian Reschke
Sent: Sunday, May 04, 2008 2:47 AM
To: public-webapi@w3.org
Subject: XHR LC comments


Review of <http://www.w3.org/TR/2008/WD-XMLHttpRequest-20080415/>.

General points:

a) I'm confused about the approach to this document. On the one hand,
we're being told that it can't define anything not already in use (and
that new stuff belongs into XHR2), on the other hand it relies on HTML5,
which is a moving target. It's good that this stuff is being written
down, but if it relies on HTML5, I'd propose to consider other
publication options.

b) Algorithms: the spec uses a method to describe algorithms that IMHO
is extremely hard to read (see for instance send() method). This may be
good for implementors, but seems to be bad for everybody else.
Minimally, the lists should be structured for better readability.

c) Structure: It would be nice if Section 4 had more structure. Right
now it's ugly to navigate and refer to.



2.1 Dependencies

"DOM

     A conforming user agent must support some subset of the
functionality defined in DOM Events and DOM Core that this specification
relies upon. [DOM2Events] [DOM3Core]"

That reads a bit strange. Must the subset be non-empty?


2.2 Terminology

"Two URIs are same-origin if after performing scheme-based normalization
on both URIs as described in section 5.3.3 of RFC 3987 the scheme, ihost
and port components are identical. If either URI does not have an ihost
component the URIs must not  be considered same-origin. [RFC3987]"

Why are we referring to the IRI spec (RFC3987) when talking about URIs,
as defined RFC3986?


3. Security Considerations

"Apart from requirements affecting security made throughout this
specification implementations may, at their discretion, not expose
certain headers, such as HttpOnly cookies."

"...such as headers containing HttpOnly cookies".

Besides that: this may be a non-optimal example unless we can point to a
definition of "HttpOnly cookies". Can we?


4. The XMLHttpRequest Object

"onreadystatechange of type EventListener

     This attribute is an event handler DOM attribute and must be
invoked whenever a readystatechange event is targated at the object."

s/targated/targeted/

"If stored method case-insensitively matches  CONNECT, DELETE, GET,
HEAD, OPTIONS POST, PUT, TRACE, or TRACK let stored method be the
canonical uppercase form of the matched method name."

- missing comma after OPTIONS
- TRACK??? There's probably a rational for that. If there is, please
include it in the spec.


"If the user argument was not omitted and is not null let stored user be
user  encoded using the encoding specified in the relevant
authentication scheme or UTF-8 if the scheme fails to specify an encoding."

Why is XHR talking about the encoding here? Is "stored user" a string or
a byte array?

(same for password)


"Abort the send() algorithm, set response entity body to "null" and
reset the list of request headers."

This is a very confusing statement, until one realizes that it's in the
description of "open", not "send". It would be good to rephrase this so
it becomes clear that this aborts a *previous* send request.



"If the value argument is null terminate these steps. (Do not raise an
exception.)."

This makes it impossible to set empty headers, which are allowed in
HTTP. Even worse, it silently fails.

This is profiling of the base spec, and I don't see any compelling
reason to do so. Don't do it.


"For security reasons, these steps should be terminated if the header
argument case-insensitively matches one of the following headers:

     * Accept-Charset
     * Accept-Encoding
     * Connection
     * Content-Length
     * Content-Transfer-Encoding
     * Date
     * Expect
     * Host
     * Keep-Alive
     * Referer
     * TE
     * Trailer
     * Transfer-Encoding
     * Upgrade
     * Via "

It's unclear why there's a security reason not to allow things like
"Accept-Charset" or "Accept-Encoding". Please explain.


General comment on "setRequestHeader(header, value), method": the way it
is specified makes it impossible for a client to reliably set headers.
We need a way to either retrieve the current value for inspection, or a
way to reset the header. Or both.


"If stored method is GET act as if the data argument is null."

Another case of HTTP/1.1 being profiled. Don't do it.


"Serialize data into a namespace well-formed XML document and encoded
using the encoding given by data.inputEncoding, when not null, or UTF-8
otherwise. Or, if this fails because the Document cannot be serialized
act as if data  is null."

Silent failure????


"If no Content-Type header has been set using setRequestHeader() append
a Content-Type header to the list of request headers with a value of
application/xml;charset=charset  where charset is the encoding used to
encode the document."

This will result in an invalid Content-Type header if the UA has
initialized the headers with a default (which I think the spec currently
allows; and at least one UA was reported to do). See comment above about
header handling.


"While downloading the resource the following rules are to be observed."

That reads strange. HTTP requests do not "download" resources. Say
"while executing the request".


"If the user agent supports HTTP State Management it should persist,
discard and send cookies (as received in the Set-Cookie and Set-Cookie2
response headers, and sent in the Cookie header) as applicable. [RFC2965]"

This should probably include a reference to the Set-Cookie (not
Set-Cookie2) spec as well (RFC2109).


"If the user agent implements server-driven content-negotiation it
should set Accept-Encoding and Accept-Charset headers as appropriate; it
must not automatically set the Accept."

s/set the Accept/set the Accept header/


"Responses to such requests must have the content-encodings
automatically decoded. [RFC2616]"

"Such requests" is a bit fuzzy here. Please rephrase in terms of what
the response contains (such as Content-Encoding header being present etc).


"// The following script:
var client = new XMLHttpRequest();
client.open("GET", "test.txt", true);
client.send();
client.onreadystatechange = function() {
  if(this.readyState == 3) {
   print(this.getAllResponseHeaders());
  }
}

// ...should output something similar to the following text:
Date: Sun, 24 Oct 2004 04:58:38 GMT
Server: Apache/1.3.31 (Unix)
Keep-Alive: timeout=15, max=99
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/plain; charset=utf-8"

I think examples like this would be more readable (and take less space)
when using the syncr. mode.


"status of type unsigned short, readonly

On getting, if available, it must return the HTTP status code sent by
the server (typically 200 for a successful request). Otherwise, if not
available, the user agent must raise an INVALID_STATE_ERR exception."

This may be incorrect when the UA caches (304 vs 200).


"statusText of type DOMString, readonly

     On getting, if available, it must return the HTTP status text sent
by the server (appears after the status code). Otherwise, if not
available, the user agent must raise an INVALID_STATE_ERR exception."

Really? It seems to me that if somebody really implements this, clients
are likely to break. Why not allow an empty string here?


Finally, my main other issue with this spec that it is silent about the
recommended behaviour for unsafe methods, about which RFC2616 says in
Section 9.1.1
(<http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.9.1.1>):

"Implementors should be aware that the software represents the user in
their interactions over the Internet, and should be careful to allow the
user to be aware of any actions they might take which may have an
unexpected significance to themselves or others.

In particular, the convention has been established that the GET and HEAD
methods SHOULD NOT have the significance of taking an action other than
retrieval. These methods ought to be considered "safe". This allows user
agents to represent other methods, such as POST, PUT and DELETE, in a
special way, so that the user is made aware of the fact that a possibly
unsafe action is being requested."

Thus, allowing a web page to submit a PUT, POST or DELETE request
without user interaction seems to be a very dangerous thing to me, and
the spec should point that out (see also
<http://ietf.osafoundation.org:8080/bugzilla/show_bug.cgi?id=237>).
Received on Monday, 5 May 2008 21:59:32 UTC