Re: [JSPreflight] - First Draft of JavaScript Preflight Injection online from Reitbauer, Alois on 2013-09-02 (public-web-perf@w3.org from September 2013)

From: Reitbauer, Alois <Alois.Reitbauer@compuware.com>
Date: Mon, 2 Sep 2013 12:31:26 +0000
To: Ilya Grigorik <igrigorik@google.com>
CC: Chase Douglas <chase@newrelic.com>, "public-web-perf@w3.org" <public-web-perf@w3.org>
Message-ID: <CE4A25A2.23FAA%alois.reitbauer@compuware.com>

[Alois] If it were that simple :-). In the general case you are right and that's what we do today. However, there are tons of cases where this breaks the page. I can provide some real world examples where this gets really tricky to impossible.

Please do, curious...

[Alois]

There are a lot of examples, and we needed to build an extensive rule set to cover this seemingly simple problem. First of all, you have to wait until after all meta tags. Executing JavaScript before meta tags might break the page. Another problem is base tags. You have to wait until after the base tag, as it would not work any more otherwise. This means you have to block flushing the content until you have reached a point where you are sure that these tags cannot occur anymore.

This doesn't explain why you need to come after meta tags - do you intend to rewrite all following resources? If so, you may want to take a look at Navigation Controller<https://github.com/slightlyoff/navigationcontroller> instead (and note that even there, we explicitly do not block on first load of controller.. exactly for the reasons I've raised earlier).

[Alois]

There are two separate problems here. One is with META tags, the other with BASE tags.

META Tags:

There are certain meta tags where JavaScript execution before the tag breaks the page. An example is <meta http-equiv="X-UA-Compatible" content="IE=7" />. If you execute JavaScript before it, the page most likely breaks. The same is true for meta tags which define character sets.

BASE Tags:

A JavaScript block before a BASE tag in many cases leads to a situation where the base tag gets applied incorrectly. The effect is simply that the wrong URLs are used to load resources.
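To illustrate the "wrong URLs" effect, here is a minimal sketch that resolves the same relative reference against two different bases. It does not reproduce any browser's parsing behaviour. It only assumes that a resource requested before the base tag has been seen is resolved against the document URL instead. Both URLs and the file name are made-up examples.

```javascript
// Made-up example values, not taken from any real page.
const documentUrl = 'https://www.example.com/shop/index.html';
const baseHref = 'https://cdn.example.com/assets/'; // what <base href> would declare
const relative = 'js/app.js';

// Resolved before the base tag is known: falls back to the document URL.
const withoutBase = new URL(relative, documentUrl).href;

// Resolved after the base tag is known: uses the declared base.
const withBase = new URL(relative, baseHref).href;

console.log(withoutBase);
console.log(withBase);
```

The two results point at different hosts and paths, which is the kind of mismatch described above.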

This approach is also not applicable to the case where the page cannot be loaded.

Correct. I do agree that we need an error reporting mechanism, but (in my opinion), this proposal couples error reporting with a much heavier and unnecessary injection mechanism - we're adding another cookie mechanism, a full blocking JS resource, and a bunch of other complications. Which is why I'm suggesting a simpler, CSP-like implementation: on error, beacon some predefined error report to provided URL.

[Alois] The CSP case is very different from performance monitoring. How would this be able to cover single-page web apps?

Once again, I think this proposal mixes two completely distinct use cases:

#1 Error Logging
We want the client to beacon the NavTiming data if the navigation fails (e.g. DNS lookup succeeded, TCP handshake timed out). In this instance, simply beaconing the NavTiming object to some endpoint would be enough - caveat: if TCP failed, how do I even know where to beacon it? This is where we either get into some persistent settings (e.g. first you would have to have a successful navigation to get this instruction, and then persist it (ugh)), or we're into store-and-send-later type of semantics. In any case, this gets complicated quickly.. and I'm skeptical that new 'cookie-like' mechanisms are going to get us far -- this opens privacy holes, etc. There is just not much appetite for that nowadays. But.. this would be great to solve.

[Alois]
I do not think we need persistence in case of a TCP failure. This might just be your bad Starbucks WiFi. Additionally, broader connectivity failures for your datacenter will be covered by synthetic availability monitoring. RUM will never replace this completely.
Why do you think the cookie mechanism is not helpful in client-side processing of RUM data?
Which security holes do you mean specifically?

Also, if possible, I would also think about "incomplete" loads and not just errors - e.g. I click on a link, the new page is rendering, the user navigates back before onload is fired. In many cases, if you're following best practices and doing async load / deferring your analytics, you'll miss this pageview (bounce) and RUM sample. To "fight this" some large sites still put their analytics in the head (ugh) - this is a big problem. I don't have hard numbers, but my intuition tells me that this may actually be a larger problem than failed connections.

[Alois]
Actually, web site owners want to understand incomplete page loads. If, for example, some ads did not load or the user cancelled after a certain time, this is important information. If an analytics solution does not cover these cases, it is incomplete. Pushing the content into the head is therefore not a problem; it is necessary to get complete analytics data. In the above case this requires registering onbeforeunload event handlers. Web monitoring is not just about performance, and people are willing to take some performance penalty for getting proper data.

#2 Instrumenting client-code (single page apps, etc).
This is completely orthogonal to error logging, and I think this should be split into a completely different discussion.

First, there is no reason why your script needs to block all others -- in this day and age of fighting against blocking resources, this is also a deal breaker. For example, consider GA, which has to deal with this very problem: the client code may be loaded first and it may issue GA commands before GA is loaded. The solution is simple.. create a "_gaq" command queue object and push commands in there. Then, once GA loads, it inspects the object and does what it needs to do -- convention works.
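A minimal sketch of the command queue convention described above. This is not the actual GA implementation: the names `_cmdq`, `installAnalytics`, and the command names are hypothetical, and a plain object stands in for `window` so the example runs outside a browser.

```javascript
// Hypothetical names throughout; this is not the real GA API.
const globalScope = {}; // stands in for `window`

// 1. Page code runs first. Until the library loads, the queue is a plain array.
globalScope._cmdq = globalScope._cmdq || [];
globalScope._cmdq.push(['trackPageview', '/home']);
globalScope._cmdq.push(['trackEvent', 'nav', 'click']);

// 2. The library loads later (asynchronously in a real page). It swaps the
//    array for an object whose push() executes commands immediately, then
//    replays the backlog in the order it was queued.
function installAnalytics(scope, handlers) {
  const backlog = Array.isArray(scope._cmdq) ? scope._cmdq : [];
  const run = ([name, ...args]) => {
    const handler = handlers[name];
    if (handler) handler(...args); // unknown commands are ignored
  };
  scope._cmdq = { push: (...commands) => commands.forEach(run) };
  backlog.forEach(run);
}

// 3. Handlers here only record what they were asked to do.
const seen = [];
installAnalytics(globalScope, {
  trackPageview: (path) => seen.push(`pageview ${path}`),
  trackEvent: (category, action) => seen.push(`event ${category}/${action}`),
});

// Commands pushed after load go through the same interface and run right away.
globalScope._cmdq.push(['trackEvent', 'form', 'submit']);
console.log(seen);
```

The page code never needs to know whether the library has loaded yet; it only relies on the agreed `push` interface.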

[Alois]

This means that GA misses every event before the GA code is loaded. As you cannot guarantee when this is - because of network connectivity, for example - you cannot make a clear statement about which user interactions you can track and which you cannot. We are constantly confronted with the claim of 100% coverage; this cannot be guaranteed by an approach as described above.

Even the command queue approach requires some JavaScript to be executed early on as well as page modification. The inability to modify the page is one of the key use cases for the spec. You are not addressing this with your approach.

Similarly, it seems like if the intent is to instrument JS frameworks like Angular, Ember, JQuery, then this is a conversation that should be had with all of them to see if there is some convention or shared API that can be used to emit common events: startup events, user initiated events, etc. This would solve the problem in a much better way: any vendor could then pop the events from the queue and apply its logic. No blocking resources, no need to hijack and rewrite client-libraries on the fly, and an API that everyone can use.

[Alois]

I agree that this is indeed helpful. We had conversations with some framework developers and added hooks. In general they are, however, hesitant, as this increases the complexity of their frameworks. Events are not necessarily enough here; you have to wrap callback functions etc. to reconstruct the user behaviour from your monitoring data. An example is click-path analysis for single page apps as well as analysing functional errors in browsers.
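A minimal sketch of what wrapping a callback could look like, assuming a hypothetical `instrument` helper and record shape. It is not taken from any vendor's agent or framework hook. It only shows how each invocation, its outcome, and its duration might be captured while the original behaviour passes through unchanged.

```javascript
// Hypothetical helper and record shape, for illustration only.
const records = [];

// Wrap a callback so every call is recorded with its label, outcome and
// duration. Return values and thrown errors are passed through unchanged.
function instrument(label, callback) {
  return function instrumented(...args) {
    const start = Date.now();
    try {
      const result = callback.apply(this, args);
      records.push({ label, ok: true, ms: Date.now() - start });
      return result;
    } catch (err) {
      records.push({ label, ok: false, error: String(err), ms: Date.now() - start });
      throw err;
    }
  };
}

// Usage: wrap handlers before they are registered with a framework.
const onAddToCart = instrument('click:add-to-cart', (qty) => qty * 2);
const onCheckout = instrument('click:checkout', () => {
  throw new Error('cart is empty');
});

onAddToCart(3);
try {
  onCheckout();
} catch (_) {
  // The page's own error handling would run here.
}

console.log(records.map((r) => `${r.label} ok=${r.ok}`));
```

The ordered records give a click path plus the functional errors along it, which plain events emitted by the framework would not necessarily provide.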

</hand waving>

Q: Have you approached any of the large frameworks to discuss anything like this?

ig

Received on Monday, 2 September 2013 12:32:04 UTC