Re: [JSPreflight] - First Draft of JavaScript Preflight Injection online from Reitbauer, Alois on 2013-09-09 (public-web-perf@w3.org from September 2013)

From: Reitbauer, Alois <Alois.Reitbauer@compuware.com>
Date: Mon, 9 Sep 2013 09:45:13 +0000
To: Ilya Grigorik <igrigorik@google.com>
CC: Chase Douglas <chase@newrelic.com>, "public-web-perf@w3.org" <public-web-perf@w3.org>
Message-ID: <CE5360AD.2450A%alois.reitbauer@compuware.com>

There are two separate problems here. One is with META tags, the other with BASE tags.
META Tags:
There are certain meta tags where JavaScript execution before the tag breaks the page. An example is <meta http-equiv="X-UA-Compatible" content="IE=7" />. If you execute JavaScript before it, the page most likely breaks. The same is true for meta tags which define character sets.

BASE Tags:
A JavaScript block before a BASE tag in many cases leads to a situation where the base tag gets applied incorrectly. The effect is simply that the wrong URLs are used to load resources.

So if I understand you correctly, you want to inject a script to monkey-patch all other scripts? That would enable you to track RUM metrics? This seems like a disaster waiting to happen.

[Alois] Your comment is out of context here. The question was why it is hard to inject a script into a page at a specific position. I gave examples of when this is a problem. I don't get the disaster piece. Please, let's communicate like engineers.

This approach is also not applicable to the case where the page cannot be loaded.

Correct. I do agree that we need an error reporting mechanism, but (in my opinion), this proposal couples error reporting with a much heavier and unnecessary injection mechanism - we're adding another cookie mechanism, a full blocking JS resource, and a bunch of other complications. Which is why I'm suggesting a simpler, CSP-like implementation: on error, beacon some predefined error report to provided URL.
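
As a rough illustration of the "on error, beacon a predefined report to a provided URL" idea, the following is a minimal sketch only. The report field names, the `buildErrorReport` and `reportNavigationError` helpers, and the example URLs are all assumptions made for illustration and are not part of any proposal in this thread; only the `domainLookupStart`/`domainLookupEnd`/`connectStart`/`connectEnd` attribute names come from Navigation Timing. The transport is passed in as a function so the sketch also runs outside a browser.

```javascript
// Hypothetical report builder; the report layout is illustrative only.
function buildErrorReport(documentUri, timing, reason) {
  return JSON.stringify({
    'nav-error-report': {
      'document-uri': documentUri,
      'reason': reason,
      // Derived from Navigation Timing style attributes.
      'dns-ms': timing.domainLookupEnd - timing.domainLookupStart,
      'connect-started': timing.connectStart > 0,
      'connect-finished': timing.connectEnd > 0
    }
  });
}

// The send function is injected; a browser could supply its own
// fire-and-forget transport here.
function reportNavigationError(send, reportUri, documentUri, timing, reason) {
  send(reportUri, buildErrorReport(documentUri, timing, reason));
}

// Usage with a fake transport that just records what would be sent.
var sent = [];
reportNavigationError(
  function (uri, body) { sent.push({ uri: uri, body: body }); },
  'https://example.com/report',   // assumed report endpoint
  'https://example.com/page',     // assumed failing navigation
  { domainLookupStart: 10, domainLookupEnd: 45, connectStart: 45, connectEnd: 0 },
  'tcp-timeout'
);
console.log(sent[0].body);
```

The sketch does not answer the open question raised in this thread of how the client learns the report URL when the very first connection fails.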

[Alois] The CSP case is very different from performance monitoring. How would this be able to cover single-page web apps?

Once again, I think this proposal mixes two completely distinct use cases:

#1 Error Logging
We want the client to beacon the NavTiming data if the navigation fails (e.g. DNS lookup succeeded, TCP handshake timed out). In this instance, simply beaconing the NavTiming object to some endpoint would be enough - caveat: if TCP failed, how do I even know where to beacon it? This is where we either get into some persistent settings (e.g. first you would have to have a successful navigation to get this instruction, and then persist it (ugh)), or we're into store-and-send-later type of semantics. In any case, this gets complicated quickly.. and I'm skeptical that new 'cookie-like' mechanisms are going to get us far -- this opens privacy holes, etc. There is just not much appetite for that nowadays. But.. this would be great to solve.

[Alois]
I do not think we need persistence in case of a TCP failure. This might just be your bad Starbucks WiFi. Additionally, broader connectivity failures for your datacenter will be covered by synthetic availability monitoring. RUM will never replace this completely.

Why not? It certainly could, perhaps even should. If some local provider has a misconfigured or compromised DNS that's breaking my site, it'd be nice to know about it. I can't rely on synthetic testing node in every network out there. (Just playing devil's advocate)

[Alois] So far we decided to leave this out of Navigation Error Logging version 1 to keep it simple. I however see your point.

Why do you think the cookie mechanism is not helpful in client-side processing of RUM data?
Which security holes do you mean specifically?

It's yet another vector to fingerprint the user; it's a huge performance liability and a blatant SPOF, etc.

[Alois] The fingerprinting here is not more than what is possible today. This approach actually reduces SPOF, as the page will still be loaded if the script fails to load. Today the whole page would break, so this is an improvement.

If such mechanism existed today, I would immediately put up a "best practice" to avoid it like the plague. As I said earlier, in the world where we're trying to reduce RTT's and latency is the bottleneck, these blocking scripts are an anti-pattern.

[Alois] So your problem is the blocking behaviour. What if we made the script non-blocking? This will come with certain drawbacks, but would at least be an improvement. As we can assume that all other Web Perf standards will be available in these browsers, some of the early injection requirements will go away.

And I'm still of the mind that you should be able to achieve what you're after without making them be a blocking resource. Just have the site include your script, and do your stuff.

[Alois] The problem is that including a script in a page you do not control is not possible. Monitoring of Third Party apps is not possible with an add-a-script approach.

Also, if possible, I would also think about "incomplete" loads and not just errors - e.g. I click on a link, the new page is rendering, the user navigates back before onload is fired. In many cases, if you're following best practices and doing async load / deferring your analytics, you'll miss this pageview (bounce) and RUM sample. To "fight this" some large sites still put their analytics in the head (ugh) - this is a big problem. I don't have hard numbers, but my intuition tells me that this may actually be a larger problem than failed connections.

[Alois]
Actually, web site owners want to understand incomplete loads of pages. If some ads, for example, did not load, or the user cancelled after a certain time, this is important information. If an analytics solution does not cover these cases, it is incomplete.

I think we're saying the same thing here.

[Alois] At least some agreement ;-)

Pushing the content in the head therefore is not a problem; it is necessary to get complete analytics data.

It is a problem. Analytics shouldn't block rendering. The fact that we can't achieve this today is a bug - that's what we need to solve, and without introducing more blocking behaviors.

[Alois] So what is your proposal for this bug? I am happy to learn about alternatives.

#2 Instrumenting client-code (single page apps, etc).
This is completely orthogonal to error logging, and I think this should be split into a completely different discussion.

First, there is no reason why your script needs to block all others -- in this day and age of fighting against blocking resources, this is also a deal breaker. For example, consider GA, which has to deal with this very problem: the client code may be loaded first and it may issue GA commands before GA is loaded. The solution is simple.. create a "_gaq" command queue object and push commands in there. Then, once GA loads, it inspects the object and does what it needs to do -- convention works.

[Alois]

This means that GA misses every event before the GA code is loaded. As you cannot guarantee when this is - because of network connectivity for example - you cannot make a clear statement which user interactions you can track and which not. We are constantly confronted with the claim of 100% coverage, this cannot be guaranteed by an approach as described above.

No, it does not. It's a simple convention:

var _gaq = _gaq || [];
_gaq.push(['_trackPageview']);

You don't need the script to be loaded to register the pageview event. The GA script, once loaded looks for _gaq and does its work. This works today just fine: https://developers.google.com/analytics/devguides/collection/gajs/

The only caveat is the case where ga.js is not loaded in-time, and that's the case we're discussing above.
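
To make the convention concrete, here is a minimal sketch of how a late-loading library could consume such a queue. This is not the actual ga.js implementation; `commands`, `execute`, `installLibrary`, and the recorded strings are hypothetical names used only for illustration. It shows commands pushed before the library exists being replayed on load, and later pushes being executed immediately.

```javascript
// Page code, possibly run long before the analytics library arrives.
var _gaq = _gaq || [];
_gaq.push(['_trackPageview']);
_gaq.push(['_trackEvent', 'video', 'play']);

// Hypothetical handlers; a real library would send beacons here.
var handled = [];
var commands = {
  _trackPageview: function () { handled.push('pageview'); },
  _trackEvent: function (category, action) {
    handled.push('event:' + category + ':' + action);
  }
};

// Run a single queued command of the form [name, arg1, arg2, ...].
function execute(command) {
  var name = command[0];
  var args = command.slice(1);
  if (typeof commands[name] === 'function') {
    commands[name].apply(null, args);
  }
}

// On library load: replay everything queued so far, then hand back an
// object whose push() runs new commands immediately.
function installLibrary(queue) {
  queue.forEach(execute);
  return { push: function (command) { execute(command); } };
}

_gaq = installLibrary(_gaq);
_gaq.push(['_trackEvent', 'video', 'pause']);

console.log(handled);
// → [ 'pageview', 'event:video:play', 'event:video:pause' ]
```

Nothing is lost as long as the library eventually loads; if it never loads, the queued commands simply stay in the array, which is the caveat mentioned above.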

[Alois] This is what I meant.

Even the command queue approach requires some JavaScript to be executed early on as well as page modification. The inability to modify the page is one of the key use cases for the spec. You are not addressing this with your approach.

No.. All you need to do is check if _gaq already exists, and if not, just initialize it to an empty array.

[Alois] You still need to add:

var _gaq = _gaq || [];
_gaq.push(['_trackPageview']);

How do you do this if you cannot modify the page?

I agree that this is indeed helpful. We had conversations with some framework developers and added hooks. In general they are, however, hesitant, as this increases the complexity of their frameworks. Events are not necessarily enough here; you have to wrap callback functions etc. to reconstruct the user behaviour from your monitoring data. An example is click-path analysis for single page apps as well as analysing functional errors in browsers.

Ok, maybe I'm missing the point, but why is this impossible to achieve without a blocking resource?

[Alois] The main reason is that framework functions need to be patched before there are listeners attached. This means the earlier the better.
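
The ordering problem can be sketched as follows. The `framework` object, `patchFramework`, and the `observed` log are hypothetical and stand in for any library that lets page code register callbacks; this is an illustration under those assumptions, not how a particular monitoring product works. A listener attached before the patch is never wrapped, so the monitoring code does not see it fire.

```javascript
// Hypothetical minimal "framework" with a listener registration function.
var framework = {
  listeners: {},
  on: function (eventName, callback) {
    (this.listeners[eventName] = this.listeners[eventName] || []).push(callback);
  },
  trigger: function (eventName, payload) {
    (this.listeners[eventName] || []).forEach(function (cb) { cb(payload); });
  }
};

var observed = [];

// Monitoring patch: wrap every callback registered from now on.
function patchFramework(target, label) {
  var originalOn = target.on;
  target.on = function (eventName, callback) {
    var wrapped = function () {
      observed.push(label + ':' + eventName);
      return callback.apply(this, arguments);
    };
    return originalOn.call(this, eventName, wrapped);
  };
}

// Listener attached BEFORE the patch: invisible to monitoring.
framework.on('click', function () {});

patchFramework(framework, 'monitor');

// Listener attached AFTER the patch: wrapped and observed.
framework.on('submit', function () {});

framework.trigger('click', {});
framework.trigger('submit', {});

console.log(observed);
// → [ 'monitor:submit' ]
```

Only the `submit` listener shows up in the log, which is why the patch has to run before application code starts attaching listeners.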

As a side note: If you want to discuss it we can also have a more detailed discussion over Skype etc. This seems to be more efficient than sending emails back and forth.

// Alois
The contents of this e-mail are intended for the named addressee only. It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Compuware Austria GmbH (registration number FN 91482h) is a company registered in Vienna whose registered office is at 1120 Wien, Austria, Am Euro Platz 2 / Gebäude G.

Received on Monday, 9 September 2013 09:45:46 UTC