Re: [whatwg] Script preloading

Kyle Simpson Tue, 09 Jul 2013 13:42:51 -0700

This is a long and complicated topic with lots of "history". Please bear with 
the length of my reply.



> It seems that people want something that:
> 
> - Lets them download scripts but not execute them until needed.
> - Lets them have multiple interdependent scripts and have the browser
>   manage their ordering.
> - Do all this without having to modify existing scripts.

I think it's important to note that the primary motivation here is performance. 
If all I cared about was serial loading (and the performance of that was 
irrelevant), I could, with today's mechanisms, load one script at a time, only 
when I was ready to execute it, and if there were multiple scripts, do so in 
the "correct" order.

The fly in in the ointment is if I need to load multiple scripts at once, 
specifically if those come from different locations (one is jQuery from the 
CDN, another is a plugin from my server), and for performance reasons I want 
these scripts to load in parallel, but their execution order still matters.

Now, if THAT was the only concern, the "new" (well, 2 years old now) 
`async=false` (or, as I call it, "ordered async") mechanism would be enough. 
But it's not.

Because there's only one "ordered async" queue, which means all such scripts 
have to go into the same bucket, and it's all or nothing for preserving 
execution order. What if I want to load jQuery, then I want (in ASAP order) for 
4 independent plugins to run?

"Ordered async" will run jQuery, then each plugin in request order, which might 
make one small plugin wait much longer than it needs to for an earlier plugin. 
Here, it's a performance issue because a plugin like a calendar widget might 
not appear and render as early as it could otherwise, because some other 
non-related plugin is coming from a slow location but was requested first.

Very quickly, "ordered async" starts to not be sufficient for various 
use-cases. It's like the 75% solution, but the 25% are still a concern, 
performance wise.



> I must admit to not really understanding these requirements (script 
> execution can be made more or less free if they are designed to just 
> expose some functions, for example, and it's trivial to set up a script 
> dependency mechanism for scripts to run each other in order, and there's 
> no reason browsers can't parse scripts off the main thread, etc). But 
> since everyone else seems to think these are issues, let's ignore that.

In an ideal world, we could tell everyone "hey, there's a new mandate, rewrite 
all your code by XYZ deadline. Kthxbai." That's not how it works. The fact is 
that the web platform and the browsers move MUCH quicker than legacy content.

This is, and always has been, a question of if you think we should just wait 
around for years and years until every script we care about (but don't control 
since we didn't author it) gets rewritten, and while that's happening, web 
performance in this area just can't be addressed? I maintain still that we can 
and should fix the platform so we have more options than just "tough luck".



> The proposals I've seen so far for extending the spec's script preloading 
> mechanisms fall into two categories:
> 
> - provide some more control over the mechanisms already there, e.g. 
>   firing events at various times, adding attributes to make the script 
>   loading algorithm work differently, or adding methods to trigger 
>   particular parts of the algorithm under author control.
> 
> - provide a layer above the current algorithm that provides strong 
>   semantics, but that doesn't have much impact on the loading algorithm 
>   itself.
> 
> I'm very hesitant to do the first of these, because the algorithm is _so_ 
> complicated that adding anything else to it is just going to result in 
> bugs in browsers. There comes a point where an algorithm just becomes so 
> hard to accurately test that it's a lost cause.

Can you please elaborate on how either of the two prominent proposals that 
Nicholas Zakas and I detailed years ago here are insufficient in that they fall 
into your first category?

http://wiki.whatwg.org/wiki/Script_Execution_Control

My proposal was to standardize what IE4-10 did, which is to start loading a 
script even if it's not in the DOM, but not execute it until it's in the DOM. 
Then, you monitor an event to know if one or more scripts have finished this 
"preloading", and then you can decide if and when and in what order to add the 
corresponding <script> elements to the DOM to allow "execution" to proceed.

The spec already suggests the core of this:

"For performance reasons, user agents may start fetching the script as soon as 
the attribute is set, instead, in the hope that the element will be inserted 
into the document. Either way, once the element is inserted into the document, 
the load must have started. If the UA performs such prefetching, but the 
element is never inserted in the document, or the src attribute is dynamically 
changed, then the user agent will not execute the script, and the fetching 
process will have been effectively wasted."

How does standardizing that suggestion (turning it from a MAY into a MUST) 
further complicate the loading algorithm, especially since there's over a 
decade of proof in IE that it works and works fine, and it's already detailed 
as a suggestion in the spec?

True, the spec doesn't suggest the "event to known when preloading completes" 
part. IE did this with `readyState == "loaded"` and `onreadystatechange`. That 
part would need to be figured out. But again, we had prior art as PoC from IE.

-----

Setting aside IE's non-standard event mechanism, Nicholas' proposal was also 
perfectly sensible, and pretty simple. He suggested the same mechanism for 
execution (adding a <script> to the DOM), but suggested a `preload` attribute 
on the element to prevent auto-execution, and a `onpreload` event to notify you 
of loading-complete.

In neither case do I see how this unduly complicates the loading algorithm? It 
ONLY pauses it between load and execute, under certain conditions. It doesn't 
change any of the order of the algorithm.




> 1. Add a "dependencies" attribute to <script> that can point to other 
>   scripts to indicate that execution of this script should be delayed
>   until all other scripts that are (a) earlier in the tree order and (b) 
>   identified by this attribute have executed.
> 
>     <script id="jquery" src="jquery.js" async></script>
>     <script id="shims" src="shims.js" async></script>
>     <script dependencies="shims jquery" src="myscript.js" async></script>

I believe this solution to be infeasible for a number of important scenarios.

For instance, content-management-systems (through plugins) load different items 
into a page independently of other plugins. Those plugins have no knowledge of 
the complete content/markup-structure of the page, and so they would be unable 
to reliably create <script> elements with unique ID's, without just guessing by 
creating really obscure ID's. Not perfect/perfectly reliable, and certainly 
ugly and more work for them to do. Every plugin would have to come up with its 
own ID-generation mechanisms. Asking for inconsistencies/problems.

Moreover, this ignores some use-cases for "script preloading" which are 
basically speculative preloading. I may want to load several plugins, but not 
execute any of them, and then only use one of them depending on what the user 
does.

So, the real need here is NOT just to chain a set of scripts together, but to 
put the author in control of pausing a script between load and execute, and 
then they get to decide when/if they unpause a script.




> 2. Add an "whenneeded" boolean content attribute, a "markNeeded()" method,
>   and an internal "is-needed flag" (initially false) to the <script> 
>   element. When a script is about to execute, if its whenneeded="" 
>   attribute is set, but its "is-needed" flag is not, then delay 
>   execution. Calling markNeeded() on a script that has a whenneeded 
>   boolean but that has not executed yet first causes the markNeeded() 
>   method on all the script's dependencies to be called, and then causes
>   this script to become ready to execute.

From where I stand, this solution is partially workable, sort of.

It doesn't solve the "I need an event to know when load is finished". If I'm 
waiting for a bunch of stuff to load before asking any of them to execute (in 
the order I want), I need a way to know that each of them are in fact fully 
loaded.

Secondly, it seems like a bit more complex version of Nicholas' proposal, with 
just different names. Not sure why the complecting of API with (mostly) his 
same semantics.

I don't actually mind if the "way to get it to execute" constitutes inserting a 
previously not-inserted element into the DOM, or if it consists of calling some 
new method on the already-inserted element. Those have basically the same 
effect.

Your proposed API is more complicated, but only a little bit so. Not a deal 
breaker. It does override, to an extent, the above quoted part of the spec 
which already suggests that a/the way to separate load and execute is insertion 
in the DOM or not.

I would still vote for an option that's the least different from what we 
already have.

But I'd settle for anything, no matter how complex, as long as it actually 
solves the many use cases. Your proposed option has potential, as long as the 
missing "event" part is addressed.



> Is there a need for delaying the download of a script as well? (If so, we 
> could change whenneeded="" to have values, like whenneeded="execute" vs 
> whenneeded="download" or something.)

I think there's another effort elsewhere under the spec umbrella that's 
discussing some attributes for more accurate deferring of resource loading (not 
just scripts). No need to conflate these efforts, IMO. The effort here 
presently is to load something (many things) as quickly as possible, not put 
off loading until "later".



> Is there something this doesn't handle which it would need to handle?

Just one final side note on the above linked-to proposals (Zakas's and mine). 
Over 2 years ago, I implemented feature-detects in LABjs script loader for both 
of those proposals. Of course, the `readyState` one actually works in IE4-10 
and works beautifully I might add. In head-to-head loading tests I've done from 
time to time, the IE "real preloading" mechanism often beats out the 
good-but-not-great "ordered async" of the other modern browsers.

The `preload` one doesn't currently work of course (it's just dormant code for 
now), but I thought it to be a sufficiently good enough proposal, and likely 
enough to eventually happen, that I put in those few lines of code to LABjs, as 
speculative future-proofing.

Now, at this point, there's an awful lot of sites on the web using that version 
of LABjs. My most recent estimates are that LABjs has been downloaded/deployed 
on the order of 10's of thousands of sites over the last several years.

IF by some magical alignment of the stars, either of those proposals did 
finally get standardized, LABjs wouldn't have to change at all, and none of 
those sites would have to do anything different. They'd just automatically 
start getting better loading performance (on average).





--Kyle

Re: [whatwg] Script preloading

Reply via email to