I'd like to share a use case and problem we have at Wikipedia with localStorage.

The MediaWiki software (which Wikipedia runs on) uses a framework called 
ResourceLoader for bundling and delivering modules to the client. [1][2]

Last year it was changed to make use of localStorage in addition to optimised 
HTTP 304 handling, mainly because of two issues we found:

1. Batching is bad for incremental updates.

We combine requests for multiple modules in predictable batches. This means 
usually only 1 or 2 actual HTTP requests are made for the main payload. However, 
when one of those modules changes in a deployment, that batch is no longer the 
same and has to be invalidated in its entirety, causing the user to 
re-download all the other modules in the same batch as well. Extracting the 
payload client-side into individual modules stored in localStorage allowed us 
to re-request only the module that changed from the server and evaluate the 
rest from localStorage (and update the entry afterward). This reduced bandwidth 
significantly and improved page load times overall.
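
To illustrate the idea, here is a rough sketch. It is not the actual 
ResourceLoader code: the names (MemoryStore, splitCachedAndMissing, storeBatch), 
the "module:" key prefix and the manifest of current versions are all made up 
for this example, and an in-memory stand-in replaces window.localStorage so it 
runs outside a browser.

```typescript
// Hypothetical sketch; not the actual ResourceLoader implementation.

// Minimal subset of the Web Storage interface used here.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in so the sketch runs outside a browser.
// In a page, window.localStorage could be passed instead.
class MemoryStore implements KeyValueStore {
  private data: Record<string, string> = {};
  getItem(key: string): string | null {
    const value = this.data[key];
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.data[key] = value;
  }
}

// Module name -> version currently deployed (assumed to be known up front).
type Manifest = Record<string, string>;

interface SplitResult {
  cached: Record<string, string>; // name -> source, ready to evaluate
  missing: string[];              // names that still have to be requested
}

// Decide per module whether the stored copy is current or must be re-requested.
function splitCachedAndMissing(store: KeyValueStore, manifest: Manifest): SplitResult {
  const result: SplitResult = { cached: {}, missing: [] };
  for (const name of Object.keys(manifest)) {
    const raw = store.getItem("module:" + name);
    const entry = raw === null ? null : JSON.parse(raw);
    if (entry !== null && entry.version === manifest[name]) {
      result.cached[name] = entry.source;
    } else {
      result.missing.push(name);
    }
  }
  return result;
}

// Unpack a batch response into one entry per module, overwriting stale versions.
function storeBatch(store: KeyValueStore, manifest: Manifest, batch: Record<string, string>): void {
  for (const name of Object.keys(batch)) {
    const entry = { version: manifest[name], source: batch[name] };
    store.setItem("module:" + name, JSON.stringify(entry));
  }
}
```

With something like this, a deployment that changes one module of a batch only 
puts that one module in "missing"; the rest is served from the store.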

I imagine HTTP2 might make it appropriate to phase out batches and just request 
modules individually (always) and let the network layer do the combining and 
separated caching in a more natural way.

2. HTTP 304 hits are not free.

We found that loading JS/CSS from localStorage was faster than hitting an HTTP 
304, by enough of a margin to justify this change.

So what went wrong?

Well, the nice thing about regular 304 caching is that as developers we don't 
have to worry about the size restriction of the store. Whether the browser 
limits it or not, and whether eviction is FIFO, LRU or the store is just 
unlimited, isn't an immediate user-visible concern (it probably should be, but 
that's for another discussion). When we started using localStorage, users who 
had once visited pages with lots of functionality enabled found themselves with 
a full localStorage.

This caused other, more essential functionality to stop working. For example, 
logic that previously used cookies to store small state values had been moved 
to localStorage (to reduce network overhead and because it made semantic 
sense). Values such as "Boolean: Hide fundraising banner" or "Last 10 
autocomplete values" could no longer be saved, because localStorage was filled 
up with our faux HTTP cache for ResourceLoader. That is unfortunate, since the 
module store could easily fall back to requesting from HTTP (and usually hit 
304), whereas those state values would never save, causing user-visible 
problems and functionality not working as expected.
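
One possible stopgap, sketched below, is to treat the module cache as 
expendable: if writing an essential value fails, drop the cached modules and 
retry. This is only an illustration, not what we deployed. setEssential, 
BoundedStore and the "module:" prefix are hypothetical names, the stand-in 
store counts characters rather than bytes, and any exception from setItem is 
assumed to mean the quota was hit (browsers signal this with a DOMException 
named QuotaExceededError).

```typescript
// Hypothetical sketch; names and the "module:" prefix are assumptions.

// Minimal subset of the Web Storage interface used here.
interface StorageLike {
  readonly length: number;
  key(index: number): string | null;
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// In-memory stand-in with a character quota, so the sketch runs outside a browser.
class BoundedStore implements StorageLike {
  private data: Record<string, string> = {};
  private quota: number;
  constructor(quota: number) {
    this.quota = quota;
  }
  get length(): number {
    return Object.keys(this.data).length;
  }
  key(index: number): string | null {
    const k = Object.keys(this.data)[index];
    return k === undefined ? null : k;
  }
  getItem(key: string): string | null {
    const v = this.data[key];
    return v === undefined ? null : v;
  }
  setItem(key: string, value: string): void {
    let used = key.length + value.length;
    for (const k of Object.keys(this.data)) {
      if (k !== key) used += k.length + this.data[k].length;
    }
    if (used > this.quota) throw new Error("QuotaExceededError");
    this.data[key] = value;
  }
  removeItem(key: string): void {
    delete this.data[key];
  }
}

const CACHE_PREFIX = "module:";

// Write an essential value; on failure, evict refetchable cache entries and retry once.
function setEssential(store: StorageLike, key: string, value: string): boolean {
  try {
    store.setItem(key, value);
    return true;
  } catch {
    // Collect keys first so removal does not disturb the index-based iteration.
    const expendable: string[] = [];
    for (let i = 0; i < store.length; i++) {
      const k = store.key(i);
      if (k !== null && k.indexOf(CACHE_PREFIX) === 0) expendable.push(k);
    }
    for (const k of expendable) store.removeItem(k);
    try {
      store.setItem(key, value);
      return true;
    } catch {
      return false; // still no room; the caller has to cope (e.g. use a cookie)
    }
  }
}
```

The evicted modules are simply requested over HTTP again on the next page view.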

We're working around it in different ways (some things went back to cookies) 
but are still stalled on a long-term solution for this problem. We're 
considering moving our module store from localStorage to IndexedDB, as that's 
not being used at the moment. It would provide the same separation that 
cookies/localStorage used to give us, in that localStorage would keep working 
even if IndexedDB was full.

Some thoughts:

* A way to know if a URL is cached or not (e.g. know whether a URL will hit 
HTTP 304) without making the request.
* A way to prioritise which entries should be kept in localStorage and allow 
for low-prio entries to be evicted if short on space.
* A way to know how much localStorage is available in total.
* Perhaps a way to create a limited store within localStorage or IndexedDB that 
has limited/restricted capacity (with some unique identifier, capacity 
percentage-based, or a min/max byte size?).
* A separate store for caching HTTP resources (the Service Worker's Cache API?)
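
To make the priority and limited-store ideas above a bit more concrete, here is 
a rough in-memory model. It is purely hypothetical: no such API exists in 
localStorage or IndexedDB today, BudgetedStore and its methods are invented for 
this example, and sizes are counted in characters rather than bytes.

```typescript
// Hypothetical model of a capped store with priority-based eviction.
// Nothing here is an existing browser API.

interface Entry {
  value: string;
  priority: number; // higher means more important to keep
}

class BudgetedStore {
  private entries: Record<string, Entry> = {};
  private maxChars: number;

  constructor(maxChars: number) {
    this.maxChars = maxChars;
  }

  private used(): number {
    let total = 0;
    for (const k of Object.keys(this.entries)) {
      total += k.length + this.entries[k].value.length;
    }
    return total;
  }

  get(key: string): string | null {
    const entry = this.entries[key];
    return entry === undefined ? null : entry.value;
  }

  // Returns false (and changes nothing) if the value cannot fit even after
  // evicting every entry with a strictly lower priority.
  set(key: string, value: string, priority: number): boolean {
    const size = key.length + value.length;
    if (size > this.maxChars) return false;

    const previous = this.entries[key];
    delete this.entries[key]; // an overwrite frees the old copy first

    // Eviction candidates: strictly lower priority, least important first.
    const candidates = Object.keys(this.entries)
      .filter((k) => this.entries[k].priority < priority)
      .sort((a, b) => this.entries[a].priority - this.entries[b].priority);

    let free = this.maxChars - this.used();
    const evict: string[] = [];
    for (const k of candidates) {
      if (free >= size) break;
      free += k.length + this.entries[k].value.length;
      evict.push(k);
    }
    if (free < size) {
      if (previous !== undefined) this.entries[key] = previous; // roll back
      return false;
    }
    for (const k of evict) delete this.entries[k];
    this.entries[key] = { value, priority };
    return true;
  }
}
```

In this model a module cache would write with a low priority and small state 
values with a high one, so the state values win whenever space runs out.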

— Timo Tijhof
Software Engineer
Wikimedia Foundation

PS: Sorry if this is the wrong avenue for this type of feedback. Thanks in 
advance.

[1] https://en.wikipedia.org/wiki/MediaWiki
[2] https://www.mediawiki.org/wiki/ResourceLoader/Features

On 13 Mar 2015, at 12:50, Anne van Kesteren <ann...@annevk.nl> wrote:

> A big gap with native is dependable storage for applications. I
> started sketching the problem space on this wiki page:
> 
>  https://wiki.whatwg.org/wiki/Storage
> 
> Feedback I got is that having some kind of allotted quota is useful
> for applications. That way they know how much they can put away.
> However, this clashes a bit with offering something that is
> competitive with native.
> 
> We can't really ask the user to divide up their storage. And yet when
> the user asks an application to store e.g. a whole bunch of music
> offline we don't really want the user agent to get in the way if the
> user already granted persistence.
> 
> 
> -- 
> https://annevankesteren.nl/
