On Fri, Feb 7, 2020 at 2:54 PM Marco Neumann <[email protected]> wrote:
> thank you Guillaume, when do you expect a public update on the security
> incident [1]? Is any of our personal and private data (email, password
> etc.) affected?

It should be made public in the next few days. I'm not going to go into
any more details until this is made public, but overall, don't worry too
much.

> best,
> Marco
>
> [1] https://phabricator.wikimedia.org/T241410
>
> On Fri, Feb 7, 2020 at 1:33 PM Guillaume Lederrey <[email protected]>
> wrote:
>
>> Hello all!
>>
>> First of all, my apologies for the long silence. We need to do better
>> in terms of communication. I'll try my best to send a monthly update
>> from now on. Keep me honest, remind me if I fail.
>>
>> First, we had a security incident at the end of December, which forced
>> us to move from our Kafka-based update stream back to the RecentChanges
>> poller. The details are still private, but you will be able to get the
>> full story soon on Phabricator [1]. The RecentChanges poller is less
>> efficient, and this is leading to high update lag again (just when we
>> thought we had things slightly under control). We tried to mitigate
>> this by improving the parallelism in the updater [2], which helped a
>> bit, but not as much as we need.
>>
>> Another attempt to get update lag under control is to apply back
>> pressure on edits, by adding the WDQS update lag to the Wikidata maxlag
>> [6]. This is obviously less than ideal (at least as long as WDQS
>> updates are lagging as often as they are), but it does allow the
>> service to recover from time to time. We probably need to iterate on
>> this, provide better granularity, and differentiate better between
>> operations that have an impact on update lag and those that don't.
>>
>> On the slightly better news side, we now have a much better
>> understanding of the update process and of its shortcomings. The
>> current process does a full diff between each updated entity and what
>> we have in Blazegraph.
>> Even if a single triple needs to change, we still read tons of data
>> from Blazegraph. While this approach is simple and robust, it is
>> obviously not efficient. We need to rewrite the updater to take a more
>> event-streaming / reactive approach, and only work on the actual
>> changes. This is a big chunk of work, almost a complete rewrite of the
>> updater, and we need a new solution to stream changes with guaranteed
>> ordering (something that our Kafka queues don't offer). This is where
>> we are focusing our energy at the moment, as it looks like the best
>> option to improve the situation in the medium term. This change will
>> probably have some functional impacts [3].
>>
>> Some misc things:
>>
>> We have done some work to get better metrics and a better
>> understanding of what's going on: collecting more metrics during the
>> update [4], loading RDF dumps into Hadoop for further analysis [5], and
>> better logging of SPARQL requests. We are deferring deeper analysis
>> until we are in a more stable situation regarding update lag.
>>
>> We have a new team member working on WDQS. He is still ramping up, but
>> we should have a bit more capacity from now on.
>>
>> Some longer term thoughts:
>>
>> Keeping all of Wikidata in a single graph is most probably not going
>> to work long term. We have not found examples of public SPARQL
>> endpoints with > 10 billion triples, and there is probably a good
>> reason for that. We will probably need to split the graphs at some
>> point. We don't know how yet (that's why we loaded the dumps into
>> Hadoop; that might give us some more insight). We might expose a
>> subgraph with only truthy statements. Or have language-specific graphs,
>> with only language-specific labels. Or something completely different.
>>
>> Keeping WDQS / Wikidata as open as they are at the moment might not be
>> possible in the long term. We need to think about if / how we want to
>> implement some form of authentication and quotas.
>> Potentially increasing quotas for some use cases, but keeping them
>> strict for others. Again, we don't know what this will look like, but
>> we're thinking about it.
>>
>> What you can do to help:
>>
>> Again, we're not sure. Of course, reducing the load (both in terms of
>> edits on Wikidata and of reads on WDQS) will help. But not using those
>> services makes them useless.
>>
>> We suspect that some use cases are more expensive than others (a
>> single property change to a large entity will require a comparatively
>> insane amount of work to update it on the WDQS side). We'd like to have
>> real data on the cost of various operations, but we only have guesses
>> at this point.
>>
>> If you've read this far, thanks a lot for your engagement!
>>
>> Have fun!
>>
>>    Guillaume
>>
>> [1] https://phabricator.wikimedia.org/T241410
>> [2] https://phabricator.wikimedia.org/T238045
>> [3] https://phabricator.wikimedia.org/T244341
>> [4] https://phabricator.wikimedia.org/T239908
>> [5] https://phabricator.wikimedia.org/T241125
>> [6] https://phabricator.wikimedia.org/T221774
>>
>> --
>> Guillaume Lederrey
>> Engineering Manager, Search Platform
>> Wikimedia Foundation
>> UTC+1 / CET
>
> --
> ---
> Marco Neumann
> KONA

--
Guillaume Lederrey
Engineering Manager, Search Platform
Wikimedia Foundation
UTC+1 / CET
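[Editor's note: the maxlag back pressure discussed above relies on the
MediaWiki API convention that bots send a `maxlag` parameter and back off
when the server reports an error with code "maxlag". The sketch below
shows a minimal well-behaved client loop; the helper names and the exact
response handling are illustrative assumptions, not taken from the
message.]

```python
import time

# Sketch of the maxlag convention: a well-behaved bot sends `maxlag=N`
# with each write request. If replication (or, per [6], WDQS update) lag
# exceeds N seconds, the API refuses the request with error code
# "maxlag" and the client is expected to wait and retry.
# Function names and the retry policy here are illustrative.

def is_maxlag_error(response: dict) -> bool:
    """True if the API rejected the request because of lag."""
    return response.get("error", {}).get("code") == "maxlag"

def call_with_backoff(do_request, max_retries=5, wait_seconds=5):
    """Retry a request while the API keeps reporting maxlag errors."""
    for _ in range(max_retries):
        response = do_request()
        if not is_maxlag_error(response):
            return response
        time.sleep(wait_seconds)  # back off so the servers can catch up
    raise RuntimeError("API still lagged after retries")
```

With WDQS lag folded into maxlag [6], a loop like this automatically
throttles edits whenever the query service falls behind, which is exactly
the back pressure described above.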
_______________________________________________ Wikidata mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikidata
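[Editor's note: to illustrate the update strategy the message describes —
the current updater reads all of an entity's triples from Blazegraph and
diffs them against freshly fetched RDF, even when only one triple
changed. A toy sketch of that diff, with triples modeled as plain tuples;
the names and data are illustrative, not the actual updater code.]

```python
# Toy model of the "full diff" update: triples are
# (subject, predicate, object) tuples. To apply one edit, the updater
# must first read *every* triple for the entity, then compute the
# minimal delete/insert sets against the new RDF.

def diff_entity(old_triples: set, new_triples: set):
    """Return (to_delete, to_insert) needed to turn old into new."""
    to_delete = old_triples - new_triples
    to_insert = new_triples - old_triples
    return to_delete, to_insert
```

Even when the resulting diff is a single triple, `old_triples` (the full
read from Blazegraph) can be huge for large entities — which is why an
event-streaming approach that only touches the actual changes is
attractive.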
