Gehel added subscribers: ema, BBlack.
Gehel added a comment.

For some more context:

LDF is a way to cheaply get large lists of triples from WDQS and to move some logic to the clients. The list is retrieved page by page. We already have use cases for this. The iteration order just follows the underlying index, which might be different on each node, so the same page returned by different nodes might have different content. Adding a sorting step would make the order consistent between nodes, but would increase the cost enough to defeat the main point of LDF (cheaply retrieving a list of triples).
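
To make the failure mode concrete, here is a toy sketch (not WDQS code). The node names, the triples and the two index orders are all made up for illustration; the only point is that paging across nodes with different iteration orders can both duplicate and drop triples.

```python
# Toy model: two nodes hold the same triples but iterate them in a
# different index order. All names and data here are hypothetical.
TRIPLES = [("Q%d" % i, "P31", "Q5") for i in range(6)]

NODE_ORDER = {
    "node-a": list(TRIPLES),
    # Same content, rotated by one position to mimic a different index order.
    "node-b": TRIPLES[1:] + TRIPLES[:1],
}


def get_page(node, page, page_size=2):
    """Return one page of triples in the node's own iteration order."""
    start = page * page_size
    return NODE_ORDER[node][start:start + page_size]


def fetch_all(route, page_size=2):
    """Fetch every page; route[i] names the node that serves page i."""
    result = []
    for page, node in enumerate(route):
        result.extend(get_page(node, page, page_size))
    return result


# Every page from the same node: each triple is seen exactly once.
consistent = fetch_all(["node-a", "node-a", "node-a"])

# Page 1 served by another node: one triple is repeated, one is never seen.
mixed = fetch_all(["node-a", "node-b", "node-a"])
```

With this made-up data, `mixed` still has six entries but only five distinct triples, and the client has no way to notice.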

Potential solutions:

client affinity

If the same client always ends up on the same WDQS node, that client will get a consistent ordering. That ordering would also get cached at the Varnish level and provide a consistent answer to the next client. This breaks down if a different client running the same query triggers the refresh at the time of invalidation. LVS is able to do source hashing scheduling (see LVS docs or T151971), but in our case the source IP is Varnish. We could remove the LVS in front of WDQS and use Varnish to do the balancing, but this is exactly the opposite of the current effort to standardise all our services on the same model.
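
A rough sketch of why source hashing does not help here. This is not how the LVS scheduler is implemented; SHA-256 is used only as a stand-in hash, and the backend names and IP addresses are hypothetical.

```python
import hashlib

# Hypothetical backend names, for illustration only.
BACKENDS = ["wdqs-node-1", "wdqs-node-2", "wdqs-node-3"]


def pick_backend(source_ip, backends=BACKENDS):
    """Map a source IP to a backend with a stable hash (stand-in for source hashing)."""
    digest = hashlib.sha256(source_ip.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def backend_seen_by(client_ip, varnish_ip):
    """Backend chosen when the request reaches the balancer through Varnish.

    The balancer only sees the address of its immediate peer, which is
    Varnish, so client_ip plays no part in the decision.
    """
    return pick_backend(varnish_ip)
```

In this model two different clients behind the same Varnish host always map to the same backend, and the same client going through two Varnish hosts may not, so the affinity is keyed on the cache host rather than on the client.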

single LDF server

We can route all LDF requests to a single server, with a fallback mechanism to route traffic to another server in case the first one is down. This is not a scalable option, and we don't have anything in place (AFAIK) to manage an automatic fallback (it does not look like LVS has a scheduler that would work in this scenario).
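
The routing rule itself is simple; what we are missing is something to run it. A minimal sketch, assuming a hypothetical health check callback and made-up server names:

```python
def choose_ldf_server(servers, is_healthy):
    """Return the first healthy server in priority order, or None if all are down.

    servers    -- list of server names, primary first (hypothetical names)
    is_healthy -- callable taking a server name and returning True or False
                  (assumed to exist; no such check is in place today)
    """
    for server in servers:
        if is_healthy(server):
            return server
    return None


# Usage with a fake health state: the primary is down, so the fallback is used.
LDF_SERVERS = ["wdqs-ldf-primary", "wdqs-ldf-fallback"]
health = {"wdqs-ldf-primary": False, "wdqs-ldf-fallback": True}
selected = choose_ldf_server(LDF_SERVERS, lambda name: health[name])
```

Note that a failover still switches clients to a node with a different iteration order, so pages fetched across the switch can be inconsistent.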

make any WDQS node return consistent pages

As stated above, sorting results before paging would solve the issue, but is probably too expensive to consider (result sets are expected to be large). The LDF implementation that we use does not seem to support this. It might be possible to configure a different indexing strategy, with consistent iteration order, but that's an unknown. Basically WDQS is stateless (as seen from a client) but accidentally exposes internal state in a subtle way.
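
For comparison, a toy sketch of sorting before paging. The data is made up, and this is not something the LDF implementation we use is known to support; it only shows why the result is consistent and where the cost comes from.

```python
def sorted_page(node_triples, page, page_size=2):
    """Return one page after sorting the node's full result set.

    The sort makes the page independent of the node's index order, but it
    touches the whole result set (O(n log n)) on every page request.
    """
    ordered = sorted(node_triples)
    start = page * page_size
    return ordered[start:start + page_size]


# Two hypothetical nodes with the same triples in different index orders.
node_a = [("Q1", "P31", "Q5"), ("Q2", "P31", "Q5"), ("Q3", "P31", "Q5"), ("Q4", "P31", "Q5")]
node_b = [node_a[2], node_a[0], node_a[3], node_a[1]]

pages_a = [sorted_page(node_a, p) for p in range(2)]
pages_b = [sorted_page(node_b, p) for p in range(2)]
```

Here `pages_a` and `pages_b` are identical, at the price of a full sort per request.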

At this point, we are mostly out of ideas. @BBlack, @ema, if you have any ideas, they would be welcome!


TASK DETAIL
https://phabricator.wikimedia.org/T159574

To: Smalyshev, Gehel
Cc: BBlack, ema, Aklapper, Gehel, Smalyshev, EBjune, merbst, Avner, debt, D3r1ck01, Jonas, FloNight, Xmlizer, Izno, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, Mbch331