Hi everyone,

TL;DR: You might not have to wait very long on CI if your patchset has
already passed its tests in a prior run with the exact same
dependencies and CI configuration.

At our WMF Developer Experience offsite back in December 2024, the
Release Engineering team discussed a few different ways we might
reduce continuous integration wait times for MediaWiki contributors.

One hypothesis was that there was a significant number of cases where
automated testing was repeated against the exact same setup: the same
MediaWiki patchset, the same dependency versions, and the same CI
configuration.

This could happen during retest scenarios, for example, or for
backports to release branches. If correct, we might be able to safely
skip redundant test execution and save developers and deployers some
serious wait time.

So we ran an experiment to prove _or disprove_ our hypothesis. If we
were right, maybe we could skip execution during those scenarios. If
we were wrong, there would be no reason to complicate our CI jobs
further.

We first rolled out a partial "success caching" implementation in
Quibble[1] that computed a SHA256 digest to represent the uniqueness
of the test run and stuck it in a memcached instance at the end of a
successful run.

The digest was computed from:

 1. The job name (a good and easy proxy for overall CI setup).
 2. The `HEAD^{tree}` (tree object ID) of each Git repo under test
(core, extensions, skins, etc.), taken in sorted repo order.

(Note that `HEAD^{tree}` is used rather than simply `HEAD` because it
more accurately represents the working tree on disk after you check
out a commit, and because the `HEAD` commit ID almost never repeats
between runs: our gating system creates temporary merge commits, etc.)
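
For anyone curious what such a digest might look like, here is a
minimal Python sketch. It is not Quibble's actual code: the function
names, the NUL separators, and the choice to hash the repo names
alongside their tree IDs are all assumptions made for illustration.
Only `git rev-parse HEAD^{tree}` and SHA256 come from the description
above.

```python
import hashlib
import subprocess


def tree_sha(repo_path):
    """Return the tree object ID that HEAD points at in repo_path."""
    out = subprocess.run(
        ["git", "-C", repo_path, "rev-parse", "HEAD^{tree}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return out.stdout.strip()


def success_cache_key(job_name, trees):
    """Build a SHA256 hex digest from the job name and repo tree IDs.

    `trees` maps a repo name to its tree ID. Repos are visited in
    sorted order so the digest does not depend on dict ordering.
    Hashing the repo name and using NUL separators are assumptions
    of this sketch, not confirmed details of Quibble.
    """
    digest = hashlib.sha256()
    digest.update(job_name.encode("utf-8") + b"\0")
    for repo in sorted(trees):
        digest.update(repo.encode("utf-8") + b"\0")
        digest.update(trees[repo].encode("utf-8") + b"\0")
    return digest.hexdigest()
```

In use, you would build `trees` with something like
`{path: tree_sha(path) for path in repo_paths}` and pass it in with
the Jenkins job name.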

Quibble would then check its cache on subsequent runs for an identical
digest/key, and report to the console if it found a match. We let this
run for a couple of weeks, scraping the Jenkins logs, and then did
some reporting on it.
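
The report-only behaviour during the experiment can be sketched
roughly as follows. Everything here is hypothetical: `DictCache` is an
in-memory stand-in for a memcached client, and the function names and
log messages are invented for the example rather than taken from
Quibble.

```python
class DictCache:
    """In-memory stand-in for a memcached client (hypothetical)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value
        return True


def report_if_cached(cache, key, log=print):
    """Experiment mode: log whether the key was seen, never skip tests."""
    hit = cache.get(key) is not None
    if hit:
        log(f"success cache HIT for {key}")
    else:
        log(f"success cache MISS for {key}")
    return hit


def record_success(cache, key):
    """Store the key once a run has finished successfully."""
    cache.set(key, b"1")
```

A real memcached client with `get` and `set` methods could be dropped
in for `DictCache`, and the logged lines are the kind of thing you
could later scrape out of the console output.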

Our hypothesis proved correct. Redundant test execution was
occurring, particularly in the gate-and-submit pipelines that test
mainline-bound changes, and even more often in the pipelines that
test changes against release branches (weekly train branches and
long-term release branches).

You can see my summary on the task for details[2], but the most
striking numbers were:

 - 6.4% of test runs in gate-and-submit (merges to mainline branches)
were redundant
 - 28.3% of test runs in gate-and-submit-wmf (merges to weekly release
branches) were redundant
 - 163 _hours_ of CI wall time could have been saved had we skipped
execution of redundant tests

Naturally, with such encouraging numbers, we went ahead with the final
implementation. Quibble will now exit early and successfully if:

 1. The patch under test has not changed from a previously successful run.
 2. The extensions/skins/vendor dependencies have not changed.
 3. The setup for MediaWiki and its test suite has not changed
(database type, vendor vs. composer usage, etc.).
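
Put together, the early-exit flow looks roughly like the Python sketch
below. This is an illustration under assumptions, not the code that
shipped: `run_with_success_cache` and `run_tests` are hypothetical
names, a plain dict stands in for memcached, and `key` is assumed to
already encode the patch, the dependencies, and the setup listed
above.

```python
def run_with_success_cache(cache, key, run_tests):
    """Return an exit code, skipping run_tests on a cache hit.

    `cache` is a plain dict standing in for memcached (an assumption
    of this sketch). `run_tests` is any callable returning an exit
    code, where 0 means success.
    """
    if cache.get(key):
        # An identical setup already passed: exit early and successfully.
        return 0
    exit_code = run_tests()
    if exit_code == 0:
        # Only successful runs are remembered, so failures always re-run.
        cache[key] = True
    return exit_code
```

Note that only successes are cached in this sketch, so a failed or
flaky run is never skipped on retry.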

Hopefully this leads to some pleasant surprises for folks when waiting
on CI, especially during backport deployment windows.

Thanks to everyone in Release Engineering for collaborating on the
idea, and a special thanks to Antoine Musso for thinking through the
details with me and for reviewing my Python code!

Please reply or reach out on IRC (#wikimedia-releng) if you want to
know more about it.

To pleasant surprises!

Cheers,
Dan

[1]: https://www.mediawiki.org/wiki/Continuous_integration/Quibble
[2]: https://phabricator.wikimedia.org/T383243#10584349

-- 
Dan Duvall
Staff Software Engineer, Release Engineering
Wikimedia Foundation
_______________________________________________
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/