GoranSMilovanovic added a comment.
@Jan_Dittrich @awight

In reference to T282563#7186386 <https://phabricator.wikimedia.org/T282563#7186386> and T282563#7226336 <https://phabricator.wikimedia.org/T282563#7226336>:

- I have used a fresh dataset, relying on the `2021-06` snapshot of the `wmf.mediawiki_history` table;
- the results are fully replicated (in the qualitative sense, of course);
- I have also filtered out all editors who have fewer than six (6) months of presence in Wikidata, simply because they never really had a chance to leave (where "left Wikidata" is defined as five (5) months of inactivity).

**The Lindy Effect**

I have used several different operational definitions of the "length of past activity" to illustrate the Lindy Effect in Wikidata editing.

**A. The total number of active months in editor's revision history**

An editor can be active and inactive now and then; this measure of "length of past activity" is defined as the count of months in which an editor was active over the whole course of their presence in Wikidata since registration. The vertical axis represents the probability to leave Wikidata given the count of active months.

F34570577: 01_LindyA.png <https://phabricator.wikimedia.org/F34570577>

**B. The probability of an active month**

The previous measure could be criticized on the grounds that it is not the same if (a) someone has ten active months while being registered a year ago and if (b) someone has ten active months while being registered three years ago. I have therefore turned the absolute counts of active months per editor into proportions of their total stay in Wikidata since registration (effectively calculating the probability of any given month in the editor's revision history being an active month). The horizontal axis is the probability of having an active month in the course of one's revision history, binned into 100 intervals. The vertical axis represents the probability to leave Wikidata given that proportion of active months.
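For readers who want to reproduce measures A and B, the definitions above can be sketched roughly as follows. This is a minimal illustration and not the code used for this analysis: the representation of an editor as a set of integer month indices, the helper names (`month_index`, `summarize_editor`, `p_leave_by_active_months`), and the choice to count the registration and snapshot months inclusively are all assumptions made here for the sake of the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set

INACTIVITY_THRESHOLD = 5  # months of inactivity that define "left Wikidata"
MIN_PRESENCE = 6          # editors with fewer months of presence are dropped


def month_index(year: int, month: int) -> int:
    """Map a calendar month to a running integer (hypothetical helper)."""
    return year * 12 + (month - 1)


@dataclass
class EditorSummary:
    months_present: int       # registration to snapshot, inclusive (assumption)
    active_months: int        # measure A: count of active months
    active_proportion: float  # measure B: share of months that were active
    left: bool                # inactive for at least INACTIVITY_THRESHOLD months


def summarize_editor(registration: int, snapshot: int,
                     active: Set[int]) -> Optional[EditorSummary]:
    """Summarize one editor; `active` holds month indices with >= 1 revision."""
    months_present = snapshot - registration + 1
    if months_present < MIN_PRESENCE:
        return None  # too new to have had a chance to leave
    in_window = {m for m in active if registration <= m <= snapshot}
    if in_window:
        months_since_last_activity = snapshot - max(in_window)
    else:
        months_since_last_activity = months_present
    return EditorSummary(
        months_present=months_present,
        active_months=len(in_window),
        active_proportion=len(in_window) / months_present,
        left=months_since_last_activity >= INACTIVITY_THRESHOLD,
    )


def p_leave_by_active_months(
        summaries: Iterable[EditorSummary]) -> Dict[int, float]:
    """Estimate P(left | count of active months) as a simple group proportion."""
    counts = defaultdict(lambda: [0, 0])  # active_months -> [left, total]
    for s in summaries:
        counts[s.active_months][0] += int(s.left)
        counts[s.active_months][1] += 1
    return {k: left / total for k, (left, total) in sorted(counts.items())}
```

The same grouping applied to `active_proportion`, binned into 100 intervals, would give the quantity on the vertical axis for measure B.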
F34570592: 02_LindyA.png <https://phabricator.wikimedia.org/F34570592>

**C. The age of the account**

This is simple yet probably inconclusive with respect to the Lindy Effect itself: how old is an editor's account vs. what is the probability that they have left Wikidata (i.e. have now been inactive for at least five months)?

F34570590: 03_LindyA.png <https://phabricator.wikimedia.org/F34570590>

**The distribution of the number of revisions: left vs. did not leave Wikidata**

The horizontal axis represents the log of the number of revisions, while the vertical axis is probability density. Obviously, those who are still with us are those who have made more edits so far, as expected.

F34570594: 04_RevisionsVSLeftWikidata.png <https://phabricator.wikimedia.org/F34570594>

Here are the descriptive statistics on revisions:

**Left Wikidata:**

   Min. 1st Qu.  Median    Mean 3rd Qu.     Max.
      1       1       2     203       7  5891740

**Active on Wikidata:**

   Min. 1st Qu.  Median    Mean 3rd Qu.     Max.
      2      19     108   15268     720 31003903

**The distribution of the length of inactivity periods: left vs. did not leave Wikidata**

A single editor can have several periods of inactivity of varying length in months. I have analyzed the distribution of both the mean and the median length of inactivity periods per user, grouped according to whether they are still editing or not.

Mean length of inactivity periods first:

F34570597: 05_MeanLengthInactiveVSLeftWikidata.png <https://phabricator.wikimedia.org/F34570597>

Obviously, the editors who are still active typically have much shorter sequences of inactive months.

The descriptive statistics on mean length of inactivity periods:

**Left Wikidata:**

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.429  14.500  30.000  37.185  56.000 105.000

**Active on Wikidata:**

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
  1.000   1.875   3.000   4.942   5.600  77.000      88

**N.B.** `NA's` represent those editors who did not have a single inactive month in their revision history.
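The per-editor inactivity measures can be illustrated with a short sketch. It is not the code behind these figures: it assumes each editor is represented by a registration month, a snapshot month, and a set of active months (all as integer month indices), and that a trailing run of inactive months up to the snapshot counts as an inactivity period. The function names are hypothetical.

```python
from statistics import mean, median
from typing import List, Optional, Set, Tuple


def inactivity_periods(registration: int, snapshot: int,
                       active: Set[int]) -> List[int]:
    """Lengths (in months) of maximal runs of consecutive inactive months."""
    periods = []
    run = 0
    for m in range(registration, snapshot + 1):
        if m in active:
            if run:
                periods.append(run)  # an active month closes the current run
            run = 0
        else:
            run += 1
    if run:
        periods.append(run)  # trailing run up to the snapshot (assumption)
    return periods


def inactivity_stats(registration: int, snapshot: int,
                     active: Set[int]) -> Tuple[Optional[float], Optional[float]]:
    """Mean and median inactivity length; (None, None) plays the role of NA."""
    periods = inactivity_periods(registration, snapshot, active)
    if not periods:
        return None, None  # editor without a single inactive month
    return mean(periods), median(periods)
```

For example, an editor present in months 0 to 11 and active in months 0, 1, 4, 5, 6 and 10 has inactivity periods of 2, 3 and 1 months, so both the mean and the median length are 2.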
And now for the median length of inactivity periods:

F34570602: 06_MedianLengthInactiveVSLeftWikidata.png <https://phabricator.wikimedia.org/F34570602>

The descriptive statistics:

**Left Wikidata:**

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1.00   13.00   30.00   36.52   56.00  105.00

**Active on Wikidata:**

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
  1.000   1.000   2.000   3.609   4.000  77.000      88

**N.B.** `NA's` represent those editors who did not have a single inactive month in their revision history.

My present conclusions:

- The Lindy Effect holds in Wikidata editing: the longer the past editing behavior, the higher the chances that it will persist;
- As expected, currently active Wikidata editors made more revisions in the past in comparison to those who are now inactive;
- Currently active Wikidata editors have, on average, shorter periods of inactivity (measured in months) relative to those who are now inactive.

**What is missing from this analysis?**

This is missing from T282563#7186386 <https://phabricator.wikimedia.org/T282563#7186386>:

> ... user behavior on talk pages because it takes another ETL run through the `wmf.mediawiki_history` table;

I will try to produce that dataset tonight, join it with the existing data, and report on it. Sincerely: I do not expect any finding to emerge other than that active editors make more revisions on talk pages.

@Jan_Dittrich I did not find enough time to focus on all the papers that you have shared (and for which I am thankful). I will focus on them tonight, as much as I can (there are other tickets calling for my attention too), and then get in touch about our idea to publish this finding. Thank you for the very inspirational question that you have raised here in relation to the Lindy Effect!
TASK DETAIL https://phabricator.wikimedia.org/T282563
