GoranSMilovanovic added a comment.
@Jan_Dittrich **Do we really find a Lindy effect in the Wikidata account age
distribution?**
**Assumption.** As demonstrated in Eliazar, Iddo (November 2017). "Lindy's
Law". Physica A: Statistical Mechanics and Its Applications. 486: 797–805, if
the Lindy effect holds, then the survival function of the account age is
Pareto. So we need to test whether the Wikidata account age follows a power
law.
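For reference, a sketch of the continuous power-law model in the density
convention of Clauset et al., where `alpha` is the exponent of the density
and the survival function therefore has exponent `alpha - 1`. Account ages
in months are discrete, so this is only the continuous approximation. The
last line is one common way to express the Lindy property: the expected
remaining lifetime grows linearly with the age already reached.

```latex
p(x) = \frac{\alpha - 1}{x_{\min}} \left( \frac{x}{x_{\min}} \right)^{-\alpha},
\qquad
S(x) = \Pr(X > x) = \left( \frac{x}{x_{\min}} \right)^{-(\alpha - 1)},
\qquad x \ge x_{\min}

\mathbb{E}\left[ X^{k} \right] < \infty
\iff k < \alpha - 1

\mathbb{E}\left[ X - t \mid X > t \right] = \frac{t}{\alpha - 2},
\qquad \alpha > 2, \; t \ge x_{\min}
```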
Now, this is a bit tricky, so let's go one step at a time:
- the data are the frequencies of Wikidata account ages;
- the age of an account is the number of months from registration until the
first sequence of five inactive months (at which point we declare an editor
officially inactive, by convention);
- bots are filtered out in the ETL phase;
- the power-law fit is done in R with {poweRlaw}, documentation:
https://cran.r-project.org/web/packages/poweRlaw/index.html, which is
essentially based on the power-law estimates derived in Clauset, Shalizi &
Newman (2007). "Power-law distributions in empirical data":
https://arxiv.org/pdf/0706.1062.pdf
- the `x_min` of the account age is estimated to be `153` with an `alpha` of
`2.217158`, indicating power-law behavior in which the second and higher-order
moments diverge (also see
Gillespie (2017). Fitting Heavy Tailed Distributions: The poweRlaw Package:
https://cran.r-project.org/web/packages/poweRlaw/vignettes/d_jss_paper.pdf,
page 3);
- if `x_min` is instead fixed at the de facto minimum of the account age (which
is `69`; no `x_min` estimation), then the estimate of `alpha` is `1.626341`,
indicating power-law behavior in which all moments, including the mean,
diverge.
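A minimal Python sketch of the two fitting variants listed above (`x_min`
fixed versus `x_min` estimated). This is not the R code used for the
analysis: it uses the continuous maximum-likelihood estimator and the
Kolmogorov-Smirnov scan described in Clauset et al., whereas {poweRlaw}, to
my understanding, fits a discrete model for count data, so the numbers would
differ. The data are synthetic and all function names are illustrative.

```python
import math
import random


def fit_alpha(data, x_min):
    """Continuous power-law MLE for the tail x >= x_min."""
    tail = [x for x in data if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    return 1.0 + len(tail) / log_sum


def ks_distance(data, x_min, alpha):
    """Largest gap between the empirical and the fitted CDF on the tail."""
    tail = sorted(x for x in data if x >= x_min)
    n = len(tail)
    worst = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / x_min) ** (1.0 - alpha)
        worst = max(worst, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return worst


def estimate_x_min(data, min_tail=50):
    """Scan candidate cutoffs; keep the one with the smallest KS distance."""
    best = None
    for x_min in sorted(set(data)):
        tail_size = sum(1 for x in data if x >= x_min)
        if tail_size < min_tail:
            break  # candidates are ascending, so the tail only gets smaller
        alpha = fit_alpha(data, x_min)
        distance = ks_distance(data, x_min, alpha)
        if best is None or distance < best[2]:
            best = (x_min, alpha, distance)
    return best


def sample_power_law(n, x_min, alpha, rng):
    """Inverse-transform sampling from a continuous power law."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


# Synthetic stand-in for the account ages (NOT the Wikidata data).
rng = random.Random(42)
ages = sample_power_law(500, x_min=69, alpha=2.2, rng=rng)
print("alpha with x_min fixed at 69:", fit_alpha(ages, 69))
print("estimated (x_min, alpha, KS distance):", estimate_x_min(ages))
```

The fitted `alpha` then determines which moments exist under the model: the
k-th moment is finite only if `k < alpha - 1`.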
**However**, following the recommendations of the authors of {poweRlaw}, the
bootstrap analysis shows that the power law is not really present in either of
the two cases (see the Hypothesis Testing framework implemented in {poweRlaw},
2. Examples using the poweRlaw package:
https://cran.r-project.org/web/packages/poweRlaw/vignettes/b_powerlaw_examples.pdf,
pages 4 - 5).
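To make the logic of that bootstrap test concrete, here is a simplified
Python sketch, again not the code used for the analysis. It assumes a
continuous power law and keeps `x_min` fixed, whereas the procedure in
Clauset et al. (and `bootstrap_p` in {poweRlaw}) also re-estimates `x_min`
for every simulated data set and resamples the observations below it. The
p-value is the share of simulated data sets that fit the power law worse
than the observed data; Clauset et al. suggest ruling out the power law when
`p <= 0.1`.

```python
import math
import random


def fit_alpha(tail, x_min):
    """Continuous power-law MLE; every value in `tail` must be >= x_min."""
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


def ks_distance(tail, x_min, alpha):
    """Largest gap between the empirical and the fitted CDF."""
    ordered = sorted(tail)
    n = len(ordered)
    worst = 0.0
    for i, x in enumerate(ordered):
        cdf = 1.0 - (x / x_min) ** (1.0 - alpha)
        worst = max(worst, abs(cdf - i / n), abs(cdf - (i + 1) / n))
    return worst


def bootstrap_p_value(data, x_min, n_sims=200, seed=0):
    """Share of simulated power-law samples with a KS distance >= observed."""
    rng = random.Random(seed)
    tail = [x for x in data if x >= x_min]
    alpha = fit_alpha(tail, x_min)
    d_obs = ks_distance(tail, x_min, alpha)
    exceed = 0
    for _ in range(n_sims):
        # Simulate from the fitted model, then refit before measuring the gap.
        sim = [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in tail]
        sim_alpha = fit_alpha(sim, x_min)
        if ks_distance(sim, x_min, sim_alpha) >= d_obs:
            exceed += 1
    return exceed / n_sims


# Synthetic example: uniform data are clearly not power-law distributed.
demo_rng = random.Random(1)
not_power_law = [demo_rng.uniform(69, 138) for _ in range(1000)]
print("p-value for uniform data:", bootstrap_p_value(not_power_law, 69))
```

A small p-value means that genuine power-law samples almost never deviate
from the fitted curve as much as the observed data do, which is the sense in
which the power law is "not really present".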
**So, it does not seem to be a case of the Lindy effect after all.** The code
will be shared on Gerrit soon and referenced from Phab.
I would also feel at least a bit more confident than I am now if @MGerlach
could find some time to take a look at the data.
It is methodologically problematic, at least in my view, to try to establish
whether the power law (and thus Lindy) holds for the total number of
//active months// (obtained by neglecting all inactive months in the editor's
revision history). However, we can try.
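To illustrate how the two measures differ, here is a small Python sketch. It
assumes a hypothetical representation of an editor's history as one boolean
per month since registration, and it assumes that the account age stops at
the start of the first run of five inactive months; neither detail is taken
from the ETL code.

```python
def account_age(activity, inactive_run=5):
    """Months from registration to the start of the first long inactive run."""
    run = 0
    for month, active in enumerate(activity):
        run = 0 if active else run + 1
        if run == inactive_run:
            return month + 1 - inactive_run
    # No such run: the editor is still counted as active at the end.
    return len(activity)


def active_months(activity):
    """Total number of months with at least one revision."""
    return sum(1 for active in activity if active)


# Active, one gap, active twice, five inactive months, then a return.
history = [True, False, True, True,
           False, False, False, False, False,
           True, True]
print(account_age(history))    # 4
print(active_months(history))  # 5
```

In this example the account age ends at month 4, while the active-month
count also includes the two months after the editor's return.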
TASK DETAIL
https://phabricator.wikimedia.org/T282563
To: GoranSMilovanovic
Cc: Pablo, Mohammed_Sadat_WMDE, Tobi_WMDE_SW, MGerlach, awight, WMDE-leszek,
Manuel, Lydia_Pintscher, Aklapper, Jan_Dittrich, Invadibot, maantietaja,
Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer,
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331