GoranSMilovanovic added a comment.

  @Jan_Dittrich **Do we really find a Lindy effect in the Wikidata account age 
distribution?**
  
  **Assumption.** As demonstrated in Eliazar, Iddo (November 2017). "Lindy's 
Law". Physica A: Statistical Mechanics and Its Applications. 486: 797–805, if 
the Lindy effect holds, then the survival function of the account age is Pareto. 
So we need to test whether the Wikidata account age follows a power law.
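
To make the assumption concrete, here is a minimal sketch of why a Pareto survival function corresponds to Lindy-type behavior: the probability of surviving to a multiple of the current age does not depend on the current age. It assumes the density-exponent convention of Clauset et al. (density proportional to `x^(-alpha)`, so the survival exponent is `alpha - 1`); the function names and parameter values are hypothetical and chosen only for illustration.

```python
def pareto_survival(x, x_min, alpha):
    """P(X > x) for a continuous power law with density exponent alpha > 1."""
    if x < x_min:
        return 1.0
    return (x / x_min) ** (-(alpha - 1.0))


def conditional_survival(t, factor, x_min, alpha):
    """P(X > factor * t | X > t), for t >= x_min and factor >= 1."""
    return pareto_survival(factor * t, x_min, alpha) / pareto_survival(t, x_min, alpha)


# Hypothetical parameters, not estimates from the Wikidata data.
X_MIN, ALPHA = 1.0, 2.5

for age in (2.0, 20.0, 200.0):
    # Under a Pareto law the chance of doubling the current age
    # is the same at every age: factor ** (-(alpha - 1)).
    print(age, conditional_survival(age, 2.0, X_MIN, ALPHA))
```

If the account ages did not follow a power law, this conditional probability would change with the current age, which is why testing for a power law is a reasonable proxy for testing Lindy.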
  
  Now, this is a bit tricky, so let's go one step at a time:
  
  - the data are the frequencies of Wikidata account ages;
  
  - the age of the account is the number of months from registration until 
the first sequence of five inactive months (at which point, by convention, we 
declare an editor officially inactive);
  
  - Bots are filtered out in the ETL phase;
  
  - the power law is fitted in R with {poweRlaw} (documentation: 
https://cran.r-project.org/web/packages/poweRlaw/index.html), which is 
essentially based on the power-law estimators derived in Clauset, Shalizi & 
Newman (2007). "Power-law distributions in empirical data": 
https://arxiv.org/pdf/0706.1062.pdf;
  
  - the `x_min` of the account age is estimated to be `153` with an `alpha` of 
`2.217158`, indicating power-law behavior in which the second and higher-order 
moments diverge (also see Gillespie (2017). Fitting Heavy Tailed Distributions: 
The poweRlaw Package: 
https://cran.r-project.org/web/packages/poweRlaw/vignettes/d_jss_paper.pdf, 
page 3);
  
  - if the `x_min` is set to the de facto minimum of the account age (which is 
`69`; no `x_min` estimation), then the estimate of `alpha` is `1.626341`, 
i.e. power-law behavior with all moments diverging.
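
As a rough illustration of the estimates above, the sketch below uses the closed-form maximum-likelihood estimator for a continuous power law with a fixed `x_min` (Clauset et al.) and the rule that the moment of order `m` diverges when `m >= alpha - 1`. This is an assumption-laden approximation: {poweRlaw} may fit a discrete model to month counts, so its numbers need not match this continuous formula, and the function names here are hypothetical.

```python
import math


def fit_alpha(values, x_min):
    """Continuous power-law MLE of alpha for the tail values >= x_min."""
    tail = [v for v in values if v >= x_min]
    log_sum = sum(math.log(v / x_min) for v in tail)
    if log_sum <= 0:
        raise ValueError("need at least one value strictly above x_min")
    return 1.0 + len(tail) / log_sum


def first_divergent_moment(alpha):
    """Smallest integer m >= 1 whose moment E[X^m] is infinite (m >= alpha - 1)."""
    return max(1, math.ceil(alpha - 1.0))


# The two alpha estimates reported in this comment.
for alpha in (2.217158, 1.626341):
    print(alpha, "first divergent moment:", first_divergent_moment(alpha))
```

With `alpha` between 2 and 3 the mean is finite but the variance is not; with `alpha` below 2 even the mean diverges, which is what "all moments diverging" refers to.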
  
  **However**, following the recommendations of the authors of {poweRlaw}, the 
bootstrap analysis shows that the power law is not really present in either of 
the two cases (see the hypothesis testing framework implemented in {poweRlaw}, 
"2. Examples using the poweRlaw package": 
https://cran.r-project.org/web/packages/poweRlaw/vignettes/b_powerlaw_examples.pdf,
 pages 4-5).
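
For readers unfamiliar with this kind of test, here is a simplified sketch of a bootstrap goodness-of-fit check in the spirit of Clauset et al.: fit the power law, measure the Kolmogorov-Smirnov distance to the data, then see how often synthetic power-law samples fit worse. It is not the {poweRlaw} implementation. The assumptions are a continuous model, a fixed `x_min` that is not re-estimated in each replicate, and simulation of the tail only; all names are hypothetical.

```python
import math
import random


def fit_alpha(tail, x_min):
    """Continuous power-law MLE of alpha; every value in tail is >= x_min."""
    return 1.0 + len(tail) / sum(math.log(v / x_min) for v in tail)


def ks_distance(tail, x_min, alpha):
    """Largest gap between the empirical CDF and the fitted power-law CDF."""
    xs = sorted(tail)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        model = 1.0 - (x / x_min) ** (-(alpha - 1.0))
        # Compare against the empirical CDF just before and just after x.
        worst = max(worst, abs(model - i / n), abs(model - (i + 1) / n))
    return worst


def bootstrap_p_value(values, x_min, n_sims=200, seed=0):
    """Share of synthetic power-law samples that fit worse than the data."""
    rng = random.Random(seed)
    tail = [v for v in values if v >= x_min]
    alpha = fit_alpha(tail, x_min)
    observed = ks_distance(tail, x_min, alpha)
    exceed = 0
    for _ in range(n_sims):
        # Inverse-transform sampling from the fitted power law.
        synthetic = [
            x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in tail
        ]
        sim_alpha = fit_alpha(synthetic, x_min)
        if ks_distance(synthetic, x_min, sim_alpha) >= observed:
            exceed += 1
    return exceed / n_sims


# Made-up data that is clearly not a power law (uniform on [1, 2]).
demo_rng = random.Random(1)
not_power_law = [1.0 + demo_rng.random() for _ in range(500)]
print("p-value:", bootstrap_p_value(not_power_law, x_min=1.0))
```

Clauset et al. suggest ruling out the power law when the p-value is at most 0.1; a small p-value means the data deviate from the fitted power law more than sampling noise alone would explain.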
  
  **So, it does not seem to be a case of the Lindy effect after all.** The code 
will be shared on Gerrit soon and referenced from Phab.
  
  I would also feel at least a bit more confident than I am now if @MGerlach 
could find some time to take a look at the data.
  
  It is methodologically problematic, at least in my view, to try to establish 
whether the power law (and thus Lindy) holds for the total number of 
//active months// (obtained by ignoring all inactive months in the editor's 
revision history). However, we can try.
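
To keep the two measures apart, here is a small sketch of how each could be computed from a per-month activity record. The input format (one boolean per month since registration) and the function names are assumptions for illustration, not the actual ETL code. The sketch also assumes that the account age ends where the first run of five inactive months begins.

```python
def account_age(activity, gap=5):
    """Months from registration to the start of the first run of `gap` inactive months.

    Returns len(activity) if no such run occurs (the account is still active).
    """
    run = 0
    for month, active in enumerate(activity):
        run = 0 if active else run + 1
        if run == gap:
            # Exclude the inactive run itself from the age.
            return month + 1 - gap
    return len(activity)


def active_months(activity):
    """Total number of active months, ignoring every inactive month."""
    return sum(1 for active in activity if active)


# Hypothetical editor: some early activity, a five-month gap, then a return.
history = [True, False, False, True, False, True] + [False] * 5 + [True]
print("account age:", account_age(history))
print("active months:", active_months(history))
```

The account age stops at the first long gap, while the active-month count also includes activity after the gap, so the two quantities can diverge for editors who return.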

TASK DETAIL
  https://phabricator.wikimedia.org/T282563
