GoranSMilovanovic added a comment.
@awight

> This line can be removed,
> event_user_id != 0

Indeed.

> why would we need to check the historical column? If the user was classified as a bot at a time but now is not, shouldn't we respect the updated classification?

Because in the data collection we want to be conservative and make sure that if we talk about human and not bot editors we certainly talk about human and not bot editors. To put it simply: it reduces the uncertainty in relation to our data.

> The text says your filter will include the Item namespace (0), but the query only includes the talk pages: page_namespace = 1. Maybe this explains why there's such a low user count? I would have expected to see virtually all non-bot users who have edited wikidata.

Good catch, thank you! I will re-run the ETL now.

> Can you share more about the query that produced reactivations.csv? I can't tell from the information provided what counts as a "period of inactivity".

R code. If you still would like me to share it with you I will open a Gerrit repo for this ticket.

TASK DETAIL
https://phabricator.wikimedia.org/T282563

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: GoranSMilovanovic
Cc: awight, WMDE-leszek, Manuel, Lydia_Pintscher, Aklapper, Jan_Dittrich, Invadibot, maantietaja, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
