chelsyx added a comment. |
Hi @Nuria , the numbers I showed above are cumulative sum at the end of each month, while the numbers you talked about are newly uploads for each month. From my query, for Dec 2016, the number of newly uploaded files by bots are 392,566, by users = 392,786. This is closed to what is shown on https://stats.wikimedia.org/wikispecial/EN/TablesWikipediaCOMMONS.htm.
I think the differences came from two sources:
1, I assume the numbers on https://stats.wikimedia.org/wikispecial/EN/TablesWikipediaCOMMONS.htm are computed at the end of each month and files could be deleted afterwards. For the numbers above, I used the image table and only counts the files that are still there on Oct 12, 2017.
2, According to commons bots, not all accounts being operated as bots has a bot flag, so I also include accounts with the keywords "bot_flag" or "bots" (see the query below).
Query for counting newly uploaded files on commons:
SELECT LEFT(img_timestamp, 6) AS yr_month, user_group, COUNT(*) AS n_files FROM ( -- Get active/inactive bots SELECT ug_user AS user_id, ug_group AS user_group FROM user_groups WHERE ug_group = 'bot' UNION SELECT ufg_user AS user_id, ufg_group AS user_group FROM user_former_groups WHERE ufg_group = 'bot' UNION -- Get user ids with bot categories in their user pages SELECT user.user_id, 'bot' AS user_group FROM user INNER JOIN ( -- all user page names with bot category SELECT REPLACE(page.page_title, '_', ' ') AS user_name FROM page INNER JOIN ( -- page ids with bot categories SELECT DISTINCT cl_from AS page_id FROM categorylinks WHERE cl_to REGEXP '_(bot_flag|bots)(_|$)' AND cl_type = 'page' ) AS bot_cat ON page.page_id=bot_cat.page_id WHERE page_namespace = 2 ) AS bot_name ON user.user_name=bot_name.user_name ) AS bots RIGHT JOIN image ON bots.user_id = image.img_user GROUP BY LEFT(img_timestamp, 6), user_group;
Cc: Nuria, Liuxinyu970226, Aklapper, mpopov, chelsyx, Abit, SandraF_WMF, Ramsey-WMF, Capt_Swing, debt, E1presidente, Jmmuguerza, GoranSMilovanovic, QZanden, EBjune, Acer, Avner, Gehel, FloNight, Susannaanas, Wikidata-bugs, PKM, Base, matthiasmullie, aude, Ricordisamoa, Fabrice_Florin, Raymond, Steinsplitter, Mbch331
_______________________________________________ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs