[Wikitech-l] The Revision Scoring weekly update

Aaron Halfaker Tue, 07 Feb 2017 12:31:29 -0800

Hey folks!

This combines the 32nd through 41st weekly updates from the revision scoring
team to this mailing list.  We've been busy, but our reporting fell
behind.  So here I am getting us caught up!  This is going to be a long
one.  Bear with me.

One major thing we've done in the past few weeks is draft and present a
proposal to increase the resourcing for the ORES project in the 2017 Fiscal
Year. Currently, we're just one fully funded staff member (halfak) and one
partially funded contractor (Amir1) working with a bunch of volunteers.
We're proposing to staff the team with full-time engineers, a liaison and a
tech writer. See a full draft of our proposal and pitch deck here:
https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Scoring_Platform_team

*New development:*

We've expanded support for our "editquality" models to more wikis and
improved the performance of some of the models.

- We scaled up the number of observations for Indonesian Wikipedia to
100k[1]

- We added language support for Romanian[2] and built the basic
"reverted" model[3]

- We trained and tested "damaging" and "goodfaith" models for Czech
Wikipedia[4]

- We implemented some params in our training utilities to control memory
usage[5]

- We deployed all of the above to Wikimedia Labs[6]. A production
deployment is coming soon.

Prompted by the 2016 community wishlist[7], we've implemented a
"draftquality" model for evaluating new page creations.

- We researched deletion reasons on English Wikipedia[8] and created a
labeled dataset using the deletion log.

- We engineered a set of features to predict the quality of new
articles[9] and built a model[10]

- We generated a set of datasets[11,12,13] to make it easier for
volunteers and external researchers to help us audit the performance of the
model.

- We deployed the model on WMFLabs[14] and announced its presence to a
few interested patrollers in English Wikipedia

- We've started the process of deploying the model in production[15,16]

We completed a project exploring the use of advanced natural-language
processing strategies to extract new signal about vandalism, article
quality and problematic new articles. Regretfully, memory issues prevent
us from trivially putting this into production[17], so we're looking into
alternative strategies[18].

- We implemented a strategy for extracting sentences from Wikitext[19]

- We built sentence banks for personal attacks[20], vandalism[21],
spam[22], and Featured Articles[23].

- We built PCFG-based models[24] and analyzed their ability to
differentiate[25]

We've been working with the Collaboration Team[26] on their Edit Review
Improvements project[27]

- We defined and implemented a set of new precision-based test
statistics that will inform thresholds used in their new user interface[28]

- But we also decided to continue to report recall-based test statistics
as well[29]
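
To illustrate the difference between the precision-based and recall-based
statistics in the two items above, here is a minimal Python sketch. It is
not the code from our repositories: the function names, the toy scores and
the labels are all made up for illustration, and it assumes each edit has a
"damaging" probability and a true/false label.

```python
def stats_at(scores, labels, threshold):
    """Precision and recall when flagging every score >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and l for p, l in zip(predicted, labels))
    fp = sum(p and not l for p, l in zip(predicted, labels))
    fn = sum((not p) and l for p, l in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def threshold_for_precision(scores, labels, min_precision):
    """Precision-based: best recall among thresholds meeting a precision floor."""
    best = None
    for t in sorted(set(scores)):
        precision, recall = stats_at(scores, labels, t)
        if precision >= min_precision and (best is None or recall > best[1]):
            best = (t, recall)
    return best  # (threshold, recall) or None if the floor is unreachable


def threshold_for_recall(scores, labels, min_recall):
    """Recall-based: best precision among thresholds meeting a recall floor."""
    best = None
    for t in sorted(set(scores)):
        precision, recall = stats_at(scores, labels, t)
        if recall >= min_recall and (best is None or precision > best[1]):
            best = (t, precision)
    return best  # (threshold, precision) or None if the floor is unreachable


# Toy data, invented for this example.
scores = [0.1, 0.2, 0.4, 0.6, 0.8, 0.9]
labels = [False, False, True, False, True, True]

print(threshold_for_precision(scores, labels, 0.9))
print(threshold_for_recall(scores, labels, 1.0))
```

A precision floor suits a filter that should rarely flag good edits, while
a recall floor suits a filter that should miss very little damage. That is
why both kinds of statistic are worth reporting.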

Based on advice from engineers on the Collaboration Team, we've begun the
process of converting Wiki labels[30] to a stand-alone tool in labs.

- We generalized the gadget interface so that it can handle all
languages/wikis[31]

- We implemented a means to auto-configure wikis based on the
dbname[32,33] and that allowed us to simplify configuration[34]

- We also implemented some performance improvements with minification
and bundling[35]

*Labeling:*

In the past few weeks, we've set up labeling campaigns for a few wikis.

- We deployed an edit types campaign for Catalan Wikipedia[36]

- We deployed an edit quality campaign for Chinese[37] and Romanian[38]
Wikipedias

- We deployed a new type of campaign for English Wikipedia --
"discussion quality" asks editors to label talk posts as "toxic" or not[39]

*Maintenance and robustness:*

We've solved a large set of logging and wikibase-compatibility problems,
and we've made minor improvements to performance.

- We addressed a few bugs in the ORES Review Tool[40,44]

- We quieted some errors from our logging in ORES[41,45]

- We updated our code to work with a wikibase schema change[42]

- We fixed a language fallback pattern in Wiki labels[43]

- We set up monitoring on ORES database disk sizes[46]

- We fixed some issues with scap, phabricator's diffusion and other
supporting systems so that we can continue deploying to beta labs[47]

- We split our assets repo so that we can let our WMFLabs deploy get
ahead of the Production deployment[48]

- ORES can now minify its JSON responses[49]

- We identified a bug in flask-assets and worked around it in our local
installation of Wiki labels[50]
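
For anyone unfamiliar with what minifying a JSON response means (see the
item above), here is a small Python sketch using only the standard library.
It is not ORES's actual implementation, and the response structure shown is
a hypothetical one made up for the example.

```python
import json

# Hypothetical score document, invented for illustration only.
response = {
    "enwiki": {
        "scores": {
            "123456": {
                "damaging": {"prediction": False, "probability": 0.03}
            }
        }
    }
}

# Human-readable form: indentation and spaces after separators.
pretty = json.dumps(response, indent=2)

# Minified form: no indentation and no whitespace after "," or ":".
minified = json.dumps(response, separators=(",", ":"))

print(len(pretty), len(minified))
print(minified)
```

Both strings decode to the same data, so clients see no difference apart
from a smaller payload.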

*Communications and outreach:*

We had a big presence at the Wikimedia Developer summit, we've drafted a
resourcing proposal, and we've made some announcements about upcoming plans
for the ORES Review tool.

- We facilitated the "Artificial Intelligence to build and navigate
content" track[51]

- We ran a session for building an AI wishlist[52] and captured notes
about more than 20 new AI proposals on a new tag in phabricator[53]

- We also ran a session discussing the ethics and dangers of advanced
algorithms mediating our processes[54]

- We helped facilitate a session about where to surface current AIs in
Wikimedia Projects[55]

- We held a discussion with Legal about licensing labeled data that
comes out of Wiki labels[56] and updated the interface to state the CC0
license clearly[57]

- We worked with the Reading Infrastructure team to analyze the
consumption of "oresscores" through the MediaWiki API[58]

- We drafted a pitch for increasing the resources for our team[59]

- We worked with the Collaboration team to announce that they'll be
experimenting with a new RecentChanges filtering strategy in the ORES
Review Tool[60,61]

1. https://phabricator.wikimedia.org/T147107 -- Scale up the number of
observations for idwiki to 100k
2. https://phabricator.wikimedia.org/T152482 -- Add language support for
Romanian
3. https://phabricator.wikimedia.org/T156504 -- Build reverted model for
Romanian Wikipedia
4. https://phabricator.wikimedia.org/T156492 -- Train and test
damaging/goodfaith models for Czech Wikipedia
5. https://phabricator.wikimedia.org/T156645 -- Add '--workers' param to
cv_train utility
6. https://phabricator.wikimedia.org/T154856 -- Clean up dependencies and
deploy newest ORES & Models in labs
7.
https://meta.wikimedia.org/wiki/2016_Community_Wishlist_Survey/Categories/Moderation_tools#Quality_scoring_for_new_articles
8.
https://meta.wikimedia.org/wiki/Research:Automated_classification_of_draft_quality
9. https://phabricator.wikimedia.org/T148580 -- Build feature set for draft
quality model
10. https://phabricator.wikimedia.org/T148038 -- [Epic] Build draft quality
model (spam, vandalism, attack, or OK)
11. https://phabricator.wikimedia.org/T148581 -- Extract features for
deleted page (draft quality model)
12. https://phabricator.wikimedia.org/T156642 -- Generate scored dataset
for 2016-08 - 2017-01
13. https://phabricator.wikimedia.org/T156643 -- Generate extracted
features for 2016-08 - 2017-01
14. https://phabricator.wikimedia.org/T155576 -- Deploy draftquality models
to WMFLabs
15. https://phabricator.wikimedia.org/T156835 -- Create package stuff for
draftquality
16. https://phabricator.wikimedia.org/T157049 -- Create new repo:
research-ores-draftquality
17. https://phabricator.wikimedia.org/T148867#2816566 -- Memory footprint
is enormous!
18. https://phabricator.wikimedia.org/T155111 -- [Spike] Investigate use of
Apertium LTtoolbox API in labs/production
19. https://phabricator.wikimedia.org/T148867 -- Implement sentences
datascources
20. https://phabricator.wikimedia.org/T148035 -- Sentence bank for personal
attacks
21. https://phabricator.wikimedia.org/T148034 -- Sentence bank for vandalism
22. https://phabricator.wikimedia.org/T148032 -- Sentence bank for spam
23. https://phabricator.wikimedia.org/T148033 -- Sentence bank for Featured
Articles
24. https://phabricator.wikimedia.org/T148037 -- Generate PCFG sentence
models
25. https://phabricator.wikimedia.org/T151819 -- Analyze differentiation of
FA, Spam, Vandalism, and Attack models/sentences.
26. https://www.mediawiki.org/wiki/Collaboration
27. https://www.mediawiki.org/wiki/Edit_Review_Improvements
28. https://phabricator.wikimedia.org/T151970 -- Implement new
precision-based test stats for editquality models
29. https://phabricator.wikimedia.org/T156644 -- Restore
recall-threshold-based metrics for editquality models.
30. https://meta.wikimedia.org/wiki/Wiki_labels
31. https://phabricator.wikimedia.org/T151120 -- Generalize standalone
gadget interface
32. https://phabricator.wikimedia.org/T154433 -- Auto config wikilabels
using dbnames
33. https://phabricator.wikimedia.org/T155439 -- Use module loader to load
JS/CSS from wikis
34. https://phabricator.wikimedia.org/T154693 -- Remove host from
wikilabels config -- infer from request
35. https://phabricator.wikimedia.org/T154122 -- Minification and bundling
for wikilabels assets
36. https://phabricator.wikimedia.org/T152965 -- Deploy cawiki edit types
campaign
37. https://phabricator.wikimedia.org/T152561 -- Deploy zhwiki edit quality
campaign
38. https://phabricator.wikimedia.org/T156357 -- Deploy edit quality
campaign for Romanian Wikipedia
39. https://phabricator.wikimedia.org/T156303 -- Deploy "Discussion
quality" campaign in wikilabels
40. https://phabricator.wikimedia.org/T152542 -- Undefined method
ORES\Hooks::getDamagingThreshold()
41. https://phabricator.wikimedia.org/T146681 -- Quiet TimeoutError in
celery logging
42. https://phabricator.wikimedia.org/T154168 -- Quantity changes broke ORES
43. https://phabricator.wikimedia.org/T154897 -- Chinese translations are
not being loaded
44. https://phabricator.wikimedia.org/T155500 -- Fatal exception of type
"DBQueryError" on sorting ORES contributions
45. https://phabricator.wikimedia.org/T157078 -- ores logspam: Model
contains an error
46. https://phabricator.wikimedia.org/T155482 -- Set up monitoring for ORES
redis database
47. https://phabricator.wikimedia.org/T157135 -- Fix broken beta-labs deploy
48. https://phabricator.wikimedia.org/T154436 -- Split wheels repo into
Prod/WMFLabs branches and maintain independence
49. https://phabricator.wikimedia.org/T155931 -- Minify json responses
50. https://phabricator.wikimedia.org/T154865 -- assets url return empty
string
51. https://phabricator.wikimedia.org/T147708 -- Artificial Intelligence to
build and navigate content
52. https://phabricator.wikimedia.org/T147710 -- What should an AI do you
for you? Building an AI Wishlist.
53. https://phabricator.wikimedia.org/tag/artificial-intelligence/
54. https://phabricator.wikimedia.org/T147929 -- Algorithmic dangers and
transparency -- Best practices
55. https://phabricator.wikimedia.org/T148690 -- Where to surface AI in
Wikimedia Projects
56. https://phabricator.wikimedia.org/T145024 -- Licensing of labeled data
57. https://phabricator.wikimedia.org/T156052 -- Add notice of CC0 status
of Wikilabels data to UI & Docs
58. https://phabricator.wikimedia.org/T156273 -- Identify baseline api.php
Action API consumption
59. https://phabricator.wikimedia.org/T157470 -- Draft proposal/pitch for
ORES resourcing
60. https://phabricator.wikimedia.org/T150855 -- Gather assets for post
about ORES review tool including ERI filters
61. https://phabricator.wikimedia.org/T150858 -- Post about ORES review
tool including ERI filters

Sincerely,
Aaron from the Revision Scoring Scoring Platform team
_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
