Hey,

This is the 24th and 25th weekly update from the revision scoring team that we
have sent to this mailing list. We skipped a week due to travel and other
work.

Maintenance and robustness:

   - We improved the performance of RecentChanges filtering in the ORES
   extension[1]


   - We built and ran a maintenance script to clean up duplicate cached
   data for the ORES extension[2,3]


   - We updated the editquality models for the new version of revscoring
   (1.3.0)[4] and made some upstream changes to json2tsv to make that easier[5]


   - We quieted down some of our error reporting so that our logs take up
   less space[6]


Datasets:

   - We generated a dataset that uses the "wp10" prediction model to assess
   article quality in monthly intervals for English, French, and Russian
   Wikipedia[7].  This should enable new research into the quality dynamics of
   these wikis.


   - We generated a dataset of vandalism, spam, and attack page creations
   for building a new "draft quality" model[8]


Communication:

   - We presented on transparent/open AI development practices around ORES
   at the Association of Internet Researchers[9]


New development:

   - We've made substantial progress towards adding ORES data to
   MediaWiki's api.php endpoints with rcshow=oresreview[10] and rvprop=ores[11]
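
As a rough illustration of what such queries might look like once this work
lands, here is a minimal sketch that only builds the request URLs. The
parameter names rcshow=oresreview and rvprop=ores come from the task titles
[10,11] and may change before release; the wiki host, the page title, and the
build_query helper are assumptions made up for this example.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration; any wiki running the ORES extension
# would expose its own api.php.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_query(params):
    """Return a full api.php URL for the given query parameters (hypothetical helper)."""
    base = {"action": "query", "format": "json"}
    base.update(params)
    return API_URL + "?" + urlencode(base)


# Recent changes that ORES flags as needing review (parameter name per [10]).
rc_url = build_query({"list": "recentchanges", "rcshow": "oresreview"})

# Revisions of a page with ORES data attached (parameter name per [11]).
rv_url = build_query({"prop": "revisions", "titles": "Main Page", "rvprop": "ores"})

print(rc_url)
print(rv_url)
```

Nothing is sent over the network here; the URLs are only printed so the
parameter placement is easy to see.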


1. https://phabricator.wikimedia.org/T146111 -- hidenondamaging=1 query is
extremely slow on enwiki
2. https://phabricator.wikimedia.org/T145356 -- Ensure ORES data violating
constraints do not affect production
3. https://phabricator.wikimedia.org/T145503 -- Build a maintenance script
to clean up duplicate data
4. https://phabricator.wikimedia.org/T146410 -- Update editquality for
revscoring 1.3.0
5. https://phabricator.wikimedia.org/T146939 -- Add type decoding support
to tsv2json
6. https://phabricator.wikimedia.org/T146680 -- Quiet result.get Warning in
tasks
7. https://phabricator.wikimedia.org/T145655 -- Generate monthly article
quality dataset
8. https://phabricator.wikimedia.org/T135644 -- Generate spam and vandalism
new page creation dataset
9. https://phabricator.wikimedia.org/T147706 -- Present about ORES
transparency at AoIR
10. https://phabricator.wikimedia.org/T143616 -- Introduce
rcshow=oresreview and similar ones
11. https://phabricator.wikimedia.org/T143614 -- Introduce ORES rvprop

Sincerely,
Aaron from the Revision Scoring team
_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
