Hey,

It's on arXiv: https://arxiv.org/abs/1703.03861
Any feedback is welcome :)
Best

On Fri, Mar 17, 2017 at 3:37 PM Richard Nevell <richard.nev...@wikimedia.org.uk> wrote:

> Having guidance on quality helps people learning about Wikidata understand
> what they should be aiming for.
>
> The paper on vandalism detection in Wikidata sounds interesting, where can
> I find it?
>
> Richard
>
> On 17 March 2017 at 09:09, Gerard Meijssen <gerard.meijs...@gmail.com> wrote:
>
> > Hoi,
> > I noticed the notion about "quality in Wikidata". The approach is very
> > much in line with what is the norm in Wikipedia. This is not the right
> > approach for Wikidata. Many of the items in Wikidata can be of high
> > "quality", i.e. the statements have a source and there are enough
> > labels, but the true value of these items is in the use of these items
> > as statements in other items (for instance, a university indicates that
> > someone studied there). Another quality point is that for authors a VIAF
> > statement allows for the linking in Wikipedias in external sources. This
> > is of high importance, it makes Wikidata useful and, if that is not a
> > quality consideration, what is?
> >
> > One other aspect of Wikidata is that it is still highly immature. Just
> > consider the statistics for labels and statements [1]. This is only the
> > first month where less than 10% of our items have no statement. We talk
> > about quality but quality should have a practical meaning. Just saying
> > this or that item is so good, it makes for stamp collecting. The point
> > of a stamp is not to collect them; it is to send mail. Quality means
> > that we know how many articles have been written in one or more
> > editathons. It is in finding for a collection of items that it is better
> > known what award, schooling has been achieved by the people that was
> > written for. It is in using Wikidata to indicate what categories could
> > be in what Wikipedia article.
> >
> > Quality needs to be actionable. What is the use of static quality?
> > Thanks,
> > GerardM
> >
> > [1] https://tools.wmflabs.org/wikidata-todo/stats.php?reverse
> >
> > On 17 March 2017 at 02:19, Pine W <wiki.p...@gmail.com> wrote:
> >
> > > Sharing some good news, both about the progress of ORES and (my primary
> > > inspiration for sharing this email) significant improvements in article
> > > quality thanks to WikiProject Women scientists. The latter has been
> > > designated as the Keilana Effect.
> > >
> > > Pine
> > >
> > > ---------- Forwarded message ----------
> > > From: Aaron Halfaker <aaron.halfa...@gmail.com>
> > > Date: Thu, Mar 16, 2017 at 2:14 PM
> > > Subject: Re: [Wikitech-l] The Revision Scoring weekly update
> > > To: Application of Artificial Intelligence and other advanced computing
> > > strategies to Wikimedia Projects <a...@lists.wikimedia.org>
> > > Cc: wikitech-l <wikitec...@lists.wikimedia.org>
> > >
> > > Hey folks!
> > >
> > > I should really stop calling this a weekly update because it's getting
> > > a bit silly at this point. :) But if it were a weekly update, it would
> > > cover the weeks of 42 - 46.
> > >
> > > *Highlights:*
> > >
> > > - 3 new models: Finnish Wikipedia (reverted) and Estonian Wikipedia
> > >   (damaging & goodfaith)
> > > - We estimated and agreed on funding for ORES servers in the next year
> > >   with Operations
> > > - We published a paper about vandalism detection in Wikidata and a blog
> > >   post about the massive effect of some initiatives on coverage of
> > >   Women Scientists in Wikipedia.
> > >
> > > *New development:*
> > >
> > > - We added recall-based threshold metrics to the new draftquality
> > >   model, which should help tool devs know which new page creations to
> > >   highlight for review[1]
> > > - We added optional notices for ORES pages which will help us visually
> > >   distinguish our experimental install in WMFlabs from the Prod install
> > >   (ores.wikimedia.org)[2]
> > > - We added basic language support for Finnish (Thanks 4shadoww)[3] and
> > >   deployed a 'reverted' model[4]
> > > - We led a discussion in Wikidata about "item quality" that resulted in
> > >   a Wikipedia 1.0-like scale for Wikidata quality[5,6] and designed a
> > >   Wikilabels form to capture the gist of it[7]
> > > - We enabled the ORES Review Tool on Czech Wikipedia[8]
> > > - We configured ChangeProp to use our new minified JSON output to save
> > >   bandwidth[9]
> > > - We extended the Estonian language assets (Thanks Cumbril)[10] and
> > >   deployed the 'damaging' and 'goodfaith' models[11,12]
> > > - We enabled a testing model for 'goodfaith' on the Beta Cluster to
> > >   make it easier for the Collaboration team to run tests with their new
> > >   filter interface[13]
> > > - We created a new "precache" endpoint that will allow us to
> > >   de-duplicate configuration with ChangeProp and handle all routing in
> > >   ORES locally[14]
> > >
> > > *Resourcing:*
> > >
> > > - We completed a 2 year estimate of ORES resource needs and discussed
> > >   funding (capital expenditure) for ORES in the coming fiscal year[15].
> > >   This will allow us to continue to grow ORES both in number of models
> > >   and in scoring capacity.
> > >
> > > *Communications:*
> > >
> > > - Amir improved the KDD paper based on review feedback[16] and got it
> > >   published[17]
> > > - We published a blog post about our measurements of WikiProject Women
> > >   Scientists[18,19] -- "The Keilana Effect"
> > > - Thanks to Cumbril's work, the Estonian labeling campaign was
> > >   finished[20]
> > >
> > > *Deployments:*
> > >
> > > - In early February, we deployed a new set of translations to
> > >   Wikilabels (specifically targeting Romanian Wikipedia)[21]
> > > - In mid-February, we deployed some fixes to ORES documentation and
> > >   response formatting[22]
> > > - In mid-March, we deployed 3 new scoring models and ORES notices[23]
> > >
> > > *Maintenance and robustness:*
> > >
> > > - We fixed a serious issue in the "mwoauth" library that Wikilabels
> > >   depends on[24]
> > > - We reduced the number of revisions per request that we could receive
> > >   via api.php[25]
> > > - We investigated a scap issue that broke ORES deployment[26]
> > > - We fixed a minor issue with JSON minification behavior[27] and
> > >   hard-coding of the location of ORES in the documentation[28]
> > > - We improved performance of ORES filters on MediaWiki[29]
> > > - We improved the language describing ORES behavior on
> > >   Special:Contributions[30]
> > > - We added a notice to the Wikipages that Dexbot maintains about its
> > >   behavior[31]
> > > - We added notices to ores.wmflabs.org about its experimental
> > >   nature[32]
> > > - We fixed some issues with testing Finnish language assets[33]
> > > - We fixed some styling issues that resulted from an upgrade of OOJS
> > >   UI[34]
> > >
> > > 1. https://phabricator.wikimedia.org/T157454 -- Add recall based thresholds to draftquality model
> > > 2. https://phabricator.wikimedia.org/T150962 -- Add an optional notice to ORES main and ui pages
> > > 3. https://phabricator.wikimedia.org/T158587 -- Add language support for Finnish
> > > 4. https://phabricator.wikimedia.org/T160228 -- Train/test reverted model for fiwiki
> > > 5. https://phabricator.wikimedia.org/T157489 -- [Discuss] item quality in Wikidata
> > > 6. https://www.wikidata.org/wiki/Wikidata:Item_quality
> > > 7. https://phabricator.wikimedia.org/T155828 -- Design item_quality form for Wikidata
> > > 8. https://phabricator.wikimedia.org/T151611 -- Enable ORES Review Tool on Czech Wikipedia
> > > 9. https://phabricator.wikimedia.org/T157693 -- Use minified JSON format in ChangeProp
> > > 10. https://phabricator.wikimedia.org/T160193 -- Extend estonian language assets from Wiki page
> > > 11. https://phabricator.wikimedia.org/T159608 -- Train/test damaging/goodfaith models for etwiki
> > > 12. https://phabricator.wikimedia.org/T130280 -- Deploy edit quality models for etwiki
> > > 13. https://phabricator.wikimedia.org/T160467 -- Enable 'goodfaith' on testwiki on Beta Cluster
> > > 14. https://phabricator.wikimedia.org/T148714 -- Create generalized "precache" endpoint for ORES
> > > 15. https://phabricator.wikimedia.org/T157222 -- Estimate ORES capex for FY2017-18
> > > 16. https://phabricator.wikimedia.org/T148443 -- Improve the KDD paper based on the review
> > > 17. https://arxiv.org/abs/1703.03861
> > > 18. https://phabricator.wikimedia.org/T160078 -- Blog post about wp10 measurements of Women Scientists
> > > 19. https://blog.wikimedia.org/2017/03/07/the-keilana-effect/
> > > 20. https://phabricator.wikimedia.org/T129702 -- Complete etwiki edit quality campaign
> > > 21. https://phabricator.wikimedia.org/T157580 -- Deploy Romanian translations for Wiki labels
> > > 22. https://phabricator.wikimedia.org/T157842 -- Prod deployment of ORES
> > > 23. https://phabricator.wikimedia.org/T160279 -- Deploy ores in prod (Mid-March)
> > > 24. https://phabricator.wikimedia.org/T157858 -- mwoauth is broken
> > > 25. https://phabricator.wikimedia.org/T157983 -- Reduce the number of revisions that can be requested in one batch
> > > 26. https://phabricator.wikimedia.org/T157623 -- Investigate failed ORES deployment
> > > 27. https://phabricator.wikimedia.org/T157721 -- Investigate default JSON minification behavior in production
> > > 28. https://phabricator.wikimedia.org/T157723 -- ORES swagger is hard-coded for wmflabs
> > > 29. https://phabricator.wikimedia.org/T152585 -- rcshow=oresreview is slow
> > > 30. https://phabricator.wikimedia.org/T158862 -- Fix message in Special:Contributions
> > > 31. https://phabricator.wikimedia.org/T158899 -- Add notice about Dexbot overwriting manual changes to our tracking table.
> > > 32. https://phabricator.wikimedia.org/T159055 -- Add a notice to ores-wmflabs-deploy about "experimental" nature
> > > 33. https://phabricator.wikimedia.org/T160192 -- Fix testing issues in finnish language assets
> > > 34. https://phabricator.wikimedia.org/T160258 -- Fix minor styling issues with OOJS-UI in wikilabels
> > >
> > > Sincerely,
> > > Aaron from the Scoring Platform team
> > > _______________________________________________
> > > Wikitech-l mailing list
> > > wikitec...@lists.wikimedia.org
> > > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> > > _______________________________________________
> > > Wikimedia-l mailing list, guidelines at:
> > > https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and
> > > https://meta.wikimedia.org/wiki/Wikimedia-l
> > > New messages to: Wikimedia-l@lists.wikimedia.org
> > > Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
> > > <mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe>
>
> --
> Richard Nevell
> Project Coordinator
> Wikimedia UK - sign up to our newsletter <http://eepurl.com/cnYOw5>
> +44 (0) 20 7065 0921
>
> Wikimedia UK is a Company Limited by Guarantee registered in England and
> Wales, Registered No. 6741827. Registered Charity No.1144513. Registered
> Office 4th Floor, Development House, 56-64 Leonard Street, London EC2A 4LT.
> United Kingdom. Wikimedia UK is the UK chapter of a global Wikimedia
> movement. The Wikimedia projects are run by the Wikimedia Foundation (who
> operate Wikipedia, amongst other projects).
>
> *Wikimedia UK is an independent non-profit charity with no legal control
> over Wikipedia nor responsibility for its contents.*