Hi Everyone,

The next Research Showcase will be live-streamed this Wednesday, February
21, 2018, at 11:30 AM PST (19:30 UTC).

YouTube stream: https://www.youtube.com/watch?v=fpmRWCE7F_I

As usual, you can join the conversation on IRC at #wikimedia-research. You
can also watch our past research showcases online.

This month's presentation:

*Visual enrichment of collaborative knowledge bases*

By Miriam Redi, Wikimedia Foundation

Images allow us to explain, enrich and complement knowledge without
language barriers [1]. They can help illustrate the content of an item in a
language-agnostic way to external data consumers. Images can be extremely
helpful in multilingual collaborative knowledge bases such as Wikidata.

However, a large proportion of Wikidata items lack images. More than 3.6M
Wikidata items are about humans (Q5), but only 17% of them have an
associated image. Overall, only 2.2M of the 40M Wikidata items have an image.
A wider presence of images in such a rich, cross-lingual repository could
enable a more complete representation of human knowledge.

In this talk, we will discuss challenges and opportunities faced when using
machine learning and computer vision tools for the visual enrichment of
collaborative knowledge bases. We will share research to help Wikidata
contributors make Wikidata more “visual” by recommending high-quality
Commons images to Wikidata items. We will show the first results on
free-licence image quality scoring and recommendation and discuss future
work in this direction.

[1] Van Hook, Steven R. "Modes and models for transcending cultural
differences in international classrooms." Journal of Research in
International Education 10.1 (2011): 5-27.

*Backlogs—backlogs everywhere: Using machine classification to clean up the
new page backlog*

By Aaron Halfaker, Wikimedia Foundation

If there's one insight that I've had about the functioning of Wikipedia and
other wiki-based online communities, it's that eventually self-directed
work breaks down and some form of organization becomes important for task
routing.  In Wikipedia specifically, the notion of "backlogs" has become
dominant.  There are backlogs of articles to create, articles to clean up,
articles to assess, new editor contributions to review, manual of style
rules to apply, etc.  To a community of people working on a backlog, the
state of that backlog has deep effects on their emotional well-being.  A
backlog that only grows is frustrating and exhausting.

Backlogs aren't inevitable, though, and there are many shapes that backlogs
can take.  In my presentation, I'll tell a story about how English
Wikipedia editors defined a process and set of roles that formed a backlog
around new page creations.  I'll make the argument that this formalization
of quality control practices has created a choke point and that
alternatives exist. Finally, I'll present a vision for such an alternative
using models that we have developed for ORES, the open machine prediction
service my team maintains.

