Hi everyone,

The next Wikimedia Research Showcase will be live-streamed Wednesday, August 13, 2018, at 11:30 AM PDT (18:30 UTC).

YouTube stream: https://www.youtube.com/watch?v=OGPMS4YGDMk

As usual, you can join the conversation on IRC at #wikimedia-research. You can also watch our past research showcases here: <https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#Upcoming_Showcase>

Hope to see you there!

This month's presentation is:

*Quicksilver: Training an ML system to generate draft Wikipedia articles and Wikidata entries simultaneously*

John Bohannon and Vedant Dharnidharka, Primer

The automatic generation and updating of Wikipedia articles is usually approached as a multi-document summarization task: given a set of source documents containing information about an entity, summarize the entity. Purely sequence-to-sequence neural models can pull that off, but getting enough data to train them is a challenge. Wikipedia articles and their reference documents can be used for training, as was recently done <https://arxiv.org/abs/1801.10198> by a team at Google AI. But how do you find new source documents for new entities? And, short of having humans read all of the source documents, how do you fact-check the output? What is needed is a self-updating knowledge base that learns jointly with a summarization model and keeps track of data provenance. Lucky for us, the world’s most comprehensive public encyclopedia is tightly coupled with Wikidata, the world’s most comprehensive public knowledge base. We have built a system called Quicksilver that uses them both.

_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: Wikimediaemail@example.com
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, <mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe>