Hey all,

Thanks for sharing the paper, this is an interesting topic. I just wanted to
point to some of my own prior work on entity summarization, which is related to
what you have done:

All the best

> On 08.03.2018 at 19:00, Aidan Hogan <aid...@gmail.com> wrote:
> Hey Raphaël,
> Thanks for the comments and the reference! And sorry we missed discussion of 
> your paper (which indeed looks at largely the same problem in a slightly 
> different context). If there's a next time, we will be sure to include it in 
> the related work.
> I am impressed btw to see a third-party evaluation of a Google tool. Also it 
> seems Google has room for improvement. :)
> Cheers,
> Aidan
> On 07-03-2018 13:43, Raphaël Troncy wrote:
>> Hey Aidan,
>> Great work, I loved it! You may want to (cite and) look at what we did 4 
>> years ago where we tried to reverse engineer a bit what Google is doing when 
>> choosing properties (and values) to show in its rich panels alongside 
>> popular entities.
>> The paper is entitled "What Are the Important Properties
>> of an Entity? Comparing Users and Knowledge Graph Point of View", 
>> https://www.eurecom.fr/~troncy/Publications/Assaf_Troncy-eswc14.pdf
>> ... and the code is on github to replicate: https://github.com/ahmadassaf/KBE
>>   Raphaël
>> On 07/03/2018 at 05:53, Aidan Hogan wrote:
>>> Hi all,
>>> Tomás and I would like to share a paper that might be of interest to the 
>>> community. It presents some preliminary results of work on fully
>>> automated methods to generate Wikipedia info-boxes from Wikidata. The main
>>> focus is on deciding what information from Wikidata to include, and in what 
>>> order. The results are based on asking users (students) to rate some 
>>> prototypes of generated info-boxes.
>>> Tomás Sáez, Aidan Hogan "Automatically Generating Wikipedia Infoboxes from 
>>> Wikidata". In the Proceedings of the Wiki Workshop at WWW 2018, Lyon, 
>>> France, April 24, 2018.
>>> - Link: http://aidanhogan.com/docs/infobox-wikidata.pdf
>>> We understand that populating info-boxes is an important goal of Wikidata 
>>> and hence we thought we'd share some lessons learned.
>>> Obviously a lot of work is being put into populating info-boxes from 
>>> Wikidata, but the main methods at the moment seem to be template-based and 
>>> require a lot of manual labour; plus the definition of these templates 
>>> seems to be a difficult problem for classes such as person (where different 
>>> information will have different priorities for people of different 
>>> professions, notoriety, etc.).
>>> We were just interested to see how far we could get with a fully automated 
>>> approach using some generic ranking methods. Also we thought that something 
>>> like this could perhaps be used to generate a "default" info-box for 
>>> articles with no info-box and no associated template mapping. The paper 
>>> presents preliminary results along those lines.
>>> One interesting result is that a major factor in the evaluation of the 
>>> generated info-boxes was the importance of the value. For example, Barack 
>>> Obama has lots of awards, but perhaps only something like the Nobel Peace 
>>> Prize might be of relevance to show in the info-box (<- being intended as 
>>> an illustrative example rather than a concrete assertion of course!). 
>>> Another example is that sibling might not be an important attribute in a 
>>> lot of cases, but when that sibling is Barack Obama, then that deserves to 
>>> be in the info-box (<- how such cases could be expressed in a purely 
>>> template-based approach, we are not sure, but it would seem difficult).
>>> We assess the importance of values with PageRank. Assessing the importance 
>>> not only of attributes, but of values, turned out to be a major influence 
>>> on how highly our evaluators assessed the quality of the generated 
>>> info-boxes.
>>> This initial/isolated observation might be interesting since, to the best 
>>> of our understanding, the current wisdom on populating info-boxes from 
>>> Wikidata focuses on what attributes to present and in which order, but does 
>>> not consider the importance of values (aside from the Wikidata rank 
>>> feature, which we believe is more intended to assess relevance/timeliness, 
>>> than importance).
>>> Hence one of the most interesting (and surprising, for us at least) results
>>> of the work is that it appears to be important to rank *values*
>>> by importance (not just attributes) when considering what information the
>>> user might be interested in.
>>> (There are limitations to PageRank measures, however, in that they cannot 
>>> assess, for example, the importance of a particular date, or, more 
>>> generally, datatype values.)
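
To make the value-ranking idea above concrete, here is a minimal sketch of how values of one attribute could be ordered by PageRank. It is not the implementation from the paper: the link graph is toy data invented for illustration, and the names `pagerank`, `top_values`, "Person A", "Person B" and "Some Minor Award" are hypothetical. A real setup would presumably compute scores over the full Wikidata link graph.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {node: [outgoing neighbours]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def top_values(values, rank, k=1):
    """Keep the k highest-ranked values; unknown values score 0.0."""
    return sorted(values, key=lambda v: rank.get(v, 0.0), reverse=True)[:k]


# Toy link graph (assumption, not real Wikidata statistics).
links = {
    "Barack Obama": ["Nobel Peace Prize", "Some Minor Award"],
    "Person A": ["Nobel Peace Prize"],
    "Person B": ["Nobel Peace Prize"],
    "Nobel Peace Prize": [],
    "Some Minor Award": [],
}
rank = pagerank(links)
best = top_values(["Some Minor Award", "Nobel Peace Prize"], rank, k=1)
```

In this toy graph the award with more incoming links ends up first. Values that are not nodes in the graph, such as dates or other datatype values, fall back to a score of 0.0, which mirrors the limitation mentioned in the quoted paragraph.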
>>> In any case, we are looking forward to presenting these results at the Wiki 
>>> Workshop at WWW 2018, and any feedback or thoughts are welcome!
>>> Cheers,
>>> Aidan
>>> _______________________________________________
>>> Wikidata mailing list
>>> Wikidata@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/wikidata

Magnus Knuth

Hasso-Plattner-Institut für Digital Engineering gGmbH
Prof.-Dr.-Helmert-Str. 2-3
14482 Potsdam

Amtsgericht Potsdam, HRB 12184
Geschäftsführung: Prof. Dr. Christoph Meinel

tel:     +49 331 5509 547
email:   magnus.kn...@hpi.de
web:     http://www.hpi.de/
webID:   http://magnus.13mm.de/
