On 01.07.2014 22:23, Markus Krötzsch wrote:
P.S. One weakness of my algorithm you can already see: it has trouble estimating the relevance of very rare properties, such as "Minor Planet Center observatory code" above. A single wrong annotation may then lead to wrong suggestions. Also, it seems from my list under (2) that some Grade I listed buildings are ships. This seems to be an error that is amplified by the fact that the property "masts" is used only 11 times in the dataset I evaluated (last week's data). I guess the new property suggester rather errs on the other side, being tricked into suggesting very frequent properties even in places that don't need them.
However, it is obviously better if the algorithm performs well for frequently used properties. Wouldn't it be possible to combine the two systems so that they compensate for each other's weaknesses? One could check how often a property is used and, depending on that, rely on either Markus' or the students' algorithm.
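
To make the idea a bit more concrete, here is a rough sketch of one way such a combination could look. It is only my reading of the idea, not how either system actually works: both suggesters are treated as black-box functions, and the names `make_hybrid_suggester`, `relatedness_based`, `frequency_biased`, `usage_counts` and the threshold `min_usage` are all hypothetical. The assumption is that the first suggester is unreliable for rare properties and the second one over-suggests very frequent ones, so each is trusted only on the side where it is expected to do well.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical interface: a suggester takes the properties an item already
# has and returns a ranked list of suggested properties.
Suggester = Callable[[Iterable[str]], List[str]]


def make_hybrid_suggester(
    relatedness_based: Suggester,   # assumed weak on rarely used properties
    frequency_biased: Suggester,    # assumed to over-suggest frequent properties
    usage_counts: Dict[str, int],   # how often each property is used in the data
    min_usage: int = 100,           # arbitrary cut-off, would need tuning
) -> Suggester:
    """Combine two suggesters by switching on property usage frequency."""

    def suggest(existing: Iterable[str]) -> List[str]:
        existing = list(existing)
        result: List[str] = []

        # Frequently used properties: trust the relatedness-based suggester.
        for prop in relatedness_based(existing):
            if usage_counts.get(prop, 0) >= min_usage and prop not in result:
                result.append(prop)

        # Rarely used properties: trust the frequency-biased suggester.
        for prop in frequency_biased(existing):
            if usage_counts.get(prop, 0) < min_usage and prop not in result:
                result.append(prop)

        return result

    return suggest
```

With a cut-off like this, a suggestion such as "masts" (used only 11 times) coming from the first suggester would be dropped, while very frequent properties coming from the second one would be filtered out as well. Where exactly to put the threshold is an open question.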

Best regards,
Bene

_______________________________________________
Wikidata-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
