Halfak added a comment.
@Glorian_WD and I have been discussing how to get features that give us some signal about which properties are expected for specific types of items. Here's my skeleton proposal:
- query for most used statements (e.g. instance-of:human)
- for the top N of those, query for the most common secondary properties (e.g. instance-of:human, occupation:author)
- for all items that pass some basic threshold of quality (e.g. have an external reference and >= N site-links), find the frequency of all other properties.
- build an index on this so it can be quickly looked up during scoring.
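
The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the planned implementation: the in-memory item layout (a list of dicts with "statements", "has_external_reference" and "sitelinks" keys), the helper names, and the single top_n value shared by both query steps are all hypothetical, and a real version would presumably run against a dump or a query service rather than a Python list.

```python
from collections import Counter


def passes_quality(item, min_sitelinks):
    # Assumed quality threshold: an external reference and >= N site-links.
    return item["has_external_reference"] and item["sitelinks"] >= min_sitelinks


def top_statements(items, n):
    # Step 1: the n most used (property, value) statements across all items.
    counts = Counter()
    for item in items:
        counts.update(set(item["statements"]))
    return [statement for statement, _ in counts.most_common(n)]


def build_index(items, top_n=1, min_sitelinks=1):
    """Map a tuple of statements to {other property: share of items having it}."""
    index = {}
    for primary in top_statements(items, top_n):
        with_primary = [it for it in items if primary in it["statements"]]

        # Step 2: most common secondary statements among items with the primary.
        secondary = Counter()
        for it in with_primary:
            secondary.update(set(it["statements"]) - {primary})
        keys = [(primary,)]
        keys += [(primary, s) for s, _ in secondary.most_common(top_n)]

        # Step 3: frequency of all other properties among good-enough items.
        for key in keys:
            good = [
                it for it in with_primary
                if set(key) <= set(it["statements"])
                and passes_quality(it, min_sitelinks)
            ]
            if not good:
                continue
            key_props = {prop for prop, _ in key}
            prop_counts = Counter()
            for it in good:
                prop_counts.update({p for p, _ in it["statements"]} - key_props)
            # Step 4: a plain dict serves as the lookup index in this sketch.
            index[key] = {p: c / len(good) for p, c in prop_counts.items()}
    return index
```

During scoring, looking up index[(("instance-of", "human"),)] would then return the share of good-enough human items that carry each other property, which could be compared against the properties present on the item being scored.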
To: Glorian_WD, Halfak
Cc: samuwmde, Lydia_Pintscher, StudiesWorld, Aklapper, Ricordisamoa, Sumit, Glorian_Yapinus, Halfak, Ladsgroup, Glorian_WD, D3r1ck01, Izno, Wikidata-bugs, aude, Alchimista, Mbch331
_______________________________________________ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs