hoo added a comment.
In https://phabricator.wikimedia.org/T132839#2280617, @thiemowmde wrote:
> We also came up with a possible improvement: Some properties like "instance of" and "Commons category" are not selective. The fact that this property exists on an item does not say anything. We think it's a good idea to add such properties to a "non-selective" blacklist (or to the existing blacklist). This should reduce noise.

We have special handling for "instance of" and "subclass of" that avoids this behaviour (these are "classifying properties"). Excluding very generic properties, such as identifiers and certain string properties, is probably also a good idea (or, in the long run, weighting them lower?).

In https://phabricator.wikimedia.org/T132839#2281371, @thiemowmde wrote:
> FYI, I did another run of code review on https://github.com/Wikidata-lib/PropertySuggester-Python and https://github.com/Wikidata-lib/PropertySuggester and could not find more suspicious code. The Python script should produce massive amounts of warnings when a datatype is missing. Does this happen? Are these logs reviewed after the script is run?

I have seen that happen before, but very rarely (like once or twice for a dump run at some point). I'll probably do a new dump run tomorrow and examine the logs afterwards, but I don't think that's going to give us any new insights.
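
To make the "exclude or weight lower" idea for non-selective properties a bit more concrete, here is a minimal sketch. It is not taken from PropertySuggester or PropertySuggester-Python; the function name `weighted_pair_counts`, the `CONTEXT_WEIGHTS` table and the sample items are assumptions made up for illustration. It counts how often a property co-occurs with another one, and lets a per-property weight mute (0.0) or soften (e.g. 0.5) the properties that say little about an item.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical weights, for illustration only. P373 is "Commons category";
# a weight of 0.0 acts like a blacklist entry, values between 0 and 1
# merely reduce the property's influence. Unlisted properties weigh 1.0.
CONTEXT_WEIGHTS = {"P373": 0.0}


def weighted_pair_counts(items, context_weights=CONTEXT_WEIGHTS):
    """Count (context property, suggested property) pairs across items.

    `items` is an iterable of collections of property IDs, one per item.
    Each pair contributes the weight of its context property, so a
    non-selective context property adds little or nothing.
    """
    counts = defaultdict(float)
    for properties in items:
        for context, suggested in permutations(sorted(set(properties)), 2):
            weight = context_weights.get(context, 1.0)
            if weight > 0.0:
                counts[(context, suggested)] += weight
    return dict(counts)
```

Note that in this sketch a muted property can still be suggested; it just no longer serves as evidence for suggesting others. Whether that matches what we want for the blacklist is a separate question.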
