AndrewTavis_WMDE added a comment.
@Manuel, based on the query provided in https://w.wiki/77FU (I took out the French comment at the end and regenerated the short link), it looks like the ontology is relatively clean if we keep it to the base subclasses with `wdt:P279`, but not if we go beyond that to the full graph with `wdt:P279*`. A summary:

- In the case of `wdt:P279` the only outliers that we're getting are QIDs that are actually scholarly articles in themselves as they've had `P279` applied to them rather than `P31`.
  - Including these should be fine in that they would have been included anyway?
- In the case of `wdt:P279*` we're getting all kinds of QIDs that do not fit the subgraph we're trying to describe. Examples include:
  - Q1519850 - Health Certificate <https://www.wikidata.org/wiki/Q1519850>
  - Q2992277 - women in the Victorian era <https://www.wikidata.org/wiki/Q2992277>
  - All kinds of QIDs referring to types of Covid passports, ids, etc
  - etc

I'll report back in this issue on how the subgraph was defined in the original analysis :)

TASK DETAIL
  https://phabricator.wikimedia.org/T342123

To: AndrewTavis_WMDE
Cc: Aklapper, Manuel, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
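
For readers less familiar with the distinction in the comment above, here is a minimal sketch of why `wdt:P279` (one "subclass of" hop) and `wdt:P279*` (zero or more hops) can return such different sets. It does not query Wikidata: the graph, the item names, and both helper functions are hypothetical placeholders made up for illustration, and the real query at the short link may be structured differently.

```python
from collections import deque

# Toy "subclass of" (P279) edges as child -> list of parents.
# Every identifier here is a made-up placeholder, not a real Wikidata item.
SUBCLASS_OF = {
    "review_article": ["article"],
    "preprint": ["article"],
    "systematic_review": ["review_article"],
    "unrelated_leaf": ["systematic_review"],  # stands in for an off-topic item
}


def direct_subclasses(root, edges):
    """Items one P279 hop below root (analogous to `?x wdt:P279 root`)."""
    return {child for child, parents in edges.items() if root in parents}


def transitive_subclasses(root, edges):
    """Items zero or more P279 hops below root (analogous to `?x wdt:P279* root`).

    Zero hops means the root itself is part of the result.
    """
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in direct_subclasses(current, edges):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


direct = direct_subclasses("article", SUBCLASS_OF)
full = transitive_subclasses("article", SUBCLASS_OF)
print(sorted(direct))
print(sorted(full - direct))  # what only the transitive path adds
```

In this toy graph a single mis-modelled edge deep in the hierarchy is enough to pull an off-topic item into the transitive result, which is the kind of effect the `wdt:P279*` examples listed above suggest.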
