Addshore added a subscriber: JAllemandou.
Addshore added a comment.

@JAllemandou already went ahead and worked some magic on the data that is already in Hadoop.
So in January 2018 there were:

  • Entities: 42336942
  • Entities with links inward (to the entity): 10312826
  • Entities with links outward (from the entity): 40096647
  • Entities with some kind of link (in or out): 40115806
  • Therefore, totally orphaned entities (no links in or out) come to roughly 2,221,136 (42336942 minus 40115806)
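
As a quick sanity check on the list above, the orphan figure is just the entity total minus the entities with any kind of link. A minimal Python sketch using only the numbers quoted here (the variable names are mine and purely illustrative, they are not taken from the analysis itself):

```python
# Figures from the January 2018 snapshot quoted above.
entities_total = 42_336_942
entities_with_inward = 10_312_826
entities_with_outward = 40_096_647
entities_with_any_link = 40_115_806

# Orphans are entities with no link in either direction.
orphans = entities_total - entities_with_any_link

# Inclusion-exclusion: entities that have both inward and outward links.
with_both = entities_with_inward + entities_with_outward - entities_with_any_link

print(orphans)
print(with_both)
```

The second value is only a derived consistency check (inward plus outward minus "any"), not a number reported in the analysis.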

It would probably also be worth counting sitelinks as links when tracking orphan entities (at least in my opinion).
(This requires changing the graph construction that we did for the first numbers, so I won't bother posting the number here for 2018-01.)
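
To illustrate why counting sitelinks changes the graph construction, here is a rough Python sketch on toy data. It is not the code from the analysis: the function name `link_stats`, the toy entity IDs, and the choice to treat a sitelink as an outward link from the entity are all assumptions of mine.

```python
from typing import Dict, Iterable, List, Set, Tuple


def link_stats(
    entities: Set[str],
    links: Iterable[Tuple[str, str]],
    sitelinks: Dict[str, List[str]],
    count_sitelinks: bool = False,
) -> Dict[str, object]:
    """Count entities with inward/outward links and collect the orphans.

    `links` holds (source, target) entity pairs. `sitelinks` maps an entity
    to its wiki pages; treating those as outward links is an assumption.
    """
    links = list(links)
    has_out = {src for src, _ in links} & entities
    has_in = {dst for _, dst in links} & entities
    if count_sitelinks:
        # Assumption: any entity with at least one sitelink counts as linked.
        has_out |= {e for e, pages in sitelinks.items() if pages} & entities
    linked = has_in | has_out
    return {
        "entities": len(entities),
        "inward": len(has_in),
        "outward": len(has_out),
        "any": len(linked),
        "orphans": entities - linked,
    }


# Hypothetical toy data: Q1 links to Q2, Q3 only has a sitelink, Q4 has nothing.
toy_entities = {"Q1", "Q2", "Q3", "Q4"}
toy_links = [("Q1", "Q2")]
toy_sitelinks = {"Q3": ["enwiki:Example"]}

without_sitelinks = link_stats(toy_entities, toy_links, toy_sitelinks)
with_sitelinks = link_stats(toy_entities, toy_links, toy_sitelinks, count_sitelinks=True)

print(sorted(without_sitelinks["orphans"]))
print(sorted(with_sitelinks["orphans"]))
```

In this toy case Q3 stops being an orphan once sitelinks count, which is the kind of shift the real numbers would show.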

This analysis was done with the following gist: https://gist.githubusercontent.com/jobar/ec44542614c0fe261a23cc3b4acf8e00/raw/6018e5d62401a2ca86f46580a547cb025932b8ca/degrees-analysis

This is the sort of thing that we will want to work towards running on a regular basis once Wikidata is regularly loaded into Hadoop.


TASK DETAIL
https://phabricator.wikimedia.org/T202894

To: Addshore
Cc: JAllemandou, Addshore, Aklapper, abian, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, Wikidata-bugs, aude, Mbch331