Addshore added a subscriber: JAllemandou. Addshore added a comment.
@JAllemandou already went ahead and worked some magic on the data that is already in Hadoop.
So in January 2018 there were:
- Entities: 42336942
- Entities with links inward (to the entity): 10312826
- Entities with links outward (from the entity): 40096647
- Entities with some kind of link (in or out): 40115806
- Therefore totally orphaned entities (no links in or out) come to roughly: 2,221,136 (42336942 - 40115806)
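
For anyone wanting to see how these figures relate to each other, here is a minimal Python sketch of the same bookkeeping on a toy edge list. It is not the code from the gist and does not touch the Hadoop data; the function name `degree_summary` and the sample entity IDs are made up for illustration, and it assumes each link is available as a (source, target) pair.

```python
def degree_summary(entities, edges):
    """Count entities by link direction.

    entities: iterable of entity IDs (hypothetical sample IDs below)
    edges: iterable of (source, target) pairs, one per link
    """
    entities = set(entities)
    edges = list(edges)  # materialise so we can iterate twice
    # Only count endpoints that are known entities.
    has_out = {src for src, _ in edges if src in entities}
    has_in = {dst for _, dst in edges if dst in entities}
    linked = has_in | has_out
    return {
        "entities": len(entities),
        "inward": len(has_in),
        "outward": len(has_out),
        "any": len(linked),
        # Orphans are whatever is left once every linked entity is removed.
        "orphans": len(entities - linked),
    }


# Toy data: Q4 has no links in either direction.
toy = degree_summary(
    ["Q1", "Q2", "Q3", "Q4"],
    [("Q1", "Q2"), ("Q3", "Q2")],
)
print(toy)
```

The orphan count is always the total minus the "any" count, which is how the January 2018 figure is derived from the two numbers above it.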
It would probably also be worth counting sitelinks as links for the tracking of orphan entities (at least in my opinion).
(This requires the graph construction that we did for the first numbers to change, so I won't bother posting the number here for 2018-01.)
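
As a rough idea of what that change could look like, here is a small Python sketch in which an entity with at least one sitelink no longer counts as an orphan. The record shape (`id`, `targets`, `sitelinks`) is a hypothetical simplification invented for this example, not the real dump format or the structure used in the gist.

```python
def find_orphans(records, count_sitelinks=False):
    """Return the IDs of entities with no links, optionally counting sitelinks.

    records: iterable of dicts with hypothetical keys
        "id"        - entity ID
        "targets"   - IDs of entities this entity links to
        "sitelinks" - site keys this entity is linked from
    """
    records = list(records)
    linked = set()
    for rec in records:
        targets = rec.get("targets", [])
        if targets:
            # An outgoing link marks both ends as linked.
            linked.add(rec["id"])
            linked.update(targets)
        if count_sitelinks and rec.get("sitelinks"):
            # Assumption: any sitelink is enough to stop being an orphan.
            linked.add(rec["id"])
    return sorted(rec["id"] for rec in records if rec["id"] not in linked)


sample = [
    {"id": "Q1", "targets": ["Q2"], "sitelinks": []},
    {"id": "Q2", "targets": [], "sitelinks": ["enwiki"]},
    {"id": "Q3", "targets": [], "sitelinks": ["dewiki"]},
    {"id": "Q4", "targets": [], "sitelinks": []},
]
without_sitelinks = find_orphans(sample)
with_sitelinks = find_orphans(sample, count_sitelinks=True)
```

In this toy sample Q3 is an orphan under the current definition but not once sitelinks are counted, so the orphan figure can only stay the same or shrink with the change.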
This analysis was done with the following gist: https://gist.githubusercontent.com/jobar/ec44542614c0fe261a23cc3b4acf8e00/raw/6018e5d62401a2ca86f46580a547cb025932b8ca/degrees-analysis
This is the sort of thing that we will want to work towards running on a regular basis once Wikidata is regularly in Hadoop.
To: Addshore
Cc: JAllemandou, Addshore, Aklapper, abian, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, Wikidata-bugs, aude, Mbch331
_______________________________________________ Wikidata-bugs mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
