Hi!

> In Freebase, we had bot scripts that went through and removed "Lists of
> Things" topic entities since they are lists of entities and not useful
> clumped together and normalized in a graph database.

Why delete them? Wikidata has a number of things which are not your
standard "entity": lists, sources, news, quotes, service entries,
narrative articles (e.g.
https://en.wikipedia.org/wiki/Control_of_fire_by_early_humans - it's not
exactly an "entity" like "human" or "fire"), etc. So I don't think an
approach that singles out and excludes lists would help much: an
application that needs "individual entities" like "Douglas Adams" or
"London" and excludes other types will have to exclude much more than
just lists. I think the approach of asking for exactly what you need and
ignoring the rest may prove more efficient. I'm not sure there are
really well-defined criteria to specify what an "individual entity"
actually is. I'm sure you have one that matches your application, but
some other application may have a completely different one. Generally,
this can be solved by better classification I think, but so far I'm not
sure what to base this classification on.
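
To make "asking for exactly what you need" a bit more concrete, here is
a rough sketch of an allow-list filter over entities in the Wikidata
JSON format. It keeps only items whose "instance of" (P31) statements
name a wanted class, rather than trying to enumerate everything to
exclude. The names WANTED_CLASSES, instance_of, keep_wanted and
make_entity are hypothetical, the sample data is made up for
illustration, and the choice of classes is just one application's
assumption.

```python
# Hypothetical allow-list: the P31 ("instance of") classes this
# particular application cares about. Q5 is "human".
WANTED_CLASSES = {"Q5"}


def instance_of(entity):
    """Collect the item IDs named by an entity's P31 statements."""
    ids = set()
    for statement in entity.get("claims", {}).get("P31", []):
        snak = statement.get("mainsnak", {})
        # Skip "novalue"/"somevalue" snaks, which carry no datavalue.
        if snak.get("snaktype") != "value":
            continue
        ids.add(snak["datavalue"]["value"]["id"])
    return ids


def keep_wanted(entities, wanted=WANTED_CLASSES):
    """Keep entities with at least one wanted class; ignore the rest."""
    return [e for e in entities if instance_of(e) & set(wanted)]


def make_entity(qid, classes):
    """Build a stripped-down entity dict, for illustration only."""
    return {
        "id": qid,
        "claims": {
            "P31": [
                {"mainsnak": {"snaktype": "value", "property": "P31",
                              "datavalue": {"value": {"id": c}}}}
                for c in classes
            ]
        },
    }


# Made-up sample: Q42 (Douglas Adams) as a human, plus a placeholder
# item typed as a Wikimedia list article (Q13406463).
sample = [
    make_entity("Q42", ["Q5"]),
    make_entity("Q_LIST_EXAMPLE", ["Q13406463"]),
]
selected = keep_wanted(sample)
```

Note that this only looks at direct P31 values. It does not follow
"subclass of" (P279), so an application with broader needs would have to
expand its allow-list or walk the class hierarchy itself.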
-- 
Stas Malyshev
smalys...@wikimedia.org

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
