Bugreporter added a comment.
In T303677#9035100 <https://phabricator.wikimedia.org/T303677#9035100>, @tfmorris wrote:

> I'm surprised that this hasn't received any attention in 15 months. As an update to @Nikki's numbers <https://phabricator.wikimedia.org/T303677#7789434>, there are now on the order of 2.5 **BILLION** of these bot-generated descriptions. The top 5 alone represent over 2 billion triples. That's a huge waste of resources!
>
> | Q# | Entity Type | Descriptions (Billions) |
> | Q13442814 | scholarly article | 1.32 |
> | Q4167836 | Wikimedia category | 0.60 |
> | Q4167410 | Wikimedia disambiguation page | 0.11 |
> | Q11266439 | Wikimedia template | 0.09 |
> | Q101352 | family name | 0.06 |
>
> In addition to the usability and resource issues, there's also a substantial language equity issue associated with the lack of this functionality. The language with the largest number of descriptions is Dutch, simply because there's a Dutch-speaking bot operator who has vigorously added many, many machine-generated descriptions <https://www.wikidata.org/wiki/User:Edoderoobot/Set-nl-description>. On the flip side, languages without the privilege of bot operators supporting them go wanting and have no way to disambiguate the terms that autocomplete / search offers them. Of course, if someone were to start adding machine-generated descriptions for all those hundreds of languages, the situation would be completely untenable from a Blazegraph point of view.
>
> As an alternative to a textual description, I'll offer the suggestion to consider building an autocomplete widget <https://developers.google.com/freebase/v1/search-widget> which looks more like this:
>
> F37145761: Screen Shot 2023-07-21 at 2.22.16 PM.png <https://phabricator.wikimedia.org/F37145761>
>
> That's how Freebase Suggest <https://developers.google.com/freebase/v1/search-widget> did it back in 2008. Heck, you could even steal the code <https://github.com/googlearchive/freebase-suggest>.
>
> One non-obvious aspect of their implementation was that they used metaschema annotations of types as being "Notable" or interesting enough to show the user. Similarly, the properties which were displayed varied by entity type and were controlled by metaschema notations, so you might have birth date and place for a person, but containing/parent entity for something like a town or species. Of course, even just a simple list of the P31 <https://phabricator.wikimedia.org/P31> values would be better than the current situation.

See also https://autodesc.toolforge.org/, which is already used in various tools (e.g. Mix'n'Match).

Previous discussion (dating back to 2012): https://www.wikidata.org/wiki/Wikidata:Automating_descriptions

TASK DETAIL
  https://phabricator.wikimedia.org/T303677

To: Bugreporter
Cc: tfmorris, AndrewTavis_WMDE, Fuzheado, valerio.bozzolan, Lectrician1, waldyrious, Michael, DVrandecic, Bugreporter, Manuel, Nikki, Epidosis, Mahir256, Aklapper, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331