I think we have this: https://www.wikidata.org/wiki/Wikidata:Database_reports/Popular_items_without_claims
Not sure how up to date this is.

Yaroslav

On Fri, Feb 28, 2025 at 8:28 PM Yaron Koren <[email protected]> wrote:

> Perhaps I count as a SPARQL expert now, but I do see one easy way to see
> all the Wikidata items with no statements (and that are not lexemes):
>
> https://query.wikidata.org/#SELECT%20%3Fitem%20%3Fwiki%0AWHERE%20%7B%0A%20%20%3Fitem%20wikibase%3Astatements%200%20.%0A%20%20MINUS%20%7B%20%3Fitem%20dct%3Alanguage%20%5B%5D%20%7D%20.%20%23%20exclude%20lexemes%0A%20%20%0A%7D
>
> Also, here is a query to get the true "duds" - no statements, no lexemes,
> and no Wikipedia/Wikimedia articles - it looks like there are about 8,000
> of these, so thankfully not really an "elephant":
>
> https://query.wikidata.org/#SELECT%20DISTINCT%20%3Fitem%20WHERE%20%7B%0A%20%20%3Fitem%20wikibase%3Astatements%200%20.%0A%20%20MINUS%20%7B%20%3Fitem%20dct%3Alanguage%20%5B%5D%20%7D%20.%20%23%20exclude%20lexemes%0A%20%20MINUS%20%7B%20%5B%5D%20schema%3Aabout%20%3Fitem%3B%20schema%3AisPartOf%20%5B%5D%20%7D%20%23%20exclude%20items%20with%20no%20Wikipedia%2C%20etc.%20article%0A%7D
>
> Finally - on a minor note, it looks like there are still about 2,000 human
> settlements in Wikidata without a country:
>
> https://wikidatawalkabout.org/?c=Q486972&lang=en&f.P17=novalue
>
> This is not meant to sound like a criticism - Romaine, you have obviously
> made an enormous improvement there! And perhaps the remaining ones are
> difficult to categorize.
>
> -Yaron
>
> On Fri, Feb 28, 2025 at 9:57 AM Nicolas VIGNERON <[email protected]> wrote:
>
>> Hi y'all,
>>
>> Good ideas.
>>
>> Queries over such a big number of items often time out.
>> Here is a working QLever query for items with more than 10 sitelinks:
>> https://qlever.cs.uni-freiburg.de/wikidata/VdiLsm. There is only one
>> result; you can decrease the value for more results.
>> Reminder: QLever results are not updated in real time. It is based on
>> dumps, which lag behind - right now, results are from 29.01.2025.
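Decoded from their URL-encoding, the two queries in Yaron's message above read (comments are the original author's):

```sparql
SELECT ?item ?wiki
WHERE {
  ?item wikibase:statements 0 .
  MINUS { ?item dct:language [] } . # exclude lexemes
}
```

```sparql
SELECT DISTINCT ?item WHERE {
  ?item wikibase:statements 0 .
  MINUS { ?item dct:language [] } . # exclude lexemes
  MINUS { [] schema:about ?item; schema:isPartOf [] } # exclude items with no Wikipedia, etc. article
}
```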
>>
>> Cheers,
>> Nicolas
>>
>> On Fri, Feb 28, 2025 at 18:05, Amir E. Aharoni <[email protected]> wrote:
>>
>>> I tried some queries, and they all timed out :(
>>>
>>> I'm not very good at SPARQL.
>>>
>>> But I agree with Andy: dividing hundreds of thousands of items into
>>> small groups that can be processed by people who are likely to know
>>> something relevant about those items is probably a better way to
>>> handle it than just looking at one huge pile of items.
>>>
>>> Some ways to divide them that I can think of immediately:
>>> 1. Having a sitelink to particular languages.
>>> 2. Having a label or a description in a particular language.
>>> 3. Having certain characteristics in the label, like length, or the
>>> presence of certain characters (even a mostly arbitrary characteristic,
>>> like "label starts with the letters 'Mi'" or "has digits in the label",
>>> is better than nothing).
>>>
>>> If someone can make a bunch of queries that do something like this and
>>> actually work (and don't time out), that would be a nice beginning.
>>>
>>> On Fri, Feb 28, 2025, 11:42, Andy Mabbett <[email protected]> wrote:
>>>
>>>> On Fri, 28 Feb 2025 at 16:06, Romaine Wiki <[email protected]> wrote:
>>>>
>>>> > There are another 493k items with only one identifier and no other
>>>> > statement.
>>>> > https://qlever.cs.uni-freiburg.de/wikidata/Z8OkZi?exec=true
>>>> > Often that single identifier is just the Google Knowledge Graph ID
>>>> > (P2671).
>>>>
>>>> The first half-dozen or so I checked all also have a Wikipedia link in
>>>> one or more languages.
>>>>
>>>> Maybe it would be worth making a query for each of the top, say, twenty
>>>> languages and posting it on the relevant Village Pump?
>>>>
>>>> Or having an article or talk page template added by a bot, to each
>>>> affected article?
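As an untested sketch of Amir's idea 3 above, combining Yaron's no-statements pattern with a label filter might look like the following; the "Mi" prefix is just the arbitrary example from the message, and whether this actually avoids the timeouts (on WDQS or QLever) has not been verified:

```sparql
SELECT ?item ?label WHERE {
  ?item wikibase:statements 0 .
  MINUS { ?item dct:language [] } .       # exclude lexemes
  ?item rdfs:label ?label .
  FILTER(LANG(?label) = "en")             # one language slice per query
  FILTER(STRSTARTS(STR(?label), "Mi"))    # arbitrary slice to keep the result set small
}
LIMIT 1000
```

Varying the language tag and the prefix would split the pile into many small, human-reviewable lists.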
>>>>
>>>> --
>>>> Andy Mabbett
>>>> https://pigsonthewing.org.uk
>>>> _______________________________________________
>>>> Wikidata mailing list -- [email protected]
>>>> Public archives at
>>>> https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/4QUE62RXUBELM5HGQRNSZDKWR2QKSH4D/
>>>> To unsubscribe send an email to [email protected]
>
> --
> WikiWorks · MediaWiki Consulting · http://wikiworks.com
