Zbyszko added a comment.
The first M-entity mentioned in the ticket was missing because of a bug in
the weekly reloads. That bug has now been fixed, so entries added before the
reload should be available.
As for the second part of the ticket: SDC stores contentUrl URI-encoded
(although the MD5 is apparently calculated from the decoded filename). I
modified the query to account for that:
SELECT (COUNT(DISTINCT ?image) AS ?images) (COUNT(DISTINCT ?file) AS ?files)
WITH
{
  SELECT ?image ?contentUrl
  WHERE
  {
    SERVICE <https://query.wikidata.org/sparql>
    {
      ?item wdt:P31 wd:Q5153359 .
      ?item wdt:P18 ?image .
    }
    BIND (REPLACE(wikibase:decodeUri(SUBSTR(STR(?image), 52)), " ", "_") AS ?filename)
    BIND (REPLACE(SUBSTR(STR(?image), 52), "%20", "_") AS ?filenameUnencoded)
    BIND (MD5(?filename) AS ?MD5)
    BIND (URI(CONCAT("https://upload.wikimedia.org/wikipedia/commons/",
                     SUBSTR(?MD5, 1, 1), "/", SUBSTR(?MD5, 1, 2), "/",
                     ?filenameUnencoded)) AS ?contentUrl)
  }
} AS %get_some_images_from_Wikidata
WHERE
{
  INCLUDE %get_some_images_from_Wikidata
  OPTIONAL { ?file schema:contentUrl ?contentUrl . }
}
This yields (at the time of writing):
| images | files |
| ------ | ----- |
| 6326 | 6318 |
That leaves 8 images unaccounted for. Of those, 7 have no structured data
defined, and the last one has newly added structured data, so it possibly
wasn't present in the latest dump we reload from.
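For anyone who wants to check a single file by hand, the URL construction done
by the BIND clauses in the query can be sketched in Python. This is only a
rough illustration, not how the query service works internally: the function
name content_url is made up for this example, PREFIX is my assumption about
the 51 characters that SUBSTR(..., 52) skips, and urllib's unquote only
approximates wikibase:decodeUri.

```python
import hashlib
from urllib.parse import unquote

# Assumed prefix of wdt:P18 values; offset 52 in the query skips these 51 chars.
PREFIX = "http://commons.wikimedia.org/wiki/Special:FilePath/"
UPLOAD_BASE = "https://upload.wikimedia.org/wikipedia/commons/"


def content_url(image_uri: str) -> str:
    """Hypothetical helper mirroring the BIND clauses of the query."""
    encoded = image_uri[len(PREFIX):]                   # SUBSTR(STR(?image), 52)
    filename = unquote(encoded).replace(" ", "_")       # decoded name, hashed
    filename_unencoded = encoded.replace("%20", "_")    # encoded name, kept in URL
    md5 = hashlib.md5(filename.encode("utf-8")).hexdigest()
    # Path is <first hex digit>/<first two hex digits>/<encoded filename>
    return f"{UPLOAD_BASE}{md5[0]}/{md5[:2]}/{filename_unencoded}"


if __name__ == "__main__":
    print(content_url(PREFIX + "Caf%C3%A9%20au%20lait.jpg"))
```

The point to notice is that the MD5 is taken over the decoded filename while
the URL itself keeps the percent-encoded form, which is the mismatch the
modified query works around.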
Please let me know if this resolves the issue.
TASK DETAIL
https://phabricator.wikimedia.org/T269302