Multichill added a comment.

It's doable, but not easy. Wikidata has a different structure.

Extending the current bot seems to be the most future-proof solution. In this task we only care about getting things into the archive, nothing else. My guess is that parsing a Wikidata item is where you run into trouble? Take for example https://www.wikidata.org/wiki/Q24066189 . You could force it into some other format like https://www.wikidata.org/entity/Q24066189.rdf or https://www.wikidata.org/entity/Q24066189.json to make it easier to find and extract the URLs.
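For the JSON route, something like the sketch below might be enough. It assumes the Special:EntityData JSON layout (claims keyed by property, URL-typed values stored as plain strings in the mainsnak's datavalue); the `sample` dict and the `extract_urls` helper are illustrative, not part of any existing bot:

```python
def extract_urls(entity: dict) -> list:
    """Collect every URL-datatype value from an entity's claims."""
    urls = []
    for statements in entity.get("claims", {}).values():
        for statement in statements:
            snak = statement.get("mainsnak", {})
            # Only "value" snaks carry a datavalue; skip novalue/somevalue.
            if snak.get("datatype") == "url" and snak.get("snaktype") == "value":
                urls.append(snak["datavalue"]["value"])
    return urls

# Minimal sample mimicking the shape of Q24066189.json (hypothetical values)
sample = {
    "claims": {
        "P856": [  # P856 = official website
            {"mainsnak": {"snaktype": "value", "datatype": "url",
                          "datavalue": {"type": "string",
                                        "value": "https://example.org/"}}}
        ]
    }
}

print(extract_urls(sample))
```

In the real bot you would fetch the .json URL above and pass `data["entities"]["Q24066189"]` into the helper; references and qualifiers would need the same treatment if you want those URLs archived too.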

Do you have some pointers where you think the challenge is going to be? We have an upcoming hackathon and we might be able to work on this.


TASK DETAIL
https://phabricator.wikimedia.org/T143488
