Multichill added a comment.

It's doable, but not easy. Wikidata has a different structure.

Extending the current bot seems to be the most future-proof solution. In this task we only care about getting things into the archive, nothing else. My guess is that parsing a Wikidata item is where you run into trouble? Take for example https://www.wikidata.org/wiki/Q24066189 . You could force it into some other format like https://www.wikidata.org/entity/Q24066189.rdf or https://www.wikidata.org/entity/Q24066189.json to make it easier to find and extract the URLs.
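For the JSON route, something like the sketch below might be enough. It assumes the Special:EntityData JSON layout (claims keyed by property, URL-typed values stored as plain strings in the mainsnak's datavalue); the `sample` dict and the `extract_urls` helper are illustrative, not part of any existing bot:

```python
def extract_urls(entity: dict) -> list:
    """Collect every URL-datatype value from an entity's claims."""
    urls = []
    for statements in entity.get("claims", {}).values():
        for statement in statements:
            snak = statement.get("mainsnak", {})
            # Only "value" snaks carry a datavalue; skip novalue/somevalue.
            if snak.get("datatype") == "url" and snak.get("snaktype") == "value":
                urls.append(snak["datavalue"]["value"])
    return urls

# Minimal sample mimicking the shape of Q24066189.json (hypothetical values)
sample = {
    "claims": {
        "P856": [  # P856 = official website
            {"mainsnak": {"snaktype": "value", "datatype": "url",
                          "datavalue": {"type": "string",
                                        "value": "https://example.org/"}}}
        ]
    }
}

print(extract_urls(sample))
```

In the real bot you would fetch the .json URL above and pass `data["entities"]["Q24066189"]` into the helper; references and qualifiers would need the same treatment if you want those URLs archived too.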

Do you have some pointers where you think the challenge is going to be? We have an upcoming hackathon and we might be able to work on this.


TASK DETAIL
https://phabricator.wikimedia.org/T143488
