| Cyberpower678 added a comment. |
In T143488#4673884, @Multichill wrote:

> In T143488#4132389, @Cyberpower678 wrote:
>
>> It's doable, but not easy. Wikidata has a different structure.
>
> Extending the current bot seems to be the most future-proof solution. In this task we only care about getting things into the archive, nothing else. So my guess is that parsing a Wikidata item is what you run into? Take for example https://www.wikidata.org/wiki/Q24066189 . You could force it into another format, such as https://www.wikidata.org/entity/Q24066189.rdf or https://www.wikidata.org/entity/Q24066189.json , to make it easier to find and extract URLs.
>
> Do you have some pointers to where you think the challenge is going to be? We have an upcoming hackathon and we might be able to work on this.
I am at that hackathon right now, in Thompson 150. If you care to meet me during the lunch break, I will be happy to work on this with you.
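For reference, pulling URLs out of the `.json` entity format Multichill mentions could look something like the sketch below. It assumes the standard Wikibase entity JSON shape, where URL-typed statements carry the address as a plain string datavalue; the sample entity here is hypothetical, not the real contents of Q24066189, and a full solution would also need to walk qualifiers and references.

```python
import json

def extract_urls(entity):
    """Collect URL-valued claims from a Wikidata entity JSON blob.

    Statements whose mainsnak datatype is 'url' store the address as a
    plain string datavalue, so we walk every claim and pick those out.
    (The live .json endpoint wraps the entity under an 'entities' key,
    e.g. data['entities']['Q24066189'], before this structure appears.)
    """
    urls = []
    for claims in entity.get("claims", {}).values():
        for claim in claims:
            snak = claim.get("mainsnak", {})
            if snak.get("datatype") == "url":
                value = snak.get("datavalue", {}).get("value")
                if value:
                    urls.append(value)
    return urls

# A trimmed-down, made-up example entity for illustration:
sample = {
    "id": "Q24066189",
    "claims": {
        "P973": [  # P973 = "described at URL"
            {"mainsnak": {"datatype": "url",
                          "datavalue": {"type": "string",
                                        "value": "https://example.org/record/42"}}},
        ],
        "P31": [   # non-URL claim, ignored by the walker
            {"mainsnak": {"datatype": "wikibase-item",
                          "datavalue": {"type": "wikibase-entityid",
                                        "value": {"id": "Q5"}}}},
        ],
    },
}

print(extract_urls(sample))  # -> ['https://example.org/record/42']
```

The archive bot would then feed the collected URLs into its existing save pipeline; the item-parsing step is the only genuinely new piece.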
Cc: Jane023, Multichill, Abbe98, Lydia_Pintscher, Micru, Sadads, Cyberpower678, Izno, Aklapper, abian, Nandana, tabish.shaikh91, Lahi, Gq86, GoranSMilovanovic, Soteriaspace, Jayprakash12345, JakeTheDeveloper, QZanden, dachary, merbst, LawExplorer, D3r1ck01, Wikidata-bugs, Hydriz, aude, Ricordisamoa, Sjoerddebruin, TheDJ, Mbch331, Jay8g
_______________________________________________
Wikidata-bugs mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
