Dispenser kindly made a list of DjVu files on Commons linking to an Internet Archive item, with some information such as global usage: https://toolserver.org/~dispenser/temp/djvu2archive.org.txt (just change the extension to .csv to open it as a spreadsheet; it's tab-separated). It's about 5000 books with 6-200 global usages and 5000 outside that range (which probably means completely unused apart from some talk pages or the like, or with most of the text already living on wiki pages). If I manage to convince a "slash-admin", I'll get those 5000 re-OCR'd; otherwise I'll need to do it manually, so suggestions on priorities are welcome. :)

Nemo

P.s.:
The query used: http://pastebin.com/L4EXDY5F and another one: http://pastebin.com/avg3LYti

-- 2> /dev/null; date; echo '
SELECT /*SLOW_OK*/
  CONCAT("[[:File:", REPLACE(img_name, "_", " "), "]]") AS "File",
  el_from,
  img_size,
  REPLACE(img_user_text, "_", " ") AS "Uploader",
  img_timestamp AS "Timestamp",
  el_to,
  (SELECT COUNT(*) FROM globalimagelinks WHERE gil_to=page_title) AS "Usage",
  (SELECT COUNT(*) FROM oldimage WHERE oi_name=page_title) AS "Reuploads",
  SUBSTRING_INDEX(el_to,"/",-1) AS "Archive_Name"
FROM image
JOIN page ON page_namespace=6 AND page_title=CONVERT(img_name USING latin1)
JOIN externallinks ON el_from=page_id
WHERE img_media_type="BITMAP"
AND img_major_mime="image" AND img_minor_mime="vnd.djvu"
AND (el_index LIKE "http://org.archive.%/details/_%"
OR   el_index LIKE "https://org.archive.%/details/_%")
;-- ' | sql -r commonswiki_p > ~/public_html/temp/djvu2archive.org.txt; date;
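For picking priorities out of the resulting file, something like this could sort it by the "Usage" column (global usage count), highest first. A minimal sketch only: the column names follow the query above, and the sample rows here are made up, not real data from the dump.

```python
import csv
import io

# Made-up sample in the same tab-separated layout the query produces.
sample_tsv = (
    "File\tel_from\timg_size\tUploader\tTimestamp\tel_to\tUsage\tReuploads\tArchive_Name\n"
    "[[:File:Book A.djvu]]\t1\t100\tAlice\t20130101000000\thttp://archive.org/details/booka\t150\t2\tbooka\n"
    "[[:File:Book B.djvu]]\t2\t200\tBob\t20130102000000\thttp://archive.org/details/bookb\t3\t0\tbookb\n"
)

def prioritize(tsv_text):
    """Return rows sorted by global usage, most-used files first."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    return sorted(rows, key=lambda r: int(r["Usage"]), reverse=True)

for row in prioritize(sample_tsv):
    print(row["File"], row["Usage"])
```

The same function would work on the full dump by reading the .txt file instead of the sample string.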

_______________________________________________
Wikisource-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikisource-l
