There's also this new Phab task, which looks at a more limited
first step:


Investigation: Could we build a Tool Labs project to generate Djvu files for 
WikiSource
https://phabricator.wikimedia.org/T154538







On Tue, 3 Jan 2017, at 07:46 AM, Alex Brollo wrote:

> One great advantage of djvu files over pdf files is visible in the
> current file list of any IA item. IA has removed the djvu files,
> but it still builds and publishes a _djvu.xml file. Why? I presume
> that IA uses that file to "map words" in its book viewer, since it
> has a good text structure while being *pretty simple*. It can be
> translated into hOCR, and after editing its text nodes the edited
> text can be loaded back into the djvu file. Italian Wikisource is
> testing, on some texts, tricks to mass-fix the djvu text layer
> (removing scannos etc.) *before* uploading it to Commons.
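
As a rough illustration of the "edit the text nodes" idea above, here is a minimal Python sketch. It assumes the _djvu.xml file stores each recognised word in a WORD element with a coords attribute; the sample document, the scanno table and the function name fix_scannos are all made up for this example, and writing the result back into a djvu file is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample imitating the word-level layout assumed for _djvu.xml.
SAMPLE = """<DjVuXML><BODY><OBJECT><HIDDENTEXT><PAGECOLUMN><REGION><PARAGRAPH>
<LINE><WORD coords="10,20,50,5">tbe</WORD> <WORD coords="60,20,120,5">book</WORD></LINE>
</PARAGRAPH></REGION></PAGECOLUMN></HIDDENTEXT></OBJECT></BODY></DjVuXML>"""

# Hypothetical scanno table: OCR misreading -> intended word.
SCANNOS = {"tbe": "the", "aud": "and"}


def fix_scannos(xml_text, scannos=SCANNOS):
    """Replace known scannos in WORD text nodes; return (new_xml, fix_count)."""
    root = ET.fromstring(xml_text)
    fixed = 0
    for word in root.iter("WORD"):
        replacement = scannos.get(word.text)
        if replacement is not None:
            # Only the text node changes; the coords attribute is untouched,
            # so the word keeps its position on the page image.
            word.text = replacement
            fixed += 1
    return ET.tostring(root, encoding="unicode"), fixed


if __name__ == "__main__":
    new_xml, count = fix_scannos(SAMPLE)
    print(count, "word(s) fixed")
```

Because only the text of each word is replaced, the word coordinates survive the edit, which is what makes this kind of mass fix attractive.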
> 

> It's a pity IMHO that this magic book format has been disregarded. Its
> structure is *open* just as the pdf structure is *closed*.
> 

> Alex

> 

> 

> 

> 2017-01-03 0:19 GMT+01:00 Sam Wilson <[email protected]>:


>> I wonder if, rather than creating a new IA item, we should just
>> link the original IA item to the DjVu on Commons (via a review)? Or
>> is there a discoverability benefit to be had by having the DjVu
>> also on IA?
>> 

>> 

>> On Tue, 3 Jan 2017, at 07:07 AM, Sam Wilson wrote:

>>> Good idea. I guess it's not ideal to end up with two items, but at
>>> least the 2nd will be updateable from our end.
>>> 

>>> It looks like we can add HTML links to IA reviews too, which is
>>> nice: https://archive.org/details/spinoza_etica_paravia
>>> 

>>> 

>>> On Mon, 2 Jan 2017, at 11:52 PM, Alex Brollo wrote:

>>>> Done :-)

>>>> 

>>>> Alex

>>>> 

>>>> 2017-01-02 16:49 GMT+01:00 Alex Brollo <[email protected]>:

>>>>> Please take a look at
>>>>> https://archive.org/details/spinoza_etica_paravia_djvu; this is
>>>>> precisely a djvu-only item that I uploaded some days ago. I asked
>>>>> in the IA forum for permission to create "djvu-only items" and I
>>>>> got it; this is the first item I created. As you can see, there is
>>>>> an "implicit convention" too: the item name is the original one
>>>>> plus a _djvu suffix (it was derived from
>>>>> https://archive.org/details/spinoza_etica_paravia), and the
>>>>> metadata are the same, except for a standard warning, "Derived
>>>>> from files into L'Etica[1]", in the description field.
>>>>> 

>>>>> So far I have not done the last step, i.e. adding a "backlink"
>>>>> from the original item to the derived one.
>>>>> 

>>>>> internetarchive.py makes it possible to automate the whole job
>>>>> (downloading the metadata of the source item, building the new
>>>>> item name, adding the warning to the description field, and
>>>>> uploading the new item).
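
The four steps listed above could be sketched roughly as follows with the internetarchive Python library. This is only a sketch under assumptions: the helper names, the list of copied metadata fields and the exact form of the warning are invented for illustration, and the upload step needs the library installed and IA credentials configured.

```python
DERIVED_SUFFIX = "_djvu"  # naming convention described in the message above

# Assumption: which source fields are worth copying to the derived item.
COPIED_FIELDS = ("title", "creator", "date", "language", "subject", "mediatype")


def derived_identifier(source_id):
    """Build the derived item name: original identifier plus the suffix."""
    return source_id + DERIVED_SUFFIX


def derived_metadata(source_id, source_metadata):
    """Copy selected fields and prepend the warning to the description."""
    md = {k: source_metadata[k] for k in COPIED_FIELDS if k in source_metadata}
    # Illustrative wording modelled on the warning quoted in the thread.
    warning = f"Derived from files into https://archive.org/details/{source_id}"
    original = source_metadata.get("description", "")
    if isinstance(original, list):  # the description may hold several values
        original = "\n".join(original)
    md["description"] = f"{warning}\n\n{original}".strip()
    return md


def upload_derived(source_id, djvu_path):
    """Fetch the source metadata and upload the djvu file as a new item."""
    # Imported here so the helpers above work without the library installed.
    from internetarchive import get_item, upload

    source = get_item(source_id)
    return upload(
        derived_identifier(source_id),
        files=[djvu_path],
        metadata=derived_metadata(source_id, source.metadata),
    )
```

For the item discussed here, derived_identifier("spinoza_etica_paravia") gives "spinoza_etica_paravia_djvu". The backlink from the original item to the derived one is not covered by this sketch.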
>>>>> 

>>>>> 

>>> 

>>> 

>>> _________________________________________________

>>> Wikisource-l mailing list

>>> [email protected]

>>> https://lists.wikimedia.org/mailman/listinfo/wikisource-l

>>> 

>> 

>> 


>> 





Links:

  1. https://archive.org/details/spinoza_etica_paravia
