[Wikisource-l] Converting pdf files into wiki markup

David Cuenca Wed, 12 Jun 2013 10:05:01 -0700

It is not a trivial matter. The best bet would be to take an existing pdf
import tool for a word processor, and try to write a similar tool for
wikitext.


There is the Oracle PDF Import Extension for OpenOffice. Its code can be
browsed, and maybe it can give you some ideas:
http://extensions.services.openoffice.org/project/pdfimport

Micru

On Wed, Jun 12, 2013 at 12:38 PM, Alex Brollo <[email protected]> wrote:

> When we tried to convert a pdf file, which is an opaque, closed format,
> into wiki code (a needed step to add links and to turn files into a
> "wiki hypertext"), the work turned into a nightmare. If we simply load
> free pdf books "as they are", I don't see any advantage except to "feed
> wikisource numbers/statistics", and this is presently far from my
> personal interest.
>
> As you may guess, I'm one of the users who don't share Aubrey's
> enthusiasm about texts born digital, even if free. :-)
>
> Alex
>
>
> 2013/6/12 David Cuenca <[email protected]>
>
>> Nobody is saying anything about using copyrighted works. There are many
>> books with an open license that would allow us to include them in
>> Wikisource.
>>
>> For instance in ca-ws we have this translation from 2009:
>>
>> http://ca.wikisource.org/wiki/Llibre:El_secret_de_l%E2%80%99or_que_creix_%282009%29.djvu
>>
>> The original is in the public domain, and the translator gave away his
>> rights. It would have been much easier to work directly with the pdf
>> instead of converting it to djvu.
>>
>> Micru
>>
>>
>> On Wed, Jun 12, 2013 at 10:47 AM, Aarti K. Dwivedi <[email protected]> wrote:
>>
>>> If I am not wrong, as of today most books that were born digital are
>>> still under copyright. Of course, they are freely available on the
>>> internet, but we can't use the pirated copies. How would we go about
>>> procuring these books?
>>> If we procure these copyrighted books, then the only thing we would have
>>> to do is check for proper formatting, wouldn't we?
>>>
>>>
>>> On Wed, Jun 12, 2013 at 7:58 PM, Lars Aronsson <[email protected]> wrote:
>>>
>>>> On 06/12/2013 02:48 PM, Andrea Zanni wrote:
>>>>
>>>>> We could define some tasks as
>>>>> * corrected the page
>>>>> * OPTIONAL added optional templates/links/annotations
>>>>> *...
>>>>>
>>>>
>>>> Geotagged all the photos, ...
>>>>
>>>> The list doesn't end. You need a generic mechanism
>>>> for any new feature you can invent. But aren't our
>>>> existing templates and categories the best way to
>>>> do this? You could just add to each page:
>>>> {{done|proofread=user1|validated=user2|geotagged=user4|...}}
>>>>
>>>>
>>>> --
>>>>   Lars Aronsson ([email protected])
>>>>   Project Runeberg - free Nordic literature - http://runeberg.org/
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Wikisource-l mailing list
>>>> [email protected]
>>>> https://lists.wikimedia.org/mailman/listinfo/wikisource-l
>>>>
>>>
>>>
>>>
>>> --
>>> Aarti K. Dwivedi
>>>
>>
>>
>> --
>> Etiamsi omnes, ego non
>


-- 
Etiamsi omnes, ego non
