Thank you Nahum,

Could you indicate which OCR solution you are using?

On 26/03/2018 at 17:27, Nahum Wengrov wrote:
I frequently work offline on he.wikisource. I download the entire PDF file from Commons to my hard drive and OCR the page I need myself. I guess one can also use the Wikisource OCR and download the text, page by page. Then I proofread the text in a Word document open on the lower half of my screen, with the PDF open in Acrobat Reader on the upper half, where I go to the page I need and scroll both windows up or down as needed.

On Mon, Mar 26, 2018 at 11:21 AM, mathieu stumpf guntz wrote:

    On 24/03/2018 at 16:22, billinghurst wrote:
    Though that would defeat the purpose of online proofreading with
    account verification. Some of the true value of our online
    process is that contribution builds a level of trust and
    knowledge and that is reflected in both our patrolling and the
    allocation of autopatrolled status.
    How would providing tools for doing batch work offline interfere
    in any way with that? Once the work is done, it can be uploaded to
    Wikisource with whichever account the user wants.

    Actually, to my mind, the main benefit of the online aspect is the
    peer-to-peer production model. Also, there is no need for a central
    node holding accounts in order to take into account the trust given
    to a particular contributor. There are digital signature
    technologies, such as GPG for example. Having a central node with a
    web interface just makes things easier for most users; it doesn't
    improve the trustworthiness of the environment. On the contrary,
    with a single point of failure, we actually rely on a weaker
    solution in this regard.

    Also how would you have access to templates, and components like
    that, from offline?
    Well, that just shows how inefficient these tools are for
    continuing to contribute while offline. It's always possible to
    install MediaWiki and download the required templates, but
    currently this process seems way too complicated, doesn't it?

    Also we generally cannot download the images separately as that
    is usually part of the later clean-up where people have the
    technical skills.
    I'm afraid the term "image" misled your answer. It seems you
    interpreted it as picture elements from files, while I was
    talking about the files themselves.

    So yes, there is the capacity to have the text and proofread the
    text, that actual checking the text against the image is not the
    sole component of proofreading, and further it would not be at
    all helpful for validation.
    There is nothing magic about working directly in a browser. People
    do download and upload all the required material anyway, but on a
    page-by-page basis. The result is just as valid when transactions
    are operated at the file repository level.


    Wikisource-l mailing list
