Another thing I would be very happy to see in the future is a greater, more
systematic collaboration with Internet Archive.
I'm convinced that it's a vital part of our ecosystem, because it makes easy
a lot of things that would otherwise have to be done by skilled users (like
creating a PDF/DjVu, running OCR, etc.).
When I explain Wikisource I always explain Internet Archive first,
teaching people to upload their files there, then into Commons/Wikisource
via the "IA Upload" tool.

This is why the Italian Wikisource community created a dedicated collection
on IA:
https://archive.org/details/itwikisource

To create a collection, you need at least 50 items, and then you can ask
Internet Archive to give you permission.
Right now, Alex Brollo is writing some scripts that will allow better
maintenance of the metadata;
we'll share them when they are ready.

If you create a collection, please tell us: we could even have a larger
"Wikisource" collection that contains all the language-specific collections.

Maybe this is a bit OT for the strategy, but I think it suggests a way to
improve the collaboration between us and IA.

On Fri, Mar 24, 2017 at 10:50 AM, Andrea Zanni <[email protected]>
wrote:

> Anyone else?
> It would be very good to know the gist of the discussions/opinions you are
> having in your local Wikisource.
>
> The Italian Wikisource for example is summing this up here:
> https://meta.wikimedia.org/wiki/Strategy/Wikimedia_movement/2017/Sources/Italian_Wikisource_Village_pump
>
> For us, there is a bit of a disagreement about the idea and goal of being
> a "library" versus being a "typography": being a library is more focused on
> access and on services built upon texts (text analysis, text mining,
> searching, hyperlinking, annotation), while being a typography is about the
> transcribing/proofreading part, which needs a whole different level of
> tools and interface.
>
> Maybe you are having a similar discussion?
> Do you possibly see a "fork", in the future, of Wikisource into two different
> projects, or at least two different interfaces?
>
> Aubrey
>
> On Mon, Mar 20, 2017 at 10:54 PM, Andrea Zanni <[email protected]>
> wrote:
>
>> @Micru: of course, as you say, machine learning is the elephant in the
>> room.
>> I dream of something we could call "Wikisource as a platform":
>> meaning an environment with structured data and workflows where you can
>> have APIs
>> and tools to interact with humans and machines, both for input and for
>> output.
>> We could have OCR software that learns from our human proofreaders, and
>> ideally we could
>> even have OCRs tailored for specific centuries or types of books.
>> We could use machine learning to look for citations within books (for
>> example other cited books or authors). [1]
>> This could greatly improve our library:
>> on Internet Archive or Google Books we have millions of books that are just
>> waiting for us to make them
>> readable and accessible, and, of course, connect them to Wikipedia, to
>> Wikidata, to other Wikisource books.
>>
>> IMHO, this is obviously important for GLAMs:
>> we could be much more usable and easy for libraries, archives and museums
>> that want to upload their texts and books into Wikisource, and make them
>> part of our hyperlinked library.
>> They could easily import into Wikisource, and could export as well.
>> Right now, this is impossible or at least very difficult. [2]
>>
>> I'm not sure that all these features could go in just one project, but
>> it's probably worth trying.
>>
>> Aubrey
>>
>> [1] I remember I explored the idea with Amir, but I couldn't follow up.
>> [2] To get all the data I needed from Wikisource books, I had to
>> basically scrape the website.
>>
>> On Mon, Mar 20, 2017 at 8:14 PM, Pine W <[email protected]> wrote:
>>
>>> Glad to see this discussion. Pinging Alex Stinson for this discussion in
>>> case he has any insights to add from a GLAM perspective.
>>>
>>> Pine
>>>
>>>
>>> On Mon, Mar 20, 2017 at 7:48 AM, David Cuenca Tudela <[email protected]>
>>> wrote:
>>>
>>>> On Sun, Mar 19, 2017 at 9:44 PM, Asaf Bartov <[email protected]>
>>>> wrote:
>>>>
>>>>> what might be the significant role our unique advantage might play in
>>>>> 15 years?
>>>>>
>>>>
>>>> There are some circumstantial aspects that might be relevant for
>>>> Wikisource:
>>>> - With the emergence of machine learning, do volunteers really need to
>>>> spend so much time formatting? Or will we be able to use our data to train a
>>>> system to do some pre-formatting for us?
>>>> - With the existing flood of data, can we consider ws as a relevancy
>>>> setter? If a document has been transcribed/imported into wikisource, is
>>>> that enough to make the document relevant?
>>>> - Considering that not all libraries might have the resources to
>>>> develop their own platform, can Wikisource be used as a neutral platform by
>>>> external agents as a complement to their own infrastructure?
>>>>
>>>> Regarding the 15 years time frame, it might be a good exercise to
>>>> examine different scenarios. Yes, one could be to think big, to expect
>>>> growth and a favorable environment. But what about the opposite? What if
>>>> there are *fewer* people able to contribute?
>>>>
>>>> Cheers,
>>>> Micru
>>>>
>>>>
>>>> _______________________________________________
>>>> Wikisource-l mailing list
>>>> [email protected]
>>>> https://lists.wikimedia.org/mailman/listinfo/wikisource-l
>>>>
>>>>
>>>
>>
>