On Sat, Apr 1, 2023 at 11:36 PM Erik Moeller <eloque...@gmail.com> wrote:
> Openly licensed models for machine translation like Facebook's M2M
> (https://huggingface.co/facebook/m2m100_418M) or text generation like
> Cerebras-GPT-13B (https://huggingface.co/cerebras/Cerebras-GPT-13B)
> and GPT-NeoX-20B (https://huggingface.co/EleutherAI/gpt-neox-20b) seem
> like better targets for running on Wikimedia infrastructure, if
> there's any merit to be found in running them at this stage.
>
> Note that Facebook's proprietary but widely circulated LLaMA model has
> triggered a lot of work on dramatically improving performance of LLMs
> through more efficient implementations, to the point that you can run
> a decent quality LLM (and combine it with OpenAI's freely licensed
> voice detection model) on a consumer grade laptop:
>
> https://github.com/ggerganov/llama.cpp
>
> While I'm not sure if the "hallucination" problem is tractable when
> all you have is an LLM, I am confident (based on, e.g., the recent
> results with Alpaca: https://crfm.stanford.edu/2023/03/13/alpaca.html)
> that the performance of smaller models will continue to increase as we
> find better ways to train, steer, align, modularize and extend them.

Hosting open models like the ones above would be really
cool for multiple reasons, the most important being to bring
openness back into the training process. It would also bring in
the many voices from across the movement, raising social
considerations one would never think of otherwise.
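
For concreteness, serving one of the models Erik linked doesn't take much
code. A rough sketch of translating a sentence with the m2m100_418M
checkpoint, assuming the Hugging Face transformers library and PyTorch are
installed (not a production setup, just an illustration):

    # pip install transformers torch sentencepiece
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    # Download the openly licensed checkpoint from the Hugging Face hub.
    model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
    tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

    # Translate English -> German; any of the ~100 supported languages works.
    tokenizer.src_lang = "en"
    encoded = tokenizer("Wikipedia is a free online encyclopedia.", return_tensors="pt")
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id("de"),
    )
    print(tokenizer.batch_decode(generated, skip_special_tokens=True))

Wrapping something like this in a small API service is the easy part; the
hard parts are the ones discussed in this thread.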

rupert
_______________________________________________
Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org, guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and 
https://meta.wikimedia.org/wiki/Wikimedia-l
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/55MB54MTGLIIUPRGKJI2UUPHYFXV6AHT/
To unsubscribe send an email to wikimedia-l-le...@lists.wikimedia.org

Reply via email to