[Wikitech-l] Re: ORES To Lift Wing Migration

Aaron Halfaker Fri, 22 Sep 2023 08:54:21 -0700

Do you have a tag for filing bugs against ORES-legacy?  I can't seem to
find a relevant one in phab.


On Fri, Sep 22, 2023 at 8:39 AM Luca Toscano <ltosc...@wikimedia.org> wrote:

> Hi Aaron!
>
> Thanks for following up. The API is almost compatible with what ORES
> currently does, but there are limitations (like the max number of revisions
> in a batch, etc.). The API clearly states when something is not supported,
> so you can check its compatibility now by making some requests to:
>
> https://ores-legacy.wikimedia.org
>
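
A minimal sketch of such a compatibility check, not part of the original message. It assumes ores-legacy accepts the same `/v3/scores/{context}/{revid}/{model}` path layout as the ORES v3 API; the wiki, revision ID, model name, and User-Agent string are placeholders.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "https://ores-legacy.wikimedia.org"
# Placeholder User-Agent; replace with one that identifies your tool and contact.
USER_AGENT = "my-ores-client/0.1 (https://example.org/my-tool; me@example.org)"


def score_url(context: str, rev_id: int, model: str) -> str:
    """Build an ORES-v3-style scoring URL (the path layout is an assumption)."""
    return f"{BASE_URL}/v3/scores/{context}/{rev_id}/{model}"


def check_request(url: str, timeout: float = 10.0) -> tuple[int, object]:
    """Send one GET request and return (HTTP status, parsed JSON or raw text)."""
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            status, body = resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # An unsupported feature may surface as an HTTP error with a message body.
        status, body = err.code, err.read().decode("utf-8")
    try:
        return status, json.loads(body)
    except json.JSONDecodeError:
        return status, body
```

Calling `check_request(score_url("enwiki", 12345, "damaging"))` performs a real network request and returns the status code together with whatever the service reports, which is where an "unsupported" message would show up.
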
> If you open a task with a list of systems that you need to migrate we can
> definitely take a look and help. So far the traffic being served by ORES
> has been reduced to a few clients, none of which use recognizable UAs
> (see https://meta.wikimedia.org/wiki/User-Agent_policy), so we'll try our
> best to support them. The migration to Lift Wing has been widely
> publicized, and a lot of migration documentation is available. We'd
> suggest trying Lift Wing for your systems instead (see
> https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing/Usage).
>
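
For comparison, a sketch of a direct Lift Wing request, not part of the original message. It assumes the public API Gateway base URL and the `{wiki}-{model}:predict` naming described on the usage page linked above; check that page for the current URL scheme. The model name, revision ID, and User-Agent are placeholders.

```python
import json
import urllib.request

# Assumed base URL of the public Lift Wing endpoint (verify on the usage page).
LIFTWING_BASE = "https://api.wikimedia.org/service/lw/inference/v1/models"
# Placeholder User-Agent; replace with one that identifies your tool and contact.
USER_AGENT = "my-ores-client/0.1 (https://example.org/my-tool; me@example.org)"


def build_predict_request(model: str, rev_id: int) -> urllib.request.Request:
    """Prepare (but do not send) a POST request scoring one revision."""
    payload = json.dumps({"rev_id": rev_id}).encode("utf-8")
    return urllib.request.Request(
        f"{LIFTWING_BASE}/{model}:predict",
        data=payload,
        headers={"Content-Type": "application/json", "User-Agent": USER_AGENT},
        method="POST",
    )


def predict(model: str, rev_id: int, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_predict_request(model, rev_id)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes it possible to inspect the URL and payload without touching the network; `predict("enwiki-damaging", 12345)` would issue the actual call.
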
> The Machine Learning plan is to eventually deprecate ores-legacy too, to
> maintain only one system (namely Lift Wing). There is no final date yet,
> we'll try to reach out to all remaining users first, so if you plan to keep
> using ores-legacy please follow up with us first :)
>
> Thanks!
>
> Luca (on behalf of the ML Team)
>
> On Fri, Sep 22, 2023 at 5:10 PM Aaron Halfaker <aaron.halfa...@gmail.com>
> wrote:
>
>> Does the new ores-legacy support the same feature set, e.g. features
>> output, injection, and threshold optimizations? Or is it just prediction?
>> This will affect some of the systems I need to migrate.
>>
>> On Fri, Sep 22, 2023, 06:21 Ilias Sarantopoulos <
>> isarantopou...@wikimedia.org> wrote:
>>
>>> Hello!
>>>
>>>
>>> As a next step in the deprecation process of ORES
>>> https://wikitech.wikimedia.org/wiki/ORES the Machine Learning team will
>>> switch the backend of ores.wikimedia.org to ores-legacy, a k8s
>>> application meant to provide a compatibility layer between ORES and Lift
>>> Wing so users that have not yet migrated to Lift Wing will be
>>> transparently migrated. Ores-legacy is an application that has the same API
>>> as ORES but in the background makes requests to Lift Wing, allowing us to
>>> decommission the ORES servers before all clients have moved.
>>>
>>> This change is planned to take place on Monday 25th of September. If
>>> you have a client/application that is still using ORES we expect that this
>>> switch is going to be transparent for you.
>>>
>>> However, keep in mind that ores-legacy is not a 100% replacement for ORES
>>> as some old and unused features are no longer supported.
>>>
>>> If you see anything out of the ordinary, feel free to contact the
>>> Machine Learning team:
>>>
>>> IRC libera: #wikimedia-ml
>>>
>>> Phabricator: Machine-Learning-team tag
>>>
>>> Thank you!
>>>
>>>
>>> On Wed, Aug 9, 2023 at 1:22 PM Chaloemphon Praphuchakang <
>>> yoshrakpra...@gmail.com> wrote:
>>>
>>>>
>>>> On Tue, 8 Aug 2023, 10:45 Tilman Bayer, <haebw...@gmail.com> wrote:
>>>>
>>>>>
>>>>> Hi Chris,
>>>>>
>>>>> On Mon, Aug 7, 2023 at 11:51 AM Chris Albon <cal...@wikimedia.org>
>>>>> wrote:
>>>>>
>>>>>> Hi Tilman,
>>>>>>
>>>>>> Most of the work is still very experimental. We have hosted a few
>>>>>> LLMs on Lift Wing already (StarCoder for example) but they were just
>>>>>> running on CPU, far too slow for real use cases. But it proves that we
>>>>>> can easily host LLMs on Lift Wing. We have been pretty quiet about it
>>>>>> while we focus on the ORES migration, but it is our next big project.
>>>>>> More soon hopefully!
>>>>>>
>>>>> Understood. Looking forward to learning more later!
>>>>>
>>>>>
>>>>>> Where we are now is that we have budget for a big GPU purchase
>>>>>> (~10-20 GPUs depending on cost). The question we will try to answer after
>>>>>> the ORES migration is complete is: what GPUs should we purchase? We are
>>>>>> trying to balance our strong preference to stay open source (i.e. AMD
>>>>>> ROCm) in a world dominated by a single closed source vendor (i.e. Nvidia).
>>>>>> In addition, do we go for a few expensive GPUs better suited to LLMs
>>>>>> (A100, H100, etc.) or a mix of big and small? We will need to figure out
>>>>>> all this.
>>>>>>
>>>>> I see. On that matter, what do you folks make of the recent
>>>>> announcements of AMD's partnerships with Hugging Face and Pytorch[5]?
>>>>> (which, I understand, came after the ML team had already launched the
>>>>> aforementioned new AMD explorations)
>>>>>
>>>>> "Open-source AI: AMD looks to Hugging Face and Meta spinoff PyTorch to
>>>>> take on Nvidia [...]
>>>>> Both partnerships involve AMD’s ROCm AI software stack, the company’s
>>>>> answer to Nvidia’s proprietary CUDA platform and application-programming
>>>>> interface. AMD called ROCm an open and portable AI system with
>>>>> out-of-the-box support that can port to existing AI models. [...B]oth AMD
>>>>> and Hugging Face are dedicating engineering resources to each other and
>>>>> sharing data to ensure that the constantly updated AI models from Hugging
>>>>> Face, which might not otherwise run well on AMD hardware, would be
>>>>> “guaranteed” to work on hardware like the MI300X. [...] AMD said PyTorch
>>>>> will fully upstream the ROCm software stack and “provide immediate ‘day
>>>>> zero’ support for PyTorch 2.0 with ROCm release 5.4.2 on all AMD Instinct
>>>>> accelerators,” which is meant to appeal to those customers looking to
>>>>> switch from Nvidia’s software ecosystem."
>>>>>
>>>>>
>>>>> In their own announcement, Hugging Face offered further details,
>>>>> including a pretty impressive list of models to be supported:[6]
>>>>>
>>>>>
>>>>> "We intend to support state-of-the-art transformer architectures for
>>>>> natural language processing, computer vision, and speech, such as BERT,
>>>>> DistilBERT, ROBERTA, Vision Transformer, CLIP, and Wav2Vec2. Of course,
>>>>> generative AI models will be available too (e.g., GPT2, GPT-NeoX, T5, OPT,
>>>>> LLaMA), including our own BLOOM and StarCoder models. Lastly, we will also
>>>>> support more traditional computer vision models, like ResNet and ResNext,
>>>>> and deep learning recommendation models, a first for us. [..] We'll do our
>>>>> best to test and validate these models for PyTorch, TensorFlow, and ONNX
>>>>> Runtime for the above platforms. [...] We will integrate the AMD ROCm SDK
>>>>> seamlessly in our open-source libraries, starting with the transformers
>>>>> library."
>>>>>
>>>>>
>>>>> Do you think this may promise too much, or could it point to a
>>>>> possible solution of the Foundation's conundrum?
>>>>> In any case, this seems to be an interesting moment where many in AI
>>>>> are trying to move away from Nvidia's proprietary CUDA platform. Most of
>>>>> them probably more for financial and availability reasons though, given
>>>>> the current GPU shortages[7] (which the ML team is undoubtedly aware of
>>>>> already; mentioning this as context for others on this list. See also
>>>>> Marketwatch's remarks about current margins[5]).
>>>>>
>>>>> Regards, Tilman
>>>>>
>>>>>
>>>>> [5]
>>>>> https://archive.ph/2023.06.15-173527/https://www.marketwatch.com/amp/story/open-source-ai-amd-looks-to-hugging-face-and-meta-spinoff-pytorch-to-take-on-nvidia-e4738f87
>>>>> [6] https://huggingface.co/blog/huggingface-and-amd
>>>>> [7] See e.g.
>>>>> https://gpus.llm-utils.org/nvidia-h100-gpus-supply-and-demand/ (avoid
>>>>> playing the song though. Don't say I didn't warn you)
>>>>>
>>>>>
>>>>>> I wouldn't characterize WMF's Language Team using CPU as being because
>>>>>> of AMD; rather, at the time we didn't have the budget for GPUs, so Lift
>>>>>> Wing didn't have any. Since then we have moved two GPUs onto Lift Wing for
>>>>>> testing but they are pretty old (2017ish). Once we make the big GPU
>>>>>> purchase Lift Wing will gain a lot of functionality for LLM and similar
>>>>>> models.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On Sun, Aug 6, 2023 at 9:57 PM Tilman Bayer <haebw...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Thu, Aug 3, 2023 at 7:16 AM Chris Albon <cal...@wikimedia.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi everybody,
>>>>>>>>
>>>>>>>> TL;DR We would like users of ORES models to migrate to our new open
>>>>>>>> source ML infrastructure, Lift Wing, within the next five months. We
>>>>>>>> are available to help you do that, from advice to making code commits.
>>>>>>>> It is important to note: All ML models currently accessible on ORES are
>>>>>>>> also currently accessible on Lift Wing.
>>>>>>>>
>>>>>>>> As part of the Machine Learning Modernization Project (
>>>>>>>> https://www.mediawiki.org/wiki/Machine_Learning/Modernization),
>>>>>>>> the Machine Learning team has deployed Wikimedia’s new machine
>>>>>>>> learning inference infrastructure, called Lift Wing (
>>>>>>>> https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing).
>>>>>>>> Lift Wing brings a lot of new features such as support for GPU-based
>>>>>>>> models, open source LLM hosting, auto-scaling, stability, and the
>>>>>>>> ability to host a larger number of models.
>>>>>>>>
>>>>>>>
>>>>>>> This sounds quite exciting! What's the best place to read up on that
>>>>>>> planned support for GPU-based models and open source LLMs? (I also saw in
>>>>>>> the recent NYT article[1] that the team is "in the process of adapting A.I.
>>>>>>> models that are 'off the shelf' — essentially models that have been made
>>>>>>> available by researchers for anyone to freely customize — so that
>>>>>>> Wikipedia’s editors can use them for their work.")
>>>>>>>
>>>>>>> I'm aware of the history[2] of not being able to use NVIDIA GPUs due
>>>>>>> to their CUDA drivers being proprietary. It was mentioned recently in
>>>>>>> the Wikimedia AI Telegram group that this is still a serious limitation,
>>>>>>> despite some new explorations with AMD GPUs[3] - to the point that e.g.
>>>>>>> the WMF's Language team has resorted to using models without GPU support
>>>>>>> (CPU only).[4]
>>>>>>> It sounds like there is reasonable hope that this situation could
>>>>>>> change fairly soon? Would it also mean both at the same time, i.e. open
>>>>>>> source LLMs running with GPU support (considering that at least some
>>>>>>> well-known ones appear to require torch.cuda.is_available() == True for
>>>>>>> that)?
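
A sketch of the kind of device check being referred to, not part of the original message. It assumes PyTorch may or may not be installed. Note that ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` interface, so this check is not Nvidia-specific by itself.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU backend, else "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so only CPU execution is possible.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

A model that hard-requires `torch.cuda.is_available() == True` would refuse to run whenever this function returns "cpu".
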
>>>>>>>
>>>>>>> Regards, Tilman
>>>>>>>
>>>>>>> [1]
>>>>>>> https://www.nytimes.com/2023/07/18/magazine/wikipedia-ai-chatgpt.html
>>>>>>> [2]
>>>>>>> https://techblog.wikimedia.org/2020/04/06/saying-no-to-proprietary-code-in-production-is-hard-work-the-gpu-chapter/
>>>>>>> [3] https://phabricator.wikimedia.org/T334583 etc.
>>>>>>> [4]
>>>>>>> https://diff.wikimedia.org/2023/06/13/mint-supporting-underserved-languages-with-open-machine-translation/
>>>>>>> or https://thottingal.in/blog/2023/07/21/wikiqa/ (experimental but,
>>>>>>> I understand, written to be deployable on WMF infrastructure)
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> With the creation of Lift Wing, the team is turning its attention
>>>>>>>> to deprecating the current machine learning infrastructure, ORES. ORES
>>>>>>>> served us really well over the years. It was a successful project, but
>>>>>>>> it came before radical changes in technology like Docker, Kubernetes
>>>>>>>> and more recently MLOps. The servers that run ORES are at the end of
>>>>>>>> their planned lifespan and so to save cost we are going to shut them
>>>>>>>> down in early 2024.
>>>>>>>>
>>>>>>>> We have outlined a deprecation path on Wikitech (
>>>>>>>> https://wikitech.wikimedia.org/wiki/ORES). Please read the page if
>>>>>>>> you are a maintainer of a tool or code that uses the ORES endpoint
>>>>>>>> (https://ores.wikimedia.org/). If you have any doubt or if you need
>>>>>>>> assistance in migrating to Lift Wing, feel free to contact the ML team
>>>>>>>> via:
>>>>>>>>
>>>>>>>> - Email: m...@wikimedia.org
>>>>>>>> - Phabricator: #Machine-Learning-Team tag
>>>>>>>> - IRC (Libera): #wikimedia-ml
>>>>>>>>
>>>>>>>> The Machine Learning team is available to help projects migrate,
>>>>>>>> from offering advice to making code commits. We want to make this as
>>>>>>>> easy as possible for folks.
>>>>>>>>
>>>>>>>> High Level timeline:
>>>>>>>>
>>>>>>>> *By September 30th 2023:* Infrastructure powering the ORES API
>>>>>>>> endpoint will be migrated from ORES to Lift Wing. For users, the API
>>>>>>>> endpoint will remain the same, and most users won’t notice any change;
>>>>>>>> only the backend services powering the endpoint will change.
>>>>>>>>
>>>>>>>> Details: We'd like to add a DNS CNAME that points
>>>>>>>> ores.wikimedia.org to ores-legacy.wikimedia.org, a new endpoint
>>>>>>>> that offers an almost complete replacement of the ORES API, calling
>>>>>>>> Lift Wing behind the scenes. In an ideal world we'd migrate all tools to
>>>>>>>> Lift Wing before decommissioning the infrastructure behind
>>>>>>>> ores.wikimedia.org, but it turned out to be really challenging, so to
>>>>>>>> avoid disrupting users we chose to implement a transition layer/API.
>>>>>>>>
>>>>>>>> To summarize, if you don't have time to migrate before September to
>>>>>>>> Lift Wing, your code/tool should work just fine on
>>>>>>>> ores-legacy.wikimedia.org and you'll not have to change a line in
>>>>>>>> your code thanks to the DNS CNAME. The ores-legacy endpoint is not a
>>>>>>>> 100% replacement for ORES: we removed some very old and unused
>>>>>>>> features, so we highly recommend at least testing the new endpoint for
>>>>>>>> your use case to avoid surprises when we make the switch. In case you
>>>>>>>> find anything weird, please report it to us using the aforementioned
>>>>>>>> channels.
>>>>>>>>
>>>>>>>> *September to January:* We will be reaching out to every user of
>>>>>>>> ORES we can identify and working with them to make the migration
>>>>>>>> process as easy as possible.
>>>>>>>>
>>>>>>>> *By January 2024:* If all goes well, we would like zero traffic
>>>>>>>> on the ORES API endpoint so we can turn off the ores-legacy API.
>>>>>>>>
>>>>>>>> If you want more information about Lift Wing, please check
>>>>>>>> https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing
>>>>>>>>
>>>>>>>> Thanks in advance for the patience and the help!
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> The Machine Learning Team
>>>>>>>> _______________________________________________
>>>>>>>> Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
>>>>>>>> To unsubscribe send an email to
>>>>>>>> wikitech-l-le...@lists.wikimedia.org
>>>>>>>>
>>>>>>>> https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
