Let's discuss the issue in a Phabricator task; that seems more appropriate
than here (so other folks can chime in more easily).

From our traffic analysis there is no current client using model_info, so
we didn't add it to the feature set. We are working on an equivalent
solution in Lift Wing for all hosted models, not only revscoring ones, but
nothing is available yet (basically a sort of "explainer" for the model's
metadata).
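
In the meantime, a client can detect the missing parameter explicitly. A
minimal Python sketch (the endpoint and error shape match Aaron's example
quoted below; the exact model_info value is just a placeholder):

    import requests

    # Probe ores-legacy with a model_info request; the parameter is not
    # supported anymore, so the service replies with a "bad request" error.
    resp = requests.get(
        "https://ores-legacy.wikimedia.org/v3/scores/enwiki",
        params={"models": "damaging", "model_info": "statistics.thresholds"},
    )
    body = resp.json()
    if "detail" in body:
        # e.g. {"detail": {"error": {"code": "bad request", "message": ...}}}
        print(body["detail"]["error"]["message"])
    else:
        print(body)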

Luca

On Fri, Sep 22, 2023 at 6:01 PM Aaron Halfaker <aaron.halfa...@gmail.com>
wrote:

> It looks like model_info is not implemented at all.  E.g.
> https://ores-legacy.wikimedia.org/v3/scores/enwiki?model_info=statistics.thresholds.true.%22maximum+recall+@+precision+%3E=+0.9%22&models=damaging
>
> I get {"detail":{"error":{"code":"bad request","message":"model_info
> query parameter is not supported by this endpoint anymore. For more
> information please visit https://wikitech.wikimedia.org/wiki/ORES"}}}
>
> But when I go to that page, nothing discusses model_info.  Is there a way
> to get this from Lift Wing?
>
> On Fri, Sep 22, 2023 at 8:53 AM Aaron Halfaker <aaron.halfa...@gmail.com>
> wrote:
>
>> Do you have a tag for filing bugs against ORES-legacy?  I can't seem to
>> find a relevant one in phab.
>>
>> On Fri, Sep 22, 2023 at 8:39 AM Luca Toscano <ltosc...@wikimedia.org>
>> wrote:
>>
>>> Hi Aaron!
>>>
>>> Thanks for following up. The API is almost compatible with what ORES
>>> currently does, but there are limitations (like the max number of
>>> revisions in a batch, etc.). The API clearly states when something is not
>>> supported, so you can check compatibility now by making some requests to:
>>>
>>> https://ores-legacy.wikimedia.org
>>>
>>> If you open a task with a list of the systems that you need to migrate,
>>> we can definitely take a look and help. So far the traffic served by ORES
>>> has been reduced to a few clients, and none of them run with recognizable
>>> UAs (see https://meta.wikimedia.org/wiki/User-Agent_policy), so we'll try
>>> our best to support them. The migration to Lift Wing has been widely
>>> publicized, and a lot of documentation is available for migrating. We'd
>>> suggest trying Lift Wing for your systems instead (see
>>> https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing/Usage).
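>>>
>>> For a quick start, a Lift Wing call looks roughly like the following
>>> sketch (the model name and rev_id are just examples; see the Usage page
>>> above for the list of hosted models and authentication details):
>>>
>>>     import requests
>>>
>>>     # Lift Wing inference endpoint for the enwiki "damaging" model.
>>>     url = ("https://api.wikimedia.org/service/lw/inference/v1"
>>>            "/models/enwiki-damaging:predict")
>>>     resp = requests.post(url, json={"rev_id": 12345})
>>>     print(resp.json())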
>>>
>>> The Machine Learning plan is to eventually deprecate ores-legacy too, so
>>> that we maintain only one system (namely Lift Wing). There is no final
>>> date yet; we'll try to reach out to all remaining users first. So if you
>>> plan to keep using ores-legacy, please follow up with us first :)
>>>
>>> Thanks!
>>>
>>> Luca (on behalf of the ML Team)
>>>
>>> On Fri, Sep 22, 2023 at 5:10 PM Aaron Halfaker <aaron.halfa...@gmail.com>
>>> wrote:
>>>
>>>> Does the new ores-legacy support the same feature set, e.g. feature
>>>> output, injection, and threshold optimizations? Or is it just
>>>> prediction? This will affect some of the systems I need to migrate.
>>>>
>>>> On Fri, Sep 22, 2023, 06:21 Ilias Sarantopoulos <
>>>> isarantopou...@wikimedia.org> wrote:
>>>>
>>>>> Hello!
>>>>>
>>>>>
>>>>> As a next step in the deprecation process of ORES (
>>>>> https://wikitech.wikimedia.org/wiki/ORES), the Machine Learning team
>>>>> will switch the backend of ores.wikimedia.org to ores-legacy, a k8s
>>>>> application meant to provide a compatibility layer between ORES and
>>>>> Lift Wing, so users that have not yet migrated to Lift Wing will be
>>>>> migrated transparently. Ores-legacy is an application that exposes the
>>>>> same API as ORES but makes requests to Lift Wing in the background,
>>>>> allowing us to decommission the ORES servers while the remaining
>>>>> clients finish moving.
>>>>>
>>>>> This change is planned to take place on Monday, September 25th. If you
>>>>> have a client/application that is still using ORES, we expect this
>>>>> switch to be transparent for you.
>>>>>
>>>>> However, keep in mind that ores-legacy is not a 100% replacement for
>>>>> ORES, as some old and unused features are no longer supported.
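>>>>>
>>>>> Concretely, an existing ORES-style call like this sketch should keep
>>>>> working unchanged after the switch (the rev_id is a placeholder; the
>>>>> response will be produced by Lift Wing behind the scenes):
>>>>>
>>>>>     import requests
>>>>>
>>>>>     # Same ORES v3 URL as today; ores-legacy will serve it after the
>>>>>     # switch and translate it into a Lift Wing request internally.
>>>>>     resp = requests.get(
>>>>>         "https://ores.wikimedia.org/v3/scores/enwiki/12345/damaging")
>>>>>     print(resp.json())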
>>>>>
>>>>> If you see anything out of the ordinary, feel free to contact the
>>>>> Machine Learning team:
>>>>>
>>>>> IRC (Libera): #wikimedia-ml
>>>>>
>>>>> Phabricator: #Machine-Learning-Team tag
>>>>>
>>>>> Thank you!
>>>>>
>>>>>
>>>>> On Wed, Aug 9, 2023 at 1:22 PM Chaloemphon Praphuchakang <
>>>>> yoshrakpra...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>> On Tue, 8 Aug 2023, 10:45 Tilman Bayer, <haebw...@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> On Mon, Aug 7, 2023 at 11:51 AM Chris Albon <cal...@wikimedia.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Tilman,
>>>>>>>>
>>>>>>>> Most of the work is still very experimental. We have hosted a few
>>>>>>>> LLMs on Lift Wing already (StarCoder for example) but they were just
>>>>>>>> running on CPU, far too slow for real use cases. But it proves that we 
>>>>>>>> can
>>>>>>>> easily host LLMs on Lift Wing. We have been pretty quiet about it 
>>>>>>>> while we
>>>>>>>> focus on the ORES migration, but it is our next big project. More soon
>>>>>>>> hopefully!
>>>>>>>>
>>>>>>> Understood. Looking forward to learning more later!
>>>>>>>
>>>>>>>
>>>>>>>> Where we are now is that we have budget for a big GPU purchase
>>>>>>>> (~10-20 GPUs depending on cost). The question we will try to answer
>>>>>>>> after the ORES migration is complete is: which GPUs should we
>>>>>>>> purchase? We are trying to balance our strong preference to stay
>>>>>>>> open source (i.e. AMD ROCm) against a world dominated by a single
>>>>>>>> closed-source vendor (i.e. Nvidia). In addition, do we go for a few
>>>>>>>> expensive GPUs better suited to LLMs (A100, H100, etc.) or a mix of
>>>>>>>> big and small? We will need to figure all this out.
>>>>>>>>
>>>>>>> I see. On that matter, what do you folks make of the recent
>>>>>>> announcements of AMD's partnerships with Hugging Face and PyTorch[5]?
>>>>>>> (which, I understand, came after the ML team had already launched the
>>>>>>> aforementioned new AMD explorations)
>>>>>>>
>>>>>>> "Open-source AI: AMD looks to Hugging Face and Meta spinoff PyTorch
>>>>>>> to take on Nvidia [...]
>>>>>>> Both partnerships involve AMD’s ROCm AI software stack, the
>>>>>>> company’s answer to Nvidia’s proprietary CUDA platform and
>>>>>>> application-programming interface. AMD called ROCm an open and portable 
>>>>>>> AI
>>>>>>> system with out-of-the-box support that can port to existing AI models.
>>>>>>> [...B]oth AMD and Hugging Face are dedicating engineering resources to 
>>>>>>> each
>>>>>>> other and sharing data to ensure that the constantly updated AI models 
>>>>>>> from
>>>>>>> Hugging Face, which might not otherwise run well on AMD hardware, would 
>>>>>>> be
>>>>>>> “guaranteed” to work on hardware like the MI300X. [...] AMD said PyTorch
>>>>>>> will fully upstream the ROCm software stack and “provide immediate ‘day
>>>>>>> zero’ support for PyTorch 2.0 with ROCm release 5.4.2 on all AMD 
>>>>>>> Instinct
>>>>>>> accelerators,” which is meant to appeal to those customers looking to
>>>>>>> switch from Nvidia’s software ecosystem."
>>>>>>>
>>>>>>>
>>>>>>> In their own announcement, Hugging Face offered further details,
>>>>>>> including a pretty impressive list of models to be supported:[6]
>>>>>>>
>>>>>>>
>>>>>>> "We intend to support state-of-the-art transformer architectures for
>>>>>>> natural language processing, computer vision, and speech, such as BERT,
>>>>>>> DistilBERT, ROBERTA, Vision Transformer, CLIP, and Wav2Vec2. Of course,
>>>>>>> generative AI models will be available too (e.g., GPT2, GPT-NeoX, T5, 
>>>>>>> OPT,
>>>>>>> LLaMA), including our own BLOOM and StarCoder models. Lastly, we will 
>>>>>>> also
>>>>>>> support more traditional computer vision models, like ResNet and 
>>>>>>> ResNext,
>>>>>>> and deep learning recommendation models, a first for us. [...] We'll do 
>>>>>>> our
>>>>>>> best to test and validate these models for PyTorch, TensorFlow, and ONNX
>>>>>>> Runtime for the above platforms. [...] We will integrate the AMD ROCm 
>>>>>>> SDK
>>>>>>> seamlessly in our open-source libraries, starting with the transformers
>>>>>>> library."
>>>>>>>
>>>>>>>
>>>>>>> Do you think this may promise too much, or could it point to a
>>>>>>> possible solution to the Foundation's conundrum?
>>>>>>> In any case, this seems to be an interesting moment when many in AI
>>>>>>> are trying to move away from Nvidia's proprietary CUDA platform,
>>>>>>> though most of them probably more for financial and availability
>>>>>>> reasons, given the current GPU shortages[7] (which the ML team is
>>>>>>> undoubtedly aware of already; mentioning this as context for others
>>>>>>> on this list. See also Marketwatch's remarks about current
>>>>>>> margins[5]).
>>>>>>>
>>>>>>> Regards, Tilman
>>>>>>>
>>>>>>>
>>>>>>> [5]
>>>>>>> https://archive.ph/2023.06.15-173527/https://www.marketwatch.com/amp/story/open-source-ai-amd-looks-to-hugging-face-and-meta-spinoff-pytorch-to-take-on-nvidia-e4738f87
>>>>>>> [6] https://huggingface.co/blog/huggingface-and-amd
>>>>>>> [7] See e.g.
>>>>>>> https://gpus.llm-utils.org/nvidia-h100-gpus-supply-and-demand/
>>>>>>> (avoid playing the song though. Don't say I didn't warn you)
>>>>>>>
>>>>>>>
>>>>>>>> I wouldn't characterize the WMF Language Team's use of CPU as being
>>>>>>>> because of AMD; rather, at the time we didn't have the budget for
>>>>>>>> GPUs, so Lift Wing didn't have any. Since then we have moved two
>>>>>>>> GPUs onto Lift Wing for testing, but they are pretty old (2017ish).
>>>>>>>> Once we make the big GPU purchase, Lift Wing will gain a lot of
>>>>>>>> functionality for LLMs and similar models.
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On Sun, Aug 6, 2023 at 9:57 PM Tilman Bayer <haebw...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On Thu, Aug 3, 2023 at 7:16 AM Chris Albon <cal...@wikimedia.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi everybody,
>>>>>>>>>>
>>>>>>>>>> TL;DR We would like users of ORES models to migrate to our new
>>>>>>>>>> open source ML infrastructure, Lift Wing, within the next five 
>>>>>>>>>> months. We
>>>>>>>>>> are available to help you do that, from advice to making code 
>>>>>>>>>> commits. It
>>>>>>>>>> is important to note: All ML models currently accessible on ORES are 
>>>>>>>>>> also
>>>>>>>>>> currently accessible on Lift Wing.
>>>>>>>>>>
>>>>>>>>>> As part of the Machine Learning Modernization Project (
>>>>>>>>>> https://www.mediawiki.org/wiki/Machine_Learning/Modernization),
>>>>>>>>>> the Machine Learning team has deployed Wikimedia's new machine
>>>>>>>>>> learning inference infrastructure, called Lift Wing (
>>>>>>>>>> https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing).
>>>>>>>>>> Lift Wing brings a lot of new features, such as support for
>>>>>>>>>> GPU-based models, open source LLM hosting, auto-scaling,
>>>>>>>>>> stability, and the ability to host a larger number of models.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This sounds quite exciting! What's the best place to read up on
>>>>>>>>> that planned support for GPU-based models and open source LLMs? (I
>>>>>>>>> also saw in the recent NYT article[1] that the team is "in the
>>>>>>>>> process of adapting A.I. models that are 'off the shelf' —
>>>>>>>>> essentially models that have been made available by researchers for
>>>>>>>>> anyone to freely customize — so that Wikipedia's editors can use
>>>>>>>>> them for their work.")
>>>>>>>>>
>>>>>>>>> I'm aware of the history[2] of not being able to use NVIDIA
>>>>>>>>> GPUs due to their CUDA drivers being proprietary. It was mentioned 
>>>>>>>>> recently
>>>>>>>>> in the Wikimedia AI Telegram group that this is still a serious 
>>>>>>>>> limitation,
>>>>>>>>> despite some new explorations with AMD GPUs[3] - to the point that 
>>>>>>>>> e.g. the
>>>>>>>>> WMF's Language team has resorted to using models without GPU support 
>>>>>>>>> (CPU
>>>>>>>>> only).[4]
>>>>>>>>> It sounds like there is reasonable hope that this situation could
>>>>>>>>> change fairly soon? Would it also mean both at the same time, i.e. 
>>>>>>>>> open
>>>>>>>>> source LLMs running with GPU support (considering that at least some
>>>>>>>>> well-known ones appear to require torch.cuda.is_available() == True 
>>>>>>>>> for
>>>>>>>>> that)?
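>>>>>>>>>
>>>>>>>>> For what it's worth, my understanding (an assumption worth
>>>>>>>>> verifying on actual hardware) is that ROCm builds of PyTorch expose
>>>>>>>>> the HIP backend through the same torch.cuda API, so a device check
>>>>>>>>> like this sketch can pass on AMD GPUs too:
>>>>>>>>>
>>>>>>>>>     import torch
>>>>>>>>>
>>>>>>>>>     # On ROCm builds, torch.cuda.is_available() reports the HIP
>>>>>>>>>     # backend; torch.version.hip is a version string there and
>>>>>>>>>     # None on CUDA builds.
>>>>>>>>>     device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
>>>>>>>>>     print(device, getattr(torch.version, "hip", None))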
>>>>>>>>>
>>>>>>>>> Regards, Tilman
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://www.nytimes.com/2023/07/18/magazine/wikipedia-ai-chatgpt.html
>>>>>>>>> [2]
>>>>>>>>> https://techblog.wikimedia.org/2020/04/06/saying-no-to-proprietary-code-in-production-is-hard-work-the-gpu-chapter/
>>>>>>>>> [3] https://phabricator.wikimedia.org/T334583 etc.
>>>>>>>>> [4]
>>>>>>>>> https://diff.wikimedia.org/2023/06/13/mint-supporting-underserved-languages-with-open-machine-translation/
>>>>>>>>> or https://thottingal.in/blog/2023/07/21/wikiqa/ (experimental
>>>>>>>>> but, I understand, written to be deployable on WMF infrastructure)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> With the creation of Lift Wing, the team is turning its attention
>>>>>>>>>> to deprecating the current machine learning infrastructure, ORES.
>>>>>>>>>> ORES served us really well over the years; it was a successful
>>>>>>>>>> project, but it came before radical changes in technology like
>>>>>>>>>> Docker, Kubernetes, and more recently MLOps. The servers that run
>>>>>>>>>> ORES are at the end of their planned lifespan, so to save costs we
>>>>>>>>>> are going to shut them down in early 2024.
>>>>>>>>>>
>>>>>>>>>> We have outlined a deprecation path on Wikitech (
>>>>>>>>>> https://wikitech.wikimedia.org/wiki/ORES); please read the page if
>>>>>>>>>> you are a maintainer of a tool or code that uses the ORES endpoint
>>>>>>>>>> (https://ores.wikimedia.org/). If you have any doubts or need
>>>>>>>>>> assistance in migrating to Lift Wing, feel free to contact the ML
>>>>>>>>>> team via:
>>>>>>>>>>
>>>>>>>>>> - Email: m...@wikimedia.org
>>>>>>>>>> - Phabricator: #Machine-Learning-Team tag
>>>>>>>>>> - IRC (Libera): #wikimedia-ml
>>>>>>>>>>
>>>>>>>>>> The Machine Learning team is available to help projects migrate,
>>>>>>>>>> from offering advice to making code commits. We want to make this as 
>>>>>>>>>> easy
>>>>>>>>>> as possible for folks.
>>>>>>>>>>
>>>>>>>>>> High-level timeline:
>>>>>>>>>>
>>>>>>>>>> *By September 30th, 2023:* Infrastructure powering the ORES API
>>>>>>>>>> endpoint will be migrated from ORES to Lift Wing. For users, the
>>>>>>>>>> API endpoint will remain the same, and most users won't notice any
>>>>>>>>>> change; only the backend services powering the endpoint will
>>>>>>>>>> change.
>>>>>>>>>>
>>>>>>>>>> Details: We'd like to add a DNS CNAME that points
>>>>>>>>>> ores.wikimedia.org to ores-legacy.wikimedia.org, a new endpoint
>>>>>>>>>> that offers an almost complete replacement of the ORES API,
>>>>>>>>>> calling Lift Wing behind the scenes. In an ideal world we'd
>>>>>>>>>> migrate all tools to Lift Wing before decommissioning the
>>>>>>>>>> infrastructure behind ores.wikimedia.org, but that turned out to
>>>>>>>>>> be really challenging, so to avoid disrupting users we chose to
>>>>>>>>>> implement a transition layer/API.
>>>>>>>>>>
>>>>>>>>>> To summarize, if you don't have time to migrate to Lift Wing
>>>>>>>>>> before September, your code/tool should work just fine on
>>>>>>>>>> ores-legacy.wikimedia.org, and thanks to the DNS CNAME you won't
>>>>>>>>>> have to change a line of your code. The ores-legacy endpoint is
>>>>>>>>>> not a 100% replacement for ORES; we removed some very old and
>>>>>>>>>> unused features, so we highly recommend at least testing the new
>>>>>>>>>> endpoint for your use case (see the sketch below) to avoid
>>>>>>>>>> surprises when we make the switch. In case you find anything
>>>>>>>>>> weird, please report it to us using the aforementioned channels.
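>>>>>>>>>>
>>>>>>>>>> A minimal smoke test might look like the sketch below (the rev_id
>>>>>>>>>> and model are placeholders; adapt the path to whatever your tool
>>>>>>>>>> actually calls):
>>>>>>>>>>
>>>>>>>>>>     import requests
>>>>>>>>>>
>>>>>>>>>>     # Compare your current ORES call against ores-legacy before
>>>>>>>>>>     # the CNAME switch; any differing response is worth reporting.
>>>>>>>>>>     path = "/v3/scores/enwiki/12345/damaging"
>>>>>>>>>>     old = requests.get("https://ores.wikimedia.org" + path).json()
>>>>>>>>>>     new = requests.get("https://ores-legacy.wikimedia.org" + path).json()
>>>>>>>>>>     print("match" if old == new else "differs")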
>>>>>>>>>>
>>>>>>>>>> *September to January:* We will be reaching out to every user of
>>>>>>>>>> ORES we can identify and working with them to make the migration
>>>>>>>>>> process as easy as possible.
>>>>>>>>>>
>>>>>>>>>> *By January 2024:* If all goes well, we would like zero traffic
>>>>>>>>>> on the ORES API endpoint so we can turn off the ores-legacy API.
>>>>>>>>>>
>>>>>>>>>> If you want more information about Lift Wing, please check
>>>>>>>>>> https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing
>>>>>>>>>>
>>>>>>>>>> Thanks in advance for your patience and help!
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>
>>>>>>>>>> The Machine Learning Team
_______________________________________________
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
