Hi folks,

Starting a thread to discuss merging YARN-8200 (resource profiles/GPU
support) to branch-2.

For resource types, we have ported YARN-4081~YARN-7137 (as part of the
YARN-3926 umbrella).
For GPU support, we have ported the native non-docker GPU support related
items in YARN-6223.
For both of these, we have also ported miscellaneous fixes for issues we
encountered internally.

One potential issue I see is that some of the resource types commits did
not make it to branch-3.0, and most of the GPU-specific commits did not
make it to branch-3.0 either.

We have deployed these two features internally on top of a branch-2.9 fork
on a 100-node GPU cluster running deep learning workloads, and they are
working well.

Before the holidays and after New Year's, we will work on cleaning up the feature
branch (YARN-8200), e.g. filing tickets on branch-2 specific bug fixes,
rebasing on latest branch-2, syncing any bug fixes in our internal fork
which did not make it to the feature branch, etc. Assuming no objections,
once it's ready we will start a vote to merge.

Thanks,
Jonathan Hung
