[
https://issues.apache.org/jira/browse/YARN-5983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988143#comment-15988143
]
Zhankun Tang commented on YARN-5983:
------------------------------------
[~wangda], thanks for the review.
Yes, I agree that YARN-3409 supplements YARN-3926 for better RM scheduling
(scheduling preference) and is not related to consumable resources.
Sorry for my misleading "exclusive" concept. Regarding
exclusive/non-exclusive resources, maybe *exclusive/non-exclusive consumable
resource* is a better term?
IMO, consumable resources can be classified into two categories. One type is
"pooled resources" like CPU, memory, bandwidth and blkio, which are abstracted
by the OS. The other type is "not-easy/possible-to-share resources" like
GPU, FPGA, SSD disk or network port. These resources are going to be
first-class citizens but need more assistance from YARN to be exposed to
applications.
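To make the two categories concrete, here is a minimal sketch of how they might differ at allocation time. It is only an illustration under my own assumptions: the class names `PooledResource` and `DeviceResource` are hypothetical and are not part of YARN or the design doc. A pooled resource is handed out as a quantity, while a device resource is handed out as whole devices with a single owner each.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PooledResource:
    """Divisible capacity, e.g. memory in MB or CPU vcores (hypothetical model)."""
    capacity: int
    used: int = 0

    def allocate(self, amount: int) -> bool:
        # Any amount fits as long as the pool is not exhausted.
        if amount <= 0 or self.used + amount > self.capacity:
            return False
        self.used += amount
        return True


@dataclass
class DeviceResource:
    """Whole devices, each held by at most one container at a time (hypothetical model)."""
    device_ids: List[str]
    owners: Dict[str, str] = field(default_factory=dict)  # device id -> container id

    def allocate(self, container_id: str, count: int) -> List[str]:
        # Only devices without an owner can be granted; no partial grants.
        free = [d for d in self.device_ids if d not in self.owners]
        if count <= 0 or count > len(free):
            return []
        granted = free[:count]
        for device_id in granted:
            self.owners[device_id] = container_id
        return granted

    def release(self, container_id: str) -> None:
        # Free every device held by the given container.
        self.owners = {d: c for d, c in self.owners.items() if c != container_id}
```

The point of the sketch is that the second kind needs per-device bookkeeping (which device, which owner), which is the extra assistance mentioned above.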
For the unclear sentences in the design doc:
{quote}
No NM side resource management of FPGA resource. For instance, dynamically
resource discovery, monitoring and preparation before container launch
{quote}
-> The NM should be able to discover the FPGA devices automatically and save
their attributes to some storage. Given an allocated container, the NM should
download the requested IP (the FPGA IP core) and flash it onto the scheduled
device. The NM also needs to run health checks on the FPGA device and on the
process that is using it.
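The NM-side steps above (discover, prepare before launch, health check) could be sketched roughly as follows. This is not the design doc's implementation: `FpgaDevice`, `FpgaNodeManagerPlugin` and the four injected callables are hypothetical names, the "storage" is just an in-memory dict, and skipping the flash when the requested IP is already on the device is my own assumption. Only the device is probed here; checking the process that uses it is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class FpgaDevice:
    device_id: str
    vendor: str
    current_ip: Optional[str] = None  # ID of the IP currently flashed, if known
    healthy: bool = True


class FpgaNodeManagerPlugin:
    """Hypothetical NM-side helper; vendor-specific work is injected as callables."""

    def __init__(
        self,
        discover: Callable[[], List[FpgaDevice]],
        download: Callable[[str], bytes],
        flash: Callable[[str, bytes], None],
        probe: Callable[[str], bool],
    ) -> None:
        self._discover = discover
        self._download = download
        self._flash = flash
        self._probe = probe
        self.devices: Dict[str, FpgaDevice] = {}  # stands in for "some storage"

    def refresh(self) -> List[str]:
        # Record newly discovered devices; keep state of already known ones.
        for dev in self._discover():
            self.devices.setdefault(dev.device_id, dev)
        return sorted(self.devices)

    def prepare(self, device_id: str, ip_id: str) -> None:
        # Called for an allocated container before it is launched.
        dev = self.devices[device_id]
        if not dev.healthy:
            raise RuntimeError(f"device {device_id} is unhealthy")
        if dev.current_ip != ip_id:  # assumption: skip re-flashing the same IP
            image = self._download(ip_id)
            self._flash(device_id, image)
            dev.current_ip = ip_id

    def health_check(self) -> List[str]:
        # Probe every known device and return the IDs of unhealthy ones.
        for dev in self.devices.values():
            dev.healthy = bool(self._probe(dev.device_id))
        return [d.device_id for d in self.devices.values() if not d.healthy]
```

Keeping discovery, download, flashing and probing behind callables is one way to leave room for different FPGA vendors without changing the NM-side flow.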
{quote}
AM set the IP UUID/name in container environment and sends requests to NM to
launch the allocated containers.
{quote}
-> As mentioned in our offline meeting, the user has to give the AM the ID of
the IP to be flashed onto the FPGA device, and the AM then sets it in the
container environment.
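As a rough illustration of passing the IP ID through the container environment: the variable name `REQUESTED_FPGA_IP_ID` and both helper functions below are made up for this sketch and are not defined by YARN or the design doc. The AM side adds the user-provided ID to the environment it sends with the launch request, and the NM side reads it back before preparing the device.

```python
from typing import Dict, Optional

# Hypothetical environment variable name, chosen only for this sketch.
FPGA_IP_ENV = "REQUESTED_FPGA_IP_ID"


def build_container_env(base_env: Dict[str, str], ip_id: str) -> Dict[str, str]:
    """AM side: copy the base environment and add the user-provided IP ID."""
    if not ip_id:
        raise ValueError("the user must provide an IP ID")
    env = dict(base_env)  # do not mutate the caller's dict
    env[FPGA_IP_ENV] = ip_id
    return env


def requested_ip(env: Dict[str, str]) -> Optional[str]:
    """NM side: return the requested IP ID, or None if the container set none."""
    return env.get(FPGA_IP_ENV)
```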
> [Umbrella] Support for FPGA as a Resource in YARN
> -------------------------------------------------
>
> Key: YARN-5983
> URL: https://issues.apache.org/jira/browse/YARN-5983
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: yarn
> Reporter: Zhankun Tang
> Assignee: Zhankun Tang
> Attachments: YARN-5983-Support-FPGA-resource-on-NM-side_v1.pdf
>
>
> As various big data workloads run on YARN, CPU will eventually no longer
> scale, and heterogeneous systems will become more important. ML/DL has been a
> rising star in recent years, and applications focused on these areas have to
> utilize GPU or FPGA to boost performance. Hardware vendors such as
> Intel also invest in such hardware. It is most likely that FPGA will become
> as popular in data centers as CPU in the near future.
> So it would be great for YARN, as a resource managing and scheduling system,
> to evolve to support this. This JIRA proposes FPGA to be a first-class citizen.
> The changes roughly include:
> 1. FPGA resource detection and heartbeat
> 2. Scheduler changes
> 3. FPGA-related preparation and isolation before container launch
> We know that YARN-3926 is trying to extend the current resource model, but
> we can still leave some FPGA-related discussion here.