[ 
https://issues.apache.org/jira/browse/YARN-9120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16721027#comment-16721027
 ] 

Zhankun Tang edited comment on YARN-9120 at 12/14/18 7:46 AM:
--------------------------------------------------------------

[~snemeth], thanks for the explanation! I agree that it's valuable to be able 
to update the available GPU devices at runtime. Three questions:
 # Could you share the new value you propose for the property 
"yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices"?
 # As I understand it, the YARN NM cannot currently reload its configuration 
at runtime, so I'm afraid the newly added value might not work as smoothly as 
intended. Personally, I don't think restarting the NM is a big deal.
 # I'm also wondering in what scenario a user would want a configuration that 
differs from node to node within the cluster. As far as I know, Ambari can 
only update configuration at the cluster level. Maybe this is useful for a 
manually managed Hadoop cluster on non-uniform servers.


was (Author: tangzhankun):
[~snemeth], thanks for the explanation! I agree that it's valuable to be able 
to update the available GPU devices at runtime. Three questions:
 # Could you share the new value you propose for the property 
"yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices"?
 # As I understand it, the YARN NM cannot currently reload its configuration 
at runtime, so I'm afraid the newly added value might not work as smoothly as 
intended. Personally, I don't think restarting the NM is a big deal.
 # I'm also wondering in what scenario a user would want a configuration that 
differs from node to node within the cluster. As far as I know, Ambari can 
only update configuration at the cluster level. Maybe this is useful for a 
manually managed Hadoop cluster.

> Need to have a way to turn off GPU auto-discovery in GpuDiscoverer
> ------------------------------------------------------------------
>
>                 Key: YARN-9120
>                 URL: https://issues.apache.org/jira/browse/YARN-9120
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Major
>
> GpuDiscoverer.getGpusUsableByYarn either parses the user-defined GPU devices 
> or expects the value 'auto' (from the property 
> yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices).
> In some circumstances, users may want to exclude a node from scheduling, so 
> they should have an option to turn off auto-discovery.
> This is already possible by removing the GPU resource-plugin from YARN's 
> config along with the GPU-related config in container-executor.cfg, but 
> doing it with a dedicated value for 
> yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices is a more 
> lightweight approach.
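
To make the proposal in the description concrete, here is a minimal, standalone sketch of how a three-way interpretation of the property value could look. It is not the actual GpuDiscoverer code. The class and method names are made up for illustration, the "none" value is a hypothetical placeholder for the dedicated "off" value (the issue does not name one), and the explicit form is assumed to be a comma-separated list of index:minor_number pairs such as "0:0,1:1".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: NOT the real GpuDiscoverer. All names here are hypothetical.
class AllowedGpuDevicesSketch {
  enum Mode { AUTO, DISABLED, EXPLICIT }

  static final class Parsed {
    final Mode mode;
    final List<String> devices;

    Parsed(Mode mode, List<String> devices) {
      this.mode = mode;
      this.devices = Collections.unmodifiableList(devices);
    }
  }

  // 'auto' is the value mentioned in the issue description.
  static final String AUTO_VALUE = "auto";
  // Hypothetical dedicated value that turns discovery off; the issue does not name it.
  static final String DISABLED_VALUE = "none";

  static Parsed parse(String raw) {
    String value = raw == null ? "" : raw.trim();
    if (value.equalsIgnoreCase(AUTO_VALUE)) {
      return new Parsed(Mode.AUTO, new ArrayList<>());
    }
    if (value.equalsIgnoreCase(DISABLED_VALUE)) {
      return new Parsed(Mode.DISABLED, new ArrayList<>());
    }
    // Assumed explicit form: comma-separated index:minor_number pairs.
    List<String> devices = new ArrayList<>();
    for (String entry : value.split(",")) {
      String device = entry.trim();
      if (!device.matches("\\d+:\\d+")) {
        throw new IllegalArgumentException("Invalid GPU device entry: '" + device + "'");
      }
      devices.add(device);
    }
    return new Parsed(Mode.EXPLICIT, devices);
  }

  public static void main(String[] args) {
    for (String v : new String[] {"auto", "none", "0:0,1:1"}) {
      Parsed p = parse(v);
      System.out.println(v + " -> " + p.mode + " " + p.devices);
    }
  }
}
```

With a value like this, a node could be excluded by changing a single property while leaving the resource-plugin and container-executor.cfg settings in place, which is the "lightweight" aspect the description refers to.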



