[ 
https://issues.apache.org/jira/browse/YARN-9118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756544#comment-16756544
 ] 

Szilard Nemeth edited comment on YARN-9118 at 1/30/19 8:48 PM:
---------------------------------------------------------------

Hi [~pbacsko]!
Thanks for your review comments once again!

When {{lastDiscoveredGpuInformation.getGpus()}} is null, 
{{parseGpuDevicesFromAutoDiscoveredGpuInfo()}} returns an empty list, since no 
GPU devices were found. 
The error is handled elsewhere; see the two callers of 
{{GpuDiscoverer.getInstance().getGpusUsableByYarn()}}.
I also added a debug log here to indicate that no GPU was found by 
the auto-discovery binary, since this case can legitimately happen.
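
A minimal sketch of the null handling described above, not the code from the patch. The class name {{DiscoveredGpuParser}}, the method signature, the "index:minorNumber" output form and the use of java.util.logging are all assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical stand-in for the parsing step; not the actual GpuDiscoverer code.
final class DiscoveredGpuParser {
  private static final Logger LOG =
      Logger.getLogger(DiscoveredGpuParser.class.getName());

  private DiscoveredGpuParser() {}

  /**
   * Converts the minor numbers reported by auto-discovery into
   * "index:minorNumber" strings (assumed form). A null input means the
   * discovery binary reported no GPUs, so an empty list is returned and
   * the callers decide whether that is an error.
   */
  static List<String> parse(List<Integer> discoveredMinorNumbers) {
    if (discoveredMinorNumbers == null) {
      // FINE is the java.util.logging counterpart of a debug-level message.
      LOG.fine("No GPU was found by the auto-discovery binary");
      return Collections.emptyList();
    }
    List<String> devices = new ArrayList<>();
    for (int i = 0; i < discoveredMinorNumbers.size(); i++) {
      devices.add(i + ":" + discoveredMinorNumbers.get(i));
    }
    return devices;
  }
}
```

Returning an empty list rather than throwing keeps the parsing step free of policy, which matches the point above that the error is handled by the callers.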

Please note that the latest patch also contains some additional, very 
minor cleanup.


was (Author: snemeth):
Hi [~pbacsko]!
Thanks for your review comments once again!

When lastDiscoveredGpuInformation.getGpus() is null, the return value of 
{{parseGpuDevicesFromAutoDiscoveredGpuInfo()}} will be an empty-list since no 
GPU devices are found. 
The error is handled elsewhere, see the 2 callers of 
{{GpuDiscoverer.getInstance().getGpusUsableByYarn()}}.
Anyway, I added a debug log here as well to indicate that no GPU was found.

Please note that the latest patch contains some additional cleanup, but very 
minor ones.

> Handle issues with parsing user defined GPU devices in GpuDiscoverer
> --------------------------------------------------------------------
>
>                 Key: YARN-9118
>                 URL: https://issues.apache.org/jira/browse/YARN-9118
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Major
>         Attachments: YARN-9118.001.patch, YARN-9118.002.patch, 
> YARN-9118.003.patch, YARN-9118.004.patch
>
>
> getGpusUsableByYarn has the following issues: 
> - Duplicate GPU device definitions are not rejected: This seems to be the 
> biggest issue, as it could inflate the number of devices on the node if a 
> device ID is defined two or more times.
> - An empty string is accepted and behaves as if the user did not want to use 
> auto-discovery and had not defined any GPU devices: This results in an 
> empty device list, but there is no explicit empty-string check in 
> the code, so this behavior is just coincidental.
> - GPU device IDs (separated by commas) are not validated as numbers.
> Many test cases are added, as the existing coverage was very low.
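
The three checks listed in the description could be sketched roughly as follows. This is an illustration, not the attached patch: the class name {{GpuDeviceListParser}}, the assumed "index:minorNumber" entry format, the use of {{IllegalArgumentException}}, and the choice to reject an empty string outright are all assumptions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone sketch; not the code from the YARN-9118 patch.
final class GpuDeviceListParser {
  private GpuDeviceListParser() {}

  /**
   * Parses a comma-separated list such as "0:0,1:1" (assumed entry
   * format: index:minorNumber) and returns the normalized entries in
   * input order. Empty input, non-numeric values and duplicate entries
   * are rejected explicitly.
   */
  static List<String> parse(String devices) {
    // Explicit empty-string check, so the outcome is no longer coincidental.
    if (devices == null || devices.trim().isEmpty()) {
      throw new IllegalArgumentException("GPU device list is empty");
    }
    Set<String> seen = new HashSet<>();
    List<String> result = new ArrayList<>();
    for (String entry : devices.split(",")) {
      String[] parts = entry.trim().split(":");
      if (parts.length != 2) {
        throw new IllegalArgumentException(
            "Malformed GPU device entry: '" + entry + "'");
      }
      int index;
      int minorNumber;
      try {
        // Number validation on both halves of the entry.
        index = Integer.parseInt(parts[0].trim());
        minorNumber = Integer.parseInt(parts[1].trim());
      } catch (NumberFormatException e) {
        throw new IllegalArgumentException(
            "Non-numeric GPU device entry: '" + entry + "'", e);
      }
      if (index < 0 || minorNumber < 0) {
        throw new IllegalArgumentException(
            "Negative value in GPU device entry: '" + entry + "'");
      }
      String normalized = index + ":" + minorNumber;
      // Set.add returns false if the entry was already present.
      if (!seen.add(normalized)) {
        throw new IllegalArgumentException(
            "Duplicate GPU device entry: '" + entry + "'");
      }
      result.add(normalized);
    }
    return result;
  }
}
```

Under these assumptions, {{GpuDeviceListParser.parse("0:0,1:1")}} yields two entries, while {{"0:0,0:0"}}, {{"a:1"}} and {{""}} each raise an exception instead of silently changing the device count.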



