[ https://issues.apache.org/jira/browse/YARN-9118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756544#comment-16756544 ]
Szilard Nemeth edited comment on YARN-9118 at 1/30/19 8:48 PM:
---------------------------------------------------------------

Hi [~pbacsko]!
Thanks for your review comments once again!
When {{lastDiscoveredGpuInformation.getGpus()}} is null, {{parseGpuDevicesFromAutoDiscoveredGpuInfo()}} returns an empty list, since no GPU devices were found. The error is handled elsewhere; see the 2 callers of {{GpuDiscoverer.getInstance().getGpusUsableByYarn()}}.
Anyway, I added a debug log here as well to indicate that no GPU was found by the auto-discovery binary, as this case can evidently happen.
Please note that the latest patch contains some additional cleanup, but only very minor changes.

was (Author: snemeth):
Hi [~pbacsko]!
Thanks for your review comments once again!
When {{lastDiscoveredGpuInformation.getGpus()}} is null, {{parseGpuDevicesFromAutoDiscoveredGpuInfo()}} returns an empty list, since no GPU devices were found. The error is handled elsewhere; see the 2 callers of {{GpuDiscoverer.getInstance().getGpusUsableByYarn()}}.
Anyway, I added a debug log here as well to indicate that no GPU was found.
Please note that the latest patch contains some additional cleanup, but only very minor changes.

> Handle issues with parsing user defined GPU devices in GpuDiscoverer
> --------------------------------------------------------------------
>
> Key: YARN-9118
> URL: https://issues.apache.org/jira/browse/YARN-9118
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Major
> Attachments: YARN-9118.001.patch, YARN-9118.002.patch,
> YARN-9118.003.patch, YARN-9118.004.patch
>
>
> getGpusUsableByYarn has the following issues:
> - Duplicate GPU device definitions are not rejected: This seems to be the
> biggest issue, as it could increase the number of devices on the node if the
> device ID is defined 2 or more times.
> - An empty string is accepted, and it is treated as if the user did not want to use
> auto-discovery and had not defined any GPU devices: This results in an
> empty device list, but there is no explicit empty-string check in
> the code, so this behavior is just coincidental.
> - GPU device IDs (separated by commas) are not validated as numbers.
>
> Many test cases were added, as the coverage was very low.
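
The three issues listed in the description (duplicate definitions, the implicit empty-string case, and missing number validation) can be illustrated with a minimal sketch. This is not the code from the YARN-9118 patches: the class GpuDeviceListParser and the method parseDeviceIds are hypothetical names, and the sketch assumes that each device is identified by a single non-negative integer in a comma-separated list, which may differ from the format the real configuration accepts.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of GpuDiscoverer.
final class GpuDeviceListParser {

    private GpuDeviceListParser() {
    }

    /**
     * Parses a comma-separated list of GPU device IDs.
     * Assumption: every ID is a plain non-negative integer.
     */
    static List<Integer> parseDeviceIds(String value) {
        if (value == null) {
            throw new IllegalArgumentException("GPU device list must not be null");
        }

        String trimmed = value.trim();
        // Explicit empty-string check, so that the empty device list
        // is a deliberate result and not a side effect of the parsing.
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }

        // LinkedHashSet keeps the configured order and detects duplicates.
        Set<Integer> seen = new LinkedHashSet<>();
        // A limit of -1 keeps trailing empty tokens, so "0,1," is rejected.
        for (String token : trimmed.split(",", -1)) {
            String candidate = token.trim();
            int id;
            try {
                id = Integer.parseInt(candidate);
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException(
                    "Invalid GPU device ID: '" + candidate + "'", e);
            }
            if (id < 0) {
                throw new IllegalArgumentException(
                    "GPU device ID must not be negative: " + id);
            }
            // Set.add returns false when the ID was already defined.
            if (!seen.add(id)) {
                throw new IllegalArgumentException(
                    "Duplicate GPU device ID: " + id);
            }
        }
        return new ArrayList<>(seen);
    }
}
```

With this sketch, parseDeviceIds("0,1,2") returns the three IDs in order, parseDeviceIds("") returns an empty list, and inputs such as "0,1,0" or "0,x" are rejected with an IllegalArgumentException. Rejecting duplicates instead of silently dropping them is a design choice made here so that a misconfiguration is visible to the user; whether the actual patch throws or only logs is not stated above.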