[
https://issues.apache.org/jira/browse/YARN-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16777798#comment-16777798
]
Adam Antal commented on YARN-9138:
----------------------------------
Thanks for your responses [~snemeth], the patch is pretty good now.
* As of point 2, I'm still a bit concerned. I mean what happens if in
GpuDiscoverer someone gets rid of that shell call? (for e.g. because they
introduced a fancy API that would be easier to use). IMO it should be coupled
to that. I don't want to keep pushing it, but I think the best solution would
be to introduce a new function in GpuDiscoverer containing that shell command,
with @ VisibleForTesting annotation. That would essentially tie the usage of
shell to that. Can you see my point? As I said, I'm fine without it, but I see
a potential hazard in that.
* Thank you for extending the logging, but I still see a few more comments
which could be replaced by debug logging:
Specifically thinking about these:
{noformat}
//replace script with faulty one
{noformat}
{noformat}
//verify if GPUs are still hold the value of first successful query
{noformat}
Also the test case 1-2-3 could be inserted into debug logging as well:
{noformat}
// test case 1, check default setting.
{noformat}
* As of that system property I haven't thought of the usage by jenkins, so I'd
rather keep that as it is.
* As of test separations, I think it is ok, I would not transform those
further.
> Test error handling of nvidia-smi binary execution of GpuDiscoverer
> -------------------------------------------------------------------
>
> Key: YARN-9138
> URL: https://issues.apache.org/jira/browse/YARN-9138
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Major
> Attachments: YARN-9138.001.patch, YARN-9138.002.patch,
> YARN-9138.003.patch, YARN-9138.004.patch
>
>
> The code that executes nvidia-smi (doing GPU device auto-discovery) don't
> have much test coverage.
> This patch adds tests to this part of the code.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]