[jira] [Commented] (YARN-9138) Test error handling of nvidia-smi binary execution of GpuDiscoverer

Adam Antal (JIRA) Thu, 21 Feb 2019 10:46:37 -0800


    [ 
https://issues.apache.org/jira/browse/YARN-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774399#comment-16774399
 ]


Adam Antal commented on YARN-9138:
----------------------------------

Hi [~snemeth], thanks for the patch!

I have some minor comment regarding it.
 * In your patch you create fakeBinaries a lot. Creation is quite similar in 
every case, can they be extracted into a separate function? It'd make your code 
more readable and compact.
 * I do not really like these few lines:
{noformat}
Shell.execCommand(Shell.getSetPermissionCommand(EXEC_PERMISSION, false,
    fakeBinary.getAbsolutePath()));
{noformat}
Since we only mimic the behaviour of calling commands like this, can we replace 
it with mock? Or rather just can we place the creation of the file there?
 What I was specifically thinking about is that you create a bash file, and 
manually run it, and it outputs an another file. This does not test the 
behaviour of GpuDiscoverer, just add a plus layer of complexity to the test. If 
GpuDiscoverer had a function like this, we could call that, but otherwise I 
think instead of writing out a bash file which creates another file, It would 
be much easier to simply create the file itself in one step. What do you think 
about my idea? If you'd argue that the ability of running the shell command is 
important as well, than I'd test it here, in this file, but in a separate test 
case.

 * The test cases (not only yours) lack of logging, also there are comments, 
for e.g. in {{testGetGpuDeviceInformationFaultyNvidiaSmiScriptConsecutiveRun}} 
that you added like:
{noformat}
//make sure to query nvidia-smi correctly, once
{noformat}
which would look better if it were logged out in debug log. Would you mind 
moving then into a debug log? (It'd make the code more self-documented through 
the logging.) If you may think that it is not relevant from the point of this 
original issue, we can handle it in a subtask under YARN-9304, with your 
approval.

(For that issue I'd also have some more comments: - but that's no concerning 
your patch if you decide so.
 - Instead of getting the test folder from {{getTestParentFolder}}, we can move 
it to a static final String variable?
 - Is {{testLinuxGpuResourceDiscoverPluginConfig}} ever get executed as the 
system property RunLinuxGpuResourceDiscoverPluginConfigTest is not set in other 
place of the code?
 - Also that cases from {{testLinuxGpuResourceDiscoverPluginConfig}} can be 
seperated into different tests.
 - {{getNumberOfUsableGpusFromConfig}} should be seperated into two seperate 
tests as well, testing the "illegal" and "valid" formats.)

> Test error handling of nvidia-smi binary execution of GpuDiscoverer
> -------------------------------------------------------------------
>
>                 Key: YARN-9138
>                 URL: https://issues.apache.org/jira/browse/YARN-9138
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Major
>         Attachments: YARN-9138.001.patch, YARN-9138.002.patch, 
> YARN-9138.003.patch
>
>
> The code that executes nvidia-smi (doing GPU device auto-discovery) don't 
> have much test coverage.
> This patch adds tests to this part of the code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (YARN-9138) Test error handling of nvidia-smi binary execution of GpuDiscoverer

Reply via email to