[
https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16165119#comment-16165119
]
Devaraj K commented on YARN-6620:
---------------------------------
Thanks [~leftnoteasy] for the responses.
bq. My understanding of JAXBContext is mostly used when we need to convert
between object and XML/JSON. Since output of nvidia-smi is a customized XML
format, which doesn't follow JAXB standard. Is it still best practice to use
JAXBContext under such use case? For example, FairScheduler parses XML file
directly: AllocationFileLoaderService#reloadAllocations.
JAXBContext can be used with any XML format; the XML does not have to follow a
specific schema. The sample format in the patch can be mapped to a Java object,
so we can eliminate the traversing and parsing logic in
GpuDeviceInformationParser.java.
bq. I considered this option before, unless there's strong need for this to run
different command or call Nvidia native APIs directly, I would prefer to hard
code to use nvidia-smi instead of introducing another abstraction layer. I'm
open to do refactoring to support this case once we have such requirements.
I think it would be useful for users who have created symlinks with names other
than the hard-coded one. We don't have to add a new configuration for the
executable; instead, the binary name can be part of DEFAULT_NM_GPU_PATH_TO_EXEC,
and users can provide the full path, including the executable name, in the
configuration 'yarn.nodemanager.resource.gpu.path-to-executables'.
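A minimal sketch of what I mean, not code from the patch: the configured value
is treated as the full path to the binary, so a symlink with any name works.
The class name, the plain Map standing in for the Hadoop Configuration, and the
default value /usr/bin/nvidia-smi are assumptions made for illustration only.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

// Hypothetical helper; the name and shape are illustrative, not from the patch.
final class GpuExecutablePath {
    // Configuration key quoted in this comment.
    static final String PATH_KEY =
        "yarn.nodemanager.resource.gpu.path-to-executables";

    // Assumed default: directory plus binary name in a single value.
    static final String DEFAULT_NM_GPU_PATH_TO_EXEC = "/usr/bin/nvidia-smi";

    private GpuExecutablePath() {
    }

    // Returns the configured full path to the executable, or the default
    // when the key is not set. A Map stands in for the real Configuration.
    static Path resolveExecutable(Map<String, String> conf) {
        String value = conf.getOrDefault(PATH_KEY, DEFAULT_NM_GPU_PATH_TO_EXEC);
        return Paths.get(value.trim());
    }
}
```

With this shape, a user whose symlink is /opt/gpu/bin/gpu-info only has to set
the existing key to that path; no separate setting for the binary name is
needed.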
> [YARN-6223] NM Java side code changes to support isolate GPU devices by using
> CGroups
> -------------------------------------------------------------------------------------
>
> Key: YARN-6620
> URL: https://issues.apache.org/jira/browse/YARN-6620
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Wangda Tan
> Assignee: Wangda Tan
> Attachments: YARN-6620.001.patch, YARN-6620.002.patch,
> YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch
>
>
> This JIRA plans to add support for:
> 1) GPU configuration for NodeManagers
> 2) Isolation in CGroups (Java side).
> 3) NM restart and recovery of allocated GPU devices