[ 
https://issues.apache.org/jira/browse/YARN-10651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290636#comment-17290636
 ] 

Haibo Chen commented on YARN-10651:
-----------------------------------

Relevant ResourceManager (RM) log:

 
{code:java}
6553854:2021-02-24 17:06:33,934 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6553856:2021-02-24 17:06:33,935 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: 
xxx.linkedin.com:8041 Node Transitioned from RUNNING to UNHEALTHY

6667464:2021-02-24 17:06:43,316 INFO 
org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: Gracefully 
decommission node xxx.linkedin.com:8041 with state UNHEALTHY

6667894:2021-02-24 17:06:43,344 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Put Node 
xxx.linkedin.com:8041 in DECOMMISSIONING.
6667896:2021-02-24 17:06:43,344 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: 
xxx.linkedin.com:8041 Node Transitioned from UNHEALTHY to DECOMMISSIONING
 
6674223:2021-02-24 17:06:44,019 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6685460:2021-02-24 17:06:45,021 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6694638:2021-02-24 17:06:46,021 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6708206:2021-02-24 17:06:46,482 INFO 
org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: No action for 
node xxx.linkedin.com:8041 with state DECOMMISSIONING
6713019:2021-02-24 17:06:47,064 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6722017:2021-02-24 17:06:48,022 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6731628:2021-02-24 17:06:49,024 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6743847:2021-02-24 17:06:50,063 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6753586:2021-02-24 17:06:51,026 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6762950:2021-02-24 17:06:52,028 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6772642:2021-02-24 17:06:53,081 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6781739:2021-02-24 17:06:54,033 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6791184:2021-02-24 17:06:55,036 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6800168:2021-02-24 17:06:56,034 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6810070:2021-02-24 17:06:57,035 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6818683:2021-02-24 17:06:58,035 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6828506:2021-02-24 17:06:59,036 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6839367:2021-02-24 17:07:00,036 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6850881:2021-02-24 17:07:01,037 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6861549:2021-02-24 17:07:02,039 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6872306:2021-02-24 17:07:03,056 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6884036:2021-02-24 17:07:04,098 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6894894:2021-02-24 17:07:05,101 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6904638:2021-02-24 17:07:06,101 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6914225:2021-02-24 17:07:07,101 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6929980:2021-02-24 17:07:08,112 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6948455:2021-02-24 17:07:09,102 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6959772:2021-02-24 17:07:10,104 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6971261:2021-02-24 17:07:11,104 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6983585:2021-02-24 17:07:12,104 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
6994753:2021-02-24 17:07:13,104 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7005445:2021-02-24 17:07:14,104 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7016624:2021-02-24 17:07:15,104 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7029087:2021-02-24 17:07:16,105 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7041839:2021-02-24 17:07:17,114 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7042252:2021-02-24 17:07:17,145 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
 Removed node xxx.linkedin.com:8041 clusterResource: <memory:2315087872, 
vCores:647710>
7050888:2021-02-24 17:07:17,733 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7062152:2021-02-24 17:07:18,592 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7065757:2021-02-24 17:07:18,902 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7071889:2021-02-24 17:07:19,371 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7072317:2021-02-24 17:07:19,406 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7075667:2021-02-24 17:07:19,656 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7078668:2021-02-24 17:07:19,907 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7079262:2021-02-24 17:07:19,958 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7080710:2021-02-24 17:07:20,084 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7082508:2021-02-24 17:07:20,278 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7089971:2021-02-24 17:07:20,778 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7092388:2021-02-24 17:07:20,912 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7102442:2021-02-24 17:07:21,734 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7114518:2021-02-24 17:07:22,767 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7124922:2021-02-24 17:07:23,746 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7136328:2021-02-24 17:07:24,783 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7148549:2021-02-24 17:07:25,740 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7160566:2021-02-24 17:07:26,737 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7172563:2021-02-24 17:07:27,778 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7187126:2021-02-24 17:07:28,750 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7202876:2021-02-24 17:07:29,748 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7216621:2021-02-24 17:07:30,743 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7229029:2021-02-24 17:07:31,739 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7248473:2021-02-24 17:07:32,740 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7273930:2021-02-24 17:07:33,806 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7285734:2021-02-24 17:07:34,789 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7298188:2021-02-24 17:07:35,789 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7316815:2021-02-24 17:07:36,789 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7329535:2021-02-24 17:07:37,794 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7343201:2021-02-24 17:07:38,787 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS
7358260:2021-02-24 17:07:39,787 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node 
xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - 
modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM 
PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL 
PASS,modules.CGROUP PASS

7358512-2021-02-24 17:07:39,798 FATAL 
org.apache.hadoop.yarn.event.EventDispatcher: Error in handling event type 
NODE_RESOURCE_UPDATE to the Event Dispatcher
7358513-java.lang.NullPointerException
7358514- at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNodeResource(AbstractYarnScheduler.java:809)
7358515- at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeAndQueueResource(CapacityScheduler.java:1116)
7358516- at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1505)
7358517- at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
7358518- at 
org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
7358519- at java.lang.Thread.run(Thread.java:748)
7358520:2021-02-24 17:07:39,798 INFO 
org.apache.hadoop.yarn.event.EventDispatcher: Exiting, bbye..
{code}
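
In the log above, CapacityScheduler reports "Removed node" at 17:07:17,145, the node keeps reporting UNHEALTHY, and the NPE is thrown at 17:07:39,798 while handling NODE_RESOURCE_UPDATE. One reading consistent with that ordering (an assumption, the log does not confirm it) is that the update arrived for a node the scheduler no longer tracked. The sketch below illustrates that pattern and a defensive guard in isolation. It is not Hadoop code: the class NodeResourceUpdateSketch and its methods are hypothetical stand-ins, and only the method name updateNodeResource is borrowed from the stack trace.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a scheduler's node bookkeeping; not the YARN implementation. */
final class NodeResourceUpdateSketch {
    // Maps a node id such as "host:8041" to its tracked memory in MB.
    private final Map<String, Long> nodes = new ConcurrentHashMap<>();

    void addNode(String nodeId, long memoryMb) {
        nodes.put(nodeId, memoryMb);
    }

    void removeNode(String nodeId) {
        nodes.remove(nodeId);
    }

    /**
     * Applies a resource update only if the node is still tracked.
     * Returns false for a stale update instead of dereferencing a missing node.
     */
    boolean updateNodeResource(String nodeId, long newMemoryMb) {
        // Map.replace is atomic here and returns null when no mapping exists,
        // so a node removed concurrently is skipped rather than re-added.
        return nodes.replace(nodeId, newMemoryMb) != null;
    }

    public static void main(String[] args) {
        NodeResourceUpdateSketch scheduler = new NodeResourceUpdateSketch();
        scheduler.addNode("host:8041", 1024L);
        System.out.println(scheduler.updateNodeResource("host:8041", 2048L));
        scheduler.removeNode("host:8041");
        // The late update is ignored because the node is no longer tracked.
        System.out.println(scheduler.updateNodeResource("host:8041", 4096L));
    }
}
```

Whether a stale update should be dropped silently, logged, or rejected earlier in the event pipeline is a design choice this sketch does not settle.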
 

> CapacityScheduler crashed with NPE in 
> AbstractYarnScheduler.updateNodeResource() 
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-10651
>                 URL: https://issues.apache.org/jira/browse/YARN-10651
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>            Priority: Major
>
> {code:java}
> 2021-02-24 17:07:39,798 FATAL org.apache.hadoop.yarn.event.EventDispatcher: 
> Error in handling event type NODE_RESOURCE_UPDATE to the Event Dispatcher
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNodeResource(AbstractYarnScheduler.java:809)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeAndQueueResource(CapacityScheduler.java:1116)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1505)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:748)
> 2021-02-24 17:07:39,798 INFO org.apache.hadoop.yarn.event.EventDispatcher: 
> Exiting, bbye..{code}


