[jira] [Commented] (YARN-10988) Spark application stuck at ACCEPTED state at spark-submit

unical1988 (Jira) Mon, 25 Oct 2021 09:56:20 -0700


    [ 
https://issues.apache.org/jira/browse/YARN-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17433874#comment-17433874
 ]


unical1988 commented on YARN-10988:
-----------------------------------

Actually, I tried that and it didn't work (neither for vcores nor for memory). 
I then checked the slave's log from the master's UI, and it returned 
something like this:

 
{noformat}
2021-10-25 12:22:53,788 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
2021-10-25 12:22:53,960 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 57482.
2021-10-25 12:22:53,960 INFO netty.NettyBlockTransferService: Server created on slaveVM1:57482
2021-10-25 12:22:53,960 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2021-10-25 12:22:53,976 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, slaveVM1, 57482, None)
2021-10-25 12:22:53,976 INFO storage.BlockManagerMasterEndpoint: Registering block manager slaveVM1:57482 with 366.3 MiB RAM, BlockManagerId(driver, slaveVM1, 57482, None)
2021-10-25 12:22:53,976 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, slaveVM1, 57482, None)
2021-10-25 12:22:53,976 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, slaveVM1, 57482, None)
2021-10-25 12:22:54,194 INFO ui.ServerInfo: Adding filter to /metrics/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2021-10-25 12:22:54,194 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7fe6122a{/metrics/json,null,AVAILABLE,@Spark}
2021-10-25 12:22:54,288 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
2021-10-25 12:22:54,366 INFO yarn.YarnRMClient: Registering the ApplicationMaster
2021-10-25 12:22:56,433 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-10-25 12:22:58,467 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-10-25 12:23:00,502 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-10-25 14:32:25,915 INFO retry.RetryInvocationHandler: java.net.ConnectException: Your endpoint configuration is wrong; For more details see: http://wiki.apache.org/hadoop/UnsetHostnameOrPort, while invoking ApplicationMasterProtocolPBClientImpl.registerApplicationMaster over null after 5 failover attempts. Trying to failover after sleeping for 27785ms.
{noformat}
 

 

What does that mean?

PS: I set this in yarn-site.xml on the NodeManager node to tell the slave 
where the ResourceManager is:

 

 

{code:xml}
<property>
   <name>yarn.resourcemanager.hostname</name>
   <value>masterVM2</value>
</property>
{code}
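
The 0.0.0.0:8030 address in the log above matches YARN's shipped defaults, where yarn.resourcemanager.hostname falls back to 0.0.0.0 and yarn.resourcemanager.scheduler.address is derived from it on port 8030. One possible reading is that the process that logged it did not load a yarn-site.xml containing the property shown here. The following is a minimal sketch, not a definitive diagnosis: it parses the text of a yarn-site.xml and prints the scheduler address that would result. The helper names (read_properties, effective_scheduler_address, is_wildcard) are hypothetical, and the sample XML simply wraps the property above in a configuration element.

```python
import xml.etree.ElementTree as ET

# Defaults as shipped in yarn-default.xml: the hostname falls back to
# 0.0.0.0 and the scheduler address is built from it on port 8030.
DEFAULT_RM_HOSTNAME = "0.0.0.0"
DEFAULT_SCHEDULER_PORT = 8030


def read_properties(xml_text):
    """Return a {name: value} dict from the text of a Hadoop *-site.xml file."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def effective_scheduler_address(props):
    """Resolve the scheduler address implied by the given properties."""
    host = props.get("yarn.resourcemanager.hostname", DEFAULT_RM_HOSTNAME)
    explicit = props.get("yarn.resourcemanager.scheduler.address")
    if explicit:
        # Expand the one variable reference this sketch knows about.
        return explicit.replace("${yarn.resourcemanager.hostname}", host)
    return f"{host}:{DEFAULT_SCHEDULER_PORT}"


def is_wildcard(address):
    """True when the host part is the unusable 0.0.0.0 wildcard."""
    return address.split(":")[0] == "0.0.0.0"


# Sample input: the property from this comment inside a <configuration> root.
sample = """<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>masterVM2</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    with_setting = effective_scheduler_address(read_properties(sample))
    without_setting = effective_scheduler_address({})
    print(with_setting, is_wildcard(with_setting))        # masterVM2:8030 False
    print(without_setting, is_wildcard(without_setting))  # 0.0.0.0:8030 True
```

Running the same check against the yarn-site.xml that is actually on the classpath of the node where the ApplicationMaster container starts would show whether the setting is visible there.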

> Spark application stuck at ACCEPTED state at spark-submit
> ---------------------------------------------------------
>
>                 Key: YARN-10988
>                 URL: https://issues.apache.org/jira/browse/YARN-10988
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: applications
>    Affects Versions: 3.2.1
>            Reporter: unical1988
>            Priority: Major
>
> Hello,
>  
> I have configured and set up a Hadoop cluster over 2 nodes and launched it 
> along with YARN like so: 
>  
> *_On the master node :_* 
>  * hdfs namenode -regular
>  * yarn resourcemanager
>  
> *_On the slave node :_* 
>  * hdfs datanode -regular
>  * yarn nodemanager
> The UI shows that a connection has been established between the two 
> machines that form the cluster.
> Note that *_start-dfs_* on the master node started both the namenode and 
> the datanode, even after setting the *_slaves_* and *_hosts_* files.
> Now I submit an application (a simple _hello world_) to _Yarn_ through this 
> command:
> *Spark-submit --class "main" --master yarn pathToJar*
>  
> But i get the error 
> 15/08/29 12:07:58 INFO Client: ApplicationManager is waiting for the ResourceManager 
> client token: N/A
> diagnostics: N/A
> ApplicationMaster host: N/A
> ApplicationMaster RPC port: -1
> queue: root.hdfs
> start time: 1440864477580
> final status: UNDEFINED
> tracking URL: http://chd2.moneyball.guru:8088/proxy/application_1440861466017_0007/
> user: hdfs
> 15/08/29 12:07:59 INFO Client: Application report for application_1440861466017_0007 (state: ACCEPTED)
> 15/08/29 12:08:00 INFO Client: Application report for application_1440861466017_0007 (state: ACCEPTED)
> 15/08/29 12:08:01 INFO Client: Application report for application_1440861466017_0007 (state: ACCEPTED)...
>  
> What am I missing in my configuration?
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

