Here's the exception thrown whenever the ApplicationMaster runs on one of the
worker nodes (cluster mode); increasing the Spark or YARN retry counts didn't help either:

2021-11-12 17:20:37,301 ERROR yarn.ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Exception thrown in awaitResult:
        at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:301)
        at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:504)
        at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:268)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:899)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:898)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
        at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:898)
        at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Caused by: java.net.BindException: Cannot assign requested address: bind: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:438)
        at sun.nio.ch.Net.bind(Net.java:430)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:225)
        at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:134)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:550)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1334)
        at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:506)
        at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:491)
        at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:973)
        at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:248)
        at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:356)
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:748)
2021-11-12 17:20:37,308 INFO util.ShutdownHookManager: Shutdown hook called
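
My understanding (which may well be wrong) is that in client mode the driver
runs on the submitting machine, so the bindAddress set in spark-defaults.conf
is a local address there, while in cluster mode the driver runs inside the
ApplicationMaster container on a worker node, which has no interface with that
address, hence the bind failure. Roughly (the host and jar below are
placeholders, not my actual values):

  # spark-defaults.conf (shared by all nodes)
  # placeholder: the address of the machine I submit from
  spark.driver.bindAddress   192.168.1.10

  # client mode: the driver starts on the submitting machine, where
  # 192.168.1.10 is a local interface, so 'sparkDriver' binds fine
  spark-submit --master yarn --deploy-mode client my-app.jar

  # cluster mode: the driver starts inside the AM container on a worker node,
  # where 192.168.1.10 is not a local interface, so the bind fails
  spark-submit --master yarn --deploy-mode cluster my-app.jar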


On Fri, Nov 12, 2021 at 4:43 PM, Prabhu Joseph <[email protected]>
wrote:

> Can you share the exception seen in the Spark application logs? Thanks.
>
> On Fri, Nov 12, 2021, 7:24 PM marc nicole <[email protected]> wrote:
>
>> Hi guys!
>>
>> If I specify bindAddress in spark-defaults.conf, then in YARN client mode
>> everything works fine and the ApplicationMaster finds the driver. But if I
>> submit in cluster mode, the ApplicationMaster, when hosted on a worker
>> node, won't find the driver and fails with a bind error.
>>
>>
>>
>> Any idea what the missing config is?
>>
>>
>> Note that I create the driver through a SparkSession object (not a
>> SparkContext).
>>
>> Hint: I was thinking that propagating the driver config to the workers,
>> e.g. through spark.yarn.dist.files, might solve this (see the sketch below).
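>>
>> Something along these lines (just a sketch to illustrate the idea; the
>> jar, class name and file path are placeholders, and I'm not sure this is
>> the right mechanism):
>>
>> spark-submit \
>>   --master yarn --deploy-mode cluster \
>>   --conf spark.yarn.dist.files=/etc/spark/conf/spark-defaults.conf \
>>   --class com.example.MyApp my-app.jar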
>>
>> Any suggestions here?
>>
>
