Hello,

I am running Samza on YARN and see 2 entries processed for every incoming
event. The events are kept on a raw Kafka topic, and the Samza YARN job
processes them into a processed queue.
The issue is that 2 entries show up in the processed queue for every raw
message.

Some observations:
I see that there are 2 running applications and 9998 applications pending.
As far as I understand, the 2 running applications are what cause the
duplication. When I kill a running app, another app from the pending queue
takes its place.
I have looked at the YARN Capacity Scheduler documentation -
https://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html
and tried changing the scheduler config to set the maximum number of
applications to 2, but the change does not seem to take effect.
What is the best way to handle this scenario? I want to kill the redundant
app and ensure that only 1 runs.

Appreciate any inputs.

- Shekar
