[
https://issues.apache.org/jira/browse/YARN-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257541#comment-15257541
]
Rohith Sharma K S commented on YARN-4478:
-----------------------------------------
Thanks [~vinodkv] for your attention on this JIRA. Yes, I will close this
umbrella JIRA since it has been open for a long time, and I think test case
runs look better now than when we started tracking.
Recently I had a discussion with [~sunilg] about the root causes of various
test failures and made some observations from this umbrella JIRA. Some of the
test case failures that are said to be *random* seem to be fixable. I will
share some of the observations made while fixing/reviewing/committing test
cases in this umbrella JIRA.
Types of failures seen
# YARN event model - AsyncDispatcher : Most of the random test failures fall
into this category. For example, asserting on the cluster resource from the
scheduler immediately after registering a node with the RM.
{code}
rm.start();
rm.registerNode("h1:1234", 5120);
// Flaky: the node-added event may not have reached the scheduler yet.
assertEquals(5120, rm.getResourceScheduler().getClusterResource().getMemory());
{code}
Contributors often forget when writing test cases that YARN events are
asynchronous. Many random failures are caused by delays in processing these
events, even though the same tests seem to run fine locally in Eclipse.
# System settings : We have seen a few test cases fail regularly, mainly
because of DNS configuration. See HADOOP-12687 and INFRA-11150. I am not sure
how this should be resolved, since a code change is not preferred either
because it breaks RFC standards.
# As test cases were made to run in parallel, we ran into "Address bind
exception" issues.
# MockRM APIs : Over time, many APIs have been added to MockRM to submit a job
and launch the AM. A few of these methods wait internally for certain events
to happen, while others require the test case to wait for those events
explicitly (contributors have to take care of this). When writing test cases,
contributors should be aware of what each MockRM API does internally. We have
seen a mix of these APIs cause random failures.
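As a minimal sketch of the AsyncDispatcher point above, a test can poll for the
expected state instead of asserting right after the call that triggers the
event. This is not MockRM code: the class name, the waitFor helper, and the
single-threaded executor standing in for the dispatcher are all hypothetical
and only illustrate the pattern.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class AsyncAssertSketch {

  /** Hypothetical helper: polls the condition until it holds or the timeout expires. */
  static void waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs)
      throws InterruptedException, TimeoutException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!condition.getAsBoolean()) {
      if (System.currentTimeMillis() > deadline) {
        throw new TimeoutException("condition not met within " + timeoutMs + " ms");
      }
      Thread.sleep(intervalMs);
    }
  }

  /** Simulates registering a node whose memory is added by a separate dispatcher thread. */
  static int registerNodeAndReadClusterMemory() throws Exception {
    AtomicInteger clusterMemoryMb = new AtomicInteger(0);
    // Assumption: a single-threaded executor stands in for the async dispatcher.
    ExecutorService dispatcher = Executors.newSingleThreadExecutor();
    try {
      // "Register" the node: the update is queued and applied after a delay.
      dispatcher.execute(() -> {
        try {
          Thread.sleep(50);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        clusterMemoryMb.addAndGet(5120);
      });
      // Asserting at this point could still observe 0, so wait for the
      // event to be processed before reading the value.
      waitFor(() -> clusterMemoryMb.get() == 5120, 10, 5000);
      return clusterMemoryMb.get();
    } finally {
      dispatcher.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(registerNodeAndReadClusterMemory());
  }
}
```

In real YARN tests the same role would likely be played by draining the
dispatcher (for example DrainDispatcher#await) or by a polling utility such as
GenericTestUtils#waitFor, rather than by a hand-written helper like this one.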
To handle the open issues under this umbrella, I will detach them and convert
each into a Test bug. Let us handle these separately.
> [Umbrella] : Track all the Test failures in YARN
> ------------------------------------------------
>
> Key: YARN-4478
> URL: https://issues.apache.org/jira/browse/YARN-4478
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Reporter: Rohith Sharma K S
>
> Recently many test cases have been failing, either because of timeouts or
> because of the impact of new bug fixes. Many test failure JIRAs have been
> raised and are in progress.
> This is to track all the test failure JIRAs.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)