[
https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14243824#comment-14243824
]
Varun Saxena commented on YARN-2938:
------------------------------------
bq. Are the newly added entries in {{findbugs-exclude.xml}} fixable? Would you
mind sharing the detailed findbugs reports about these issues?
Sure [~zjshen]. The details are as follows:
- *VO_VOLATILE_INCREMENT* in
{{org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode}}
{{numContainers}} in {{SchedulerNode}} is volatile, and Findbugs complains
because it is incremented and decremented. Both operations happen only inside
synchronized methods, though, so this should not be an issue.
{code}
public synchronized void allocateContainer(RMContainer rmContainer) {
  Container container = rmContainer.getContainer();
  deductAvailableResource(container.getResource());
  ++numContainers;
  ....
}
...
private synchronized void updateResource(Container container) {
  addAvailableResource(container.getResource());
  --numContainers;
}
{code}
- *VO_VOLATILE_INCREMENT* in
{{org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue}}
Exactly the same as the issue above. {{numContainers}} is incremented and
decremented only inside synchronized methods.
{code}
synchronized void allocateResource(Resource clusterResource,
    Resource resource, Set<String> nodeLabels) {
  Resources.addTo(usedResources, resource);
  ......
  ++numContainers;
  CSQueueUtils.updateQueueStatistics(resourceCalculator, this, getParent(),
      clusterResource, minimumAllocation);
}
...
protected synchronized void releaseResource(Resource clusterResource,
    Resource resource, Set<String> nodeLabels) {
  // Update queue metrics
  Resources.subtractFrom(usedResources, resource);
  ....
  CSQueueUtils.updateQueueStatistics(resourceCalculator, this, getParent(),
      clusterResource, minimumAllocation);
  --numContainers;
}
{code}
- *VO_VOLATILE_INCREMENT* in
{{org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue}}
Exactly the same as the issue above. {{numApplications}} is incremented and
decremented only inside synchronized methods.
{code}
private synchronized void addApplication(ApplicationId applicationId,
    String user) {
  ++numApplications;
  LOG.info("Application added -" +
      " appId: " + applicationId +
      " user: " + user +
      " leaf-queue of parent: " + getQueueName() +
      " #applications: " + getNumApplications());
}
......
private synchronized void removeApplication(ApplicationId applicationId,
    String user) {
  --numApplications;
  LOG.info("Application removed -" +
      " appId: " + applicationId +
      " user: " + user +
      " leaf-queue of parent: " + getQueueName() +
      " #applications: " + getNumApplications());
}
{code}
- For the three warnings above, refer to [False Positive VO_VOLATILE_INCREMENT when
synchronized|http://sourceforge.net/p/findbugs/bugs/1032/]
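The pattern behind these three warnings can be reduced to a minimal sketch. The class {{VolatileCounterSketch}} and its methods are hypothetical stand-ins, not Hadoop code, and the unsynchronized getter is an assumption about how such a counter is typically read. Because every write happens while holding the object's monitor, the increment cannot interleave with another write, and {{volatile}} only serves to make the latest value visible to readers that do not take the lock.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a class such as SchedulerNode; not Hadoop code.
public class VolatileCounterSketch {
  // volatile: readers that skip the lock still see the latest written value.
  private volatile int numContainers;

  // Every write holds the monitor, so the read-modify-write of ++ and --
  // cannot interleave with another writer.
  public synchronized void allocate() {
    ++numContainers;
  }

  public synchronized void release() {
    --numContainers;
  }

  // Assumed lock-free read path; relies on volatile for visibility.
  public int getNumContainers() {
    return numContainers;
  }

  // Starts `threads` writers that each call allocate() `perThread` times
  // and returns the final counter value.
  public static int runDemo(int threads, int perThread) {
    VolatileCounterSketch node = new VolatileCounterSketch();
    List<Thread> workers = new ArrayList<>();
    for (int i = 0; i < threads; i++) {
      Thread t = new Thread(() -> {
        for (int j = 0; j < perThread; j++) {
          node.allocate();
        }
      });
      workers.add(t);
      t.start();
    }
    for (Thread t : workers) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return node.getNumContainers();
  }

  public static void main(String[] args) {
    // No updates are lost because all writers serialize on the monitor.
    System.out.println(runDemo(8, 1000)); // prints 8000
  }
}
```

If the {{synchronized}} keyword were dropped from {{allocate()}}, the same demo could lose updates, which is the situation the detector is meant to catch.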
- *RCN_REDUNDANT_NULLCHECK_OF_NONNULL_VALUE* in
{{org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler}}.
Findbugs complains about the null check on {{containerReports}} in the code
below.
{code}
public synchronized void recoverContainersOnNode(
    List<NMContainerStatus> containerReports, RMNode nm) {
  if (!rmContext.isWorkPreservingRecoveryEnabled()
      || containerReports == null
      || (containerReports != null && containerReports.isEmpty())) {
    return;
  }
  ....
}
{code}
Following the code flow, this method is called on the *NODE_ADDED* event in
{{CapacityScheduler}}, {{FifoScheduler}} and {{FairScheduler}}:
{code:title=CapacityScheduler.java|borderStyle=solid}
public void handle(SchedulerEvent event) {
  switch(event.getType()) {
  case NODE_ADDED:
  {
    NodeAddedSchedulerEvent nodeAddedEvent = (NodeAddedSchedulerEvent)event;
    addNode(nodeAddedEvent.getAddedRMNode());
    recoverContainersOnNode(nodeAddedEvent.getContainerReports(),
        nodeAddedEvent.getAddedRMNode());
  }
  ...
}
{code}
This {{NodeAddedSchedulerEvent}} is created in the code below, where
{{containers}} is passed in and becomes
{{NodeAddedSchedulerEvent#containerReports}}. As the code shows, {{containers}}
can actually be null in some cases.
{code:title=RMNodeImpl.java|borderStyle=solid}
public static class AddNodeTransition implements
    SingleArcTransition<RMNodeImpl, RMNodeEvent> {
  @Override
  public void transition(RMNodeImpl rmNode, RMNodeEvent event) {
    // Inform the scheduler
    RMNodeStartedEvent startEvent = (RMNodeStartedEvent) event;
    List<NMContainerStatus> containers = null;
    String host = rmNode.nodeId.getHost();
    if (rmNode.context.getInactiveRMNodes().containsKey(host)) {
      ....
    } else {
      ClusterMetrics.getMetrics().incrNumActiveNodes();
      containers = startEvent.getNMContainerStatuses();
      if (containers != null && !containers.isEmpty()) {
        for (NMContainerStatus container : containers) {
          if (container.getContainerState() == ContainerState.RUNNING) {
            rmNode.launchedContainers.add(container.getContainerId());
          }
        }
      }
    }
    ....
    rmNode.context.getDispatcher().getEventHandler()
        .handle(new NodeAddedSchedulerEvent(rmNode, containers));
    ....
  }
}
{code}
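As a rough illustration of the guard in {{recoverContainersOnNode}}, the sketch below compares the condition as written with a variant that drops the inner {{!= null}} test. This is only one possible reading of what the detector objects to, since the report text is not quoted here. {{NullCheckSketch}} and both method names are hypothetical, and {{List<String>}} stands in for {{List<NMContainerStatus>}}. The outer {{== null}} test stays in both variants, because the list can be null as shown above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical reduction of the guard; not Hadoop code.
public class NullCheckSketch {
  // Shape of the condition quoted above: the third clause re-tests for null
  // even though the second clause has already short-circuited on null.
  public static boolean skipAsWritten(boolean recoveryEnabled,
      List<String> reports) {
    return !recoveryEnabled
        || reports == null
        || (reports != null && reports.isEmpty());
  }

  // Same guard without the inner re-test; still safe for a null list because
  // || stops evaluating as soon as reports == null is true.
  public static boolean skipSimplified(boolean recoveryEnabled,
      List<String> reports) {
    return !recoveryEnabled
        || reports == null
        || reports.isEmpty();
  }

  // Checks that both variants agree for null, empty and non-empty lists.
  public static boolean allCasesAgree() {
    List<List<String>> inputs = Arrays.asList(
        null,
        Collections.<String>emptyList(),
        Collections.singletonList("container_1"));
    for (boolean enabled : new boolean[] {true, false}) {
      for (List<String> reports : inputs) {
        if (skipAsWritten(enabled, reports)
            != skipSimplified(enabled, reports)) {
          return false;
        }
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(allCasesAgree()); // prints true
  }
}
```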
> Fix new findbugs warnings in hadoop-yarn-resourcemanager and
> hadoop-yarn-applicationhistoryservice
> --------------------------------------------------------------------------------------------------
>
> Key: YARN-2938
> URL: https://issues.apache.org/jira/browse/YARN-2938
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Varun Saxena
> Assignee: Varun Saxena
> Fix For: 2.7.0
>
> Attachments: YARN-2938.001.patch, YARN-2938.002.patch
>
>