Sihai Ke created YARN-9550:
------------------------------
Summary: Suspect wrong way to calculate container utilized vcore.
Key: YARN-9550
URL: https://issues.apache.org/jira/browse/YARN-9550
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.9.1
Reporter: Sihai Ke
In Hadoop 2.9.1, class *ContainersMonitorImpl*, line 664, I suspect that
milliVcoresUsed is calculated the wrong way. The code is below.
{code:java}
ResourceCalculatorProcessTree pTree = ptInfo.getProcessTree();
pTree.updateProcessTree(); // update process-tree
if (!pTree.isValidData()) {
  // If we cannot get the data for one container, we ignore it all
  LOG.error("Cannot get the data for " + pId);
  trackedContainersUtilization = null;
  continue;
}
long currentVmemUsage = pTree.getVirtualMemorySize();
long currentPmemUsage = pTree.getRssMemorySize();
// if machine has 6 cores and 3 are used,
// cpuUsagePercentPerCore should be 300% and
// cpuUsageTotalCoresPercentage should be 50%
float cpuUsagePercentPerCore = pTree.getCpuUsagePercent();
if (cpuUsagePercentPerCore < 0) {
  // CPU usage is not available likely because the container just
  // started. Let us skip this turn and consider this container
  // in the next iteration.
  LOG.info("Skipping monitoring container " + containerId
      + " since CPU usage is not yet available.");
  continue;
}
float cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore /
    resourceCalculatorPlugin.getNumProcessors();
// Multiply by 1000 to avoid losing data when converting to int
int milliVcoresUsed = (int) (cpuUsageTotalCoresPercentage * 1000
    * maxVCoresAllottedForContainers /nodeCpuPercentageForYARN);
// milliPcoresUsed = (int) (cpuUsagePercentPerCore * 1000 / 100;
// As cpuUsagePercentagePerCore use 100 to represent 1 single core.
int milliPcoresUsed = (int) (cpuUsagePercentPerCore * 10);
// as processes begin with an age 1, we want to see if there
// are processes more than 1 iteration old.
vcoresUsageByAllContainers += milliVcoresUsed;
pcoresByAllContainers += milliPcoresUsed;
{code}
I think
{code:java}
int milliVcoresUsed = (int) (cpuUsageTotalCoresPercentage * 1000 *
maxVCoresAllottedForContainers /nodeCpuPercentageForYARN);{code}
should be
{code:java}
int milliVcoresUsed = (int) (cpuUsageTotalCoresPercentage * 1000 *
    maxVCoresAllottedForContainers);
{code}
I don't think the division by nodeCpuPercentageForYARN is needed. [~kasha], it
looks like you added this feature; could you take a look, or correct me if I am
wrong?
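
To make the difference concrete, here is a small standalone sketch (not Hadoop code) that evaluates both expressions for the 6-core example from the code comment. The class and method names are hypothetical, the parameter types are assumed, and the values 8 for maxVCoresAllottedForContainers and 100 for nodeCpuPercentageForYARN are assumptions chosen only for illustration.

```java
public class MilliVcoresSketch {

  // Expression as it appears in the quoted 2.9.1 snippet.
  // Parameter types are assumptions made for this sketch.
  static int currentFormula(float cpuUsageTotalCoresPercentage,
      int maxVCoresAllottedForContainers, int nodeCpuPercentageForYARN) {
    return (int) (cpuUsageTotalCoresPercentage * 1000
        * maxVCoresAllottedForContainers / nodeCpuPercentageForYARN);
  }

  // Expression proposed in this issue: same as above, without the division.
  static int proposedFormula(float cpuUsageTotalCoresPercentage,
      int maxVCoresAllottedForContainers) {
    return (int) (cpuUsageTotalCoresPercentage * 1000
        * maxVCoresAllottedForContainers);
  }

  public static void main(String[] args) {
    // From the code comment: 6 cores, 3 fully used by the container.
    float cpuUsagePercentPerCore = 300f;
    int numProcessors = 6;
    float cpuUsageTotalCoresPercentage =
        cpuUsagePercentPerCore / numProcessors; // 50.0

    // Assumed node settings, for illustration only.
    int maxVCoresAllottedForContainers = 8;
    int nodeCpuPercentageForYARN = 100;

    System.out.println("current  milliVcoresUsed = " + currentFormula(
        cpuUsageTotalCoresPercentage, maxVCoresAllottedForContainers,
        nodeCpuPercentageForYARN));
    System.out.println("proposed milliVcoresUsed = " + proposedFormula(
        cpuUsageTotalCoresPercentage, maxVCoresAllottedForContainers));
  }
}
```

With these assumed inputs the two expressions differ by exactly a factor of nodeCpuPercentageForYARN, so which one is correct depends on whether cpuUsageTotalCoresPercentage is meant as a 0-100 percentage or as a 0-1 fraction.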