[ 
https://issues.apache.org/jira/browse/YARN-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15448580#comment-15448580
 ] 

Varun Vasudev commented on YARN-5576:
-------------------------------------

Thanks for the patch [~jianhe]! 
1) Can you please look at the javac errors?

2) 
{code}
+    if (!container.getContainerState().equals(
+        org.apache.hadoop.yarn.server.nodemanager.
+            containermanager.container.ContainerState.RUNNING)) {
+      throw new YarnException(
+          containerId + " is at " + container.getContainerState()
+              + " state. Not able to localize new resources.");
+    }
{code}

Can we move this logic into the ContainerImpl class? Something line {code} 
if(!container.canLocalizeResources()) { {code}
It puts all the logic for this stuff in one place. It looks like similar logic 
is required in the ResourceLocalization class?

{code}
+    EnumSet<ContainerState> set =
+        EnumSet.of(ContainerState.LOCALIZING, ContainerState.RUNNING);
+    if (!set.contains(c.getContainerState())) {
+      LOG.warn(c.getContainerId() + " is at " + c.getContainerState()
+          + " state, do not localize resources.");
+      return;
+    }
{code}

3)
{code}
@@ -844,22 +772,21 @@ public ContainerState transition(ContainerImpl container,
       ContainerResourceLocalizedEvent rsrcEvent = 
(ContainerResourceLocalizedEvent) event;
{code}
{code}
-      container.localizedResources.put(location, sys);
{code}
I couldn't figure out where this logic was moved to - can you please explain?

4)
{code}
+          if (new File(linkFile).exists()) {
+            LOG.info("Symlink file already exists: " + linkFile);
+          } else {
{code}
We should throw an error here or at least flag this as a failed localization?

5)
{code}
+      container.diagnostics.append(failedEvent.getDiagnosticMessage());
{code}
We need to check the diagnostics string size - if a AM sends us too many failed 
requests, the diagnostics string will just balloon in size.

6)
{code}
+            if (localizer != null && localizer.killContainerLocalizer.get()) {
+              LOG.info("New " + event.getType() + " localize request for "
+                  + locId + ", remove old private localizer.");
+              cleanupPrivLocalizers(locId);
+              localizer = null;
+            }
{code}
Can you explain the logic for this? I couldn't figure out why we need this.

7)
{code}
+    System.out.println("==================================");
+    System.out.println(appDir.getAbsolutePath());
+    System.out.println(appSysDir.getAbsolutePath());
+    System.out.println(containerDir.getAbsolutePath());
+    System.out.println(containerSysDir.getAbsolutePath());
+    System.out.println(targetFile.getAbsolutePath());
+    System.out.println("==================================");
{code}
Do we need these lines in the test code? Maybe move them to LOG.debug?

8)
Can you also add a check in testLocalingResourceWhileContainerRunning to make 
sure we can’t localize for non-running containers?

> Core change to localize resource while container is running
> -----------------------------------------------------------
>
>                 Key: YARN-5576
>                 URL: https://issues.apache.org/jira/browse/YARN-5576
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Jian He
>            Assignee: Jian He
>         Attachments: YARN-5576.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to