[ 
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719003#comment-13719003
 ] 

Bikas Saha commented on YARN-744:
---------------------------------

The patch needs a rebase. It should probably reuse the recently added 
UnknownApplication exception.
Can we pick a better name?
{code}AllocateResponseWrapper res{code}

If the wrapper exists then how can the lastResponse be null?
{code}
+    synchronized (res) {
+      AllocateResponse lastResponse = res.getAllocateResponse();
+      if (lastResponse == null) {
+        LOG.error("AppAttemptId doesnt exist in cache " + appAttemptId);
+        return resync;
+      }
{code}
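
For illustration only, here is a minimal sketch of the locking shape I have in mind. It assumes a holder object that is created once per app attempt and never replaced in the map, so every thread for that attempt synchronizes on the same object. All names here (ResponseCacheSketch, ResponseHolder, the String attempt id, the int response id) are hypothetical stand-ins and are not taken from the patch or from the YARN API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the allocate path; not the actual YARN classes.
class ResponseCacheSketch {

  // One holder per app attempt. The holder itself is never swapped out,
  // only the response id stored inside it changes.
  static final class ResponseHolder {
    private int lastResponseId; // guarded by synchronized (holder)
  }

  private final ConcurrentMap<String, ResponseHolder> cache =
      new ConcurrentHashMap<>();

  void register(String attemptId) {
    cache.putIfAbsent(attemptId, new ResponseHolder());
  }

  void unregister(String attemptId) {
    cache.remove(attemptId);
  }

  /**
   * Returns -1 for an unknown attempt (the "resync" case), the unchanged
   * last response id for a duplicate or stale request, and the incremented
   * id when the request is processed.
   */
  int allocate(String attemptId, int requestResponseId) {
    ResponseHolder holder = cache.get(attemptId);
    if (holder == null) {
      return -1; // attempt not registered
    }
    synchronized (holder) { // same lock object for every caller of this attempt
      if (requestResponseId != holder.lastResponseId) {
        return holder.lastResponseId; // duplicate: do not allocate again
      }
      holder.lastResponseId++; // "process" the request exactly once
      return holder.lastResponseId;
    }
  }
}
```

In this shape the only null check is on the holder lookup; once a holder is found, the state inside it is always initialized.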

I don't fully understand what the test is doing. Does it fail without the 
change?

On the test code itself:
Is there another way to make the test work other than making a method 
protected purely so that the test can override it? 
Can you make it package-private instead of protected? protected suggests 
that the method is part of the contract for derived classes.
Can we avoid the long sleep() calls and use wait-notify or countdown latches so 
that we don't waste time in the test? They also make the test less flaky, 
because it no longer relies on timing to hit the race.
{code}
+          @Override
+          protected void authorizeRequest(ApplicationAttemptId appAttemptID)
+              throws YarnException {
+            int interval = 10;
+            count.incrementAndGet();
+            while (count.get() == 1 && interval-- > 0 ) {
+              try {
+                Thread.sleep(1000);
+              } catch (InterruptedException e) {}
+            }
+            Assert.assertTrue(count.get() > 1);
+          }
+        };
{code}
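
As a rough sketch of the latch idea, assuming the goal of the override is to hold the first caller until a second caller has arrived: each caller counts down a shared CountDownLatch and then waits on it with a timeout, so the test proceeds as soon as both are present instead of polling with sleep(). The class and method names below are hypothetical and not part of the patch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; the hook stands in for the overridden test method.
class LatchInsteadOfSleepSketch {

  // Runs two threads through the hook and reports whether both observed
  // the other's arrival within the timeout.
  static boolean bothCallersMeet(final long timeoutMillis) {
    final CountDownLatch bothArrived = new CountDownLatch(2);
    final AtomicInteger sawOther = new AtomicInteger();

    Runnable hook = () -> {
      bothArrived.countDown(); // announce this caller's arrival
      try {
        // Returns as soon as the second caller arrives; no fixed sleep.
        if (bothArrived.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
          sawOther.incrementAndGet();
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
      }
    };

    Thread first = new Thread(hook);
    Thread second = new Thread(hook);
    first.start();
    second.start();
    try {
      first.join();
      second.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return sawOther.get() == 2;
  }
}
```

The timeout only bounds the failure case; when both threads show up, the wait ends immediately.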

                
> Race condition in ApplicationMasterService.allocate .. It might process same 
> allocate request twice resulting in additional containers getting allocated.
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-744
>                 URL: https://issues.apache.org/jira/browse/YARN-744
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Omkar Vinit Joshi
>            Priority: Minor
>         Attachments: MAPREDUCE-3899-branch-0.23.patch, 
> YARN-744-20130711.1.patch, YARN-744-20130715.1.patch, YARN-744.patch
>
>
> Looks like the locking in this method is broken. It takes a lock on the 
> lastResponse object and then puts a new lastResponse object into the map. At 
> this point a new thread entering this function will get the new lastResponse 
> object and will be able to take its lock and enter the critical section. 
> Presumably we want to limit processing to one response per app attempt at a 
> time. So the lock could be taken on the ApplicationAttemptId key of the 
> response map object.

