[
https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267111#comment-17267111
]
zhuqi commented on YARN-10532:
------------------------------
The latest patch double-checks the requirement quoted below:
"An additional requirement we should keep in mind:
Scenario A:
{code:java}
- At time T0, policy signals scheduler to delete queue A (an auto created
queue).
- Before the signal arrives to scheduler, an app submitted to scheduler (T1).
T1 > T0
- When at T2 (T2 > T1), the signal arrived at scheduler, scheduler should avoid
removing the queue A because now it is used.{code}
Scenario B:
{code:java}
- At time T0, policy signals scheduler to delete queue A (an auto created
queue).
- At T1 (T1 > T0), scheduler got the signal and deleted the queue.
- At T2 (T2 > T1), an app submitted to scheduler.
Scheduler should immediately recreate the queue; in other words, deleting a
dynamic queue should NEVER fail a submitted application.{code}
"
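Neither scenario will cause a problem with this patch.
Scenario A is handled by:
A double check before deletion. The caller passes in the last submitted time it
observed, and just before the removal the queue's current last submitted time is
read again and compared with it. All of this happens under the queue write lock.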
{code:java}
// Double check that lastSubmittedTime has not changed since the queue
// expired, in case a new app has been submitted in the meantime.
if (queue instanceof LeafQueue &&
    ((LeafQueue) queue).isDynamicQueue()) {
  LeafQueue underDeleted = (LeafQueue) queue;
  if (underDeleted.getLastSubmittedTimestamp() != lastSubmittedTime) {
    throw new SchedulerDynamicEditException("This should not happen, " +
        "trying to remove queue= " + childQueuePath
        + ", however the queue has new submitted apps.");
  }
} else {
  throw new SchedulerDynamicEditException(
      "This should not happen, can't remove queue= " + childQueuePath
      + " is not a leafQueue or not a dynamic queue.");
}
// Now we can do remove and update
this.childQueues.remove(queue);
this.scheduler.getCapacitySchedulerQueueManager()
    .removeQueue(queue.getQueuePath());
{code}
The submission signal also updates this timestamp under the write lock:
{code:java}
@Override
public void submitApplication(ApplicationId applicationId, String userName,
    String queue) throws AccessControlException {
  // Careful! Locking order is important!
  validateSubmitApplication(applicationId, userName, queue);
  // Signal to queue submit time in dynamic queue
  if (this.isDynamicQueue()) {
    signalToSubmitToQueue();
  }
  // Inform the parent queue
  try {
    getParent().submitApplication(applicationId, userName, queue);
  } catch (AccessControlException ace) {
    LOG.info("Failed to submit application to parent-queue: " +
        getParent().getQueuePath(), ace);
    throw ace;
  }
}

// "Tab" the queue, so this queue won't be removed because of idle timeout.
public void signalToSubmitToQueue() {
  writeLock.lock();
  try {
    this.lastSubmittedTimestamp = System.currentTimeMillis();
  } finally {
    writeLock.unlock();
  }
}
{code}
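To make the Scenario A argument easier to follow, here is a minimal standalone sketch of the same compare-before-remove idea. It is not the patch code: IdleQueueRegistry, removeIfUnchanged and the single lock guarding everything are hypothetical names and simplifications, timestamps are passed in explicitly so the run is deterministic, and the sketch returns false where the patch throws SchedulerDynamicEditException.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the scheduler's queue bookkeeping; not Hadoop code.
class IdleQueueRegistry {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  // queue path -> timestamp of the most recent submission
  private final Map<String, Long> lastSubmitted = new HashMap<>();

  // Records a submission; plays the role of signalToSubmitToQueue().
  void submit(String queuePath, long now) {
    lock.writeLock().lock();
    try {
      lastSubmitted.put(queuePath, now);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // What the deletion policy reads at T0; returns -1 if the queue is absent.
  long lastSubmittedTime(String queuePath) {
    lock.readLock().lock();
    try {
      return lastSubmitted.getOrDefault(queuePath, -1L);
    } finally {
      lock.readLock().unlock();
    }
  }

  // Removes the queue only if no submission arrived since `observed` was read.
  boolean removeIfUnchanged(String queuePath, long observed) {
    lock.writeLock().lock();
    try {
      Long current = lastSubmitted.get(queuePath);
      if (current == null || current != observed) {
        return false; // absent, or used since the policy looked: keep it
      }
      lastSubmitted.remove(queuePath);
      return true;
    } finally {
      lock.writeLock().unlock();
    }
  }

  public static void main(String[] args) {
    IdleQueueRegistry registry = new IdleQueueRegistry();
    registry.submit("root.a", 100L);
    long seenAtT0 = registry.lastSubmittedTime("root.a"); // policy decides to delete
    registry.submit("root.a", 200L);                      // T1: an app arrives
    System.out.println(registry.removeIfUnchanged("root.a", seenAtT0)); // false
    System.out.println(registry.removeIfUnchanged("root.a", 200L));     // true
  }
}
```

Because the comparison and the removal sit in the same write-locked section as the timestamp update, a submission either lands before the check (and the removal is refused) or after the removal has completed (which is Scenario B).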
Scenario B is handled in addApplication and addApplicationOnRecovery:
{code:java}
// If the queue has been deleted because it expired:
// - At time T0, policy signals scheduler to delete queue A (an auto created
//   queue).
// - At T1 (T1 > T0), scheduler got the signal and deleted the queue.
// - At T2 (T2 > T1), an app submitted to scheduler.
//
// Scheduler should immediately recreate the queue; in other words,
// deleting a dynamic queue should NEVER fail a submitted application.
// In this case the queue may be null later,
// so take the queue write lock here.
try {
  ((AbstractCSQueue) queue).writeLock.lock();
}...{code}
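The Scenario B requirement (a submission after deletion recreates the queue instead of failing) can be illustrated with a small standalone sketch. This is not how the patch implements it: AutoQueueManager and its methods are hypothetical, and a plain map with a per-queue application count stands in for the real queue objects.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a manager of auto-created queues; not Hadoop code.
class AutoQueueManager {
  private final ReentrantLock lock = new ReentrantLock();
  // queue path -> number of applications submitted since the queue was created
  private final Map<String, Integer> apps = new HashMap<>();
  private int creations = 0;

  // Deletes a queue, e.g. after the idle-timeout policy fired.
  void deleteQueue(String queuePath) {
    lock.lock();
    try {
      apps.remove(queuePath);
    } finally {
      lock.unlock();
    }
  }

  // Submits an app; a missing queue is recreated under the same lock, so the
  // submission never fails because of an earlier deletion.
  int submit(String queuePath) {
    lock.lock();
    try {
      if (!apps.containsKey(queuePath)) {
        apps.put(queuePath, 0);
        creations++;
      }
      return apps.merge(queuePath, 1, Integer::sum);
    } finally {
      lock.unlock();
    }
  }

  boolean exists(String queuePath) {
    lock.lock();
    try {
      return apps.containsKey(queuePath);
    } finally {
      lock.unlock();
    }
  }

  int creations() {
    lock.lock();
    try {
      return creations;
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) {
    AutoQueueManager manager = new AutoQueueManager();
    manager.submit("root.a");      // queue auto-created
    manager.deleteQueue("root.a"); // T1: deleted as idle
    manager.submit("root.a");      // T2: recreated, submission succeeds
    System.out.println(manager.creations()); // 2
  }
}
```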
> Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is
> not being used
> --------------------------------------------------------------------------------------------
>
> Key: YARN-10532
> URL: https://issues.apache.org/jira/browse/YARN-10532
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Wangda Tan
> Assignee: zhuqi
> Priority: Major
> Attachments: YARN-10532.001.patch, YARN-10532.002.patch,
> YARN-10532.003.patch
>
>
> It's better if we can delete auto-created queues when they are not in use for
> a period of time (like 5 mins). It will be helpful when we have a large
> number of auto-created queues (e.g. from 500 users), but only a small subset
> of queues are actively used.