[ https://issues.apache.org/jira/browse/YARN-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663125#comment-13663125 ]
Carlo Curino commented on YARN-624:
-----------------------------------
I have two levels of comments: the first clarifies the intent of my earlier
messages, and the second matches Robert's description of a use case for ML
frameworks.
Intent:
[~vinodkv], I completely agree with you that we should be very deliberate in
choosing which use cases to support, and make sure we only add features that
target concrete and, I would argue, imminent use cases.
Reflecting on a conversation I had with Alejandro, I was trying to steer this
conversation toward the following form:
1) push for a broad discussion of the use cases for gang scheduling we know
of, so that we understand the entire complexity of the problem (hence the
comments around more advanced features such as OR of gangs)
2) let a set of core features emerge from the most concrete short-term needs we
have (the Storm example is a good place to start for this)
3) try to devise a protocol that supports the core features well, but that is
amenable to future expansions (inasmuch as we can guess our future needs based
on 1)
So in terms of concrete actions I am totally aligned with your request for
"groundedness", but I think it would really benefit us to also spell out some
of the future requirements, so that we have a chance to design for
extensibility (similarly to what you guys pushed for in YARN-45, which I
thought was really a good call).
ML Use Cases:
I asked Markus Weimer (ML/Systems guy in our group) to summarize why he sees
gang scheduling as key for ML frameworks (which I think are going to flock
to YARN in the coming months/years).
Here is his response:
"In many iterative algorithms, it is imperative to load all the data into the
main memory to minimize execution time. This is true for systems like Giraph,
Mahout and many others that will over time be on YARN. In order to satisfy
their memory requirement, they will block holding on to idle slots until YARN
has delivered all the resources needed. Exposing that pattern via gang
scheduling seems beneficial.
Furthermore, these systems are often communications intensive. Hence, they’d
benefit from a gang of containers that are collocated on the network. This is a
gang-wide property of the resource ask that cannot be captured easily without
gang scheduling. The alternatives (e.g. getting a container on each rack, then
expand from there to see which rack “wins”) are quite wasteful in comparison.
Lastly, scheduling with alternatives at the gang level would be helpful. If
e.g. the training data for a machine learning algorithm needs 128GB of RAM, any
combination of containers with that amount of RAM would satisfy the need.
However, preference is given to fewer machines as that reduces the
communication overhead."
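To make the last point concrete, here is a minimal sketch of "alternatives at
the gang level": satisfy a total memory need all-or-nothing, preferring as few
machines as possible. This is purely illustrative and not a YARN API; the
function name pick_gang, the node names, and the flat map of free memory per
node are assumptions made for the example, and rack locality is not modeled.

```python
from typing import Dict, Optional


def pick_gang(free_gb: Dict[str, int], need_gb: int) -> Optional[Dict[str, int]]:
    """Hypothetical helper: choose how much memory to take from each node.

    Returns {node: GB to allocate} using as few nodes as possible, or None
    if the whole need cannot be met (all-or-nothing, so nothing is held
    while waiting for the rest).
    """
    if need_gb <= 0:
        return {}

    # Ignore nodes with nothing to offer.
    usable = {node: free for node, free in free_gb.items() if free > 0}

    # Gang semantics: either the full request fits, or we allocate nothing.
    if sum(usable.values()) < need_gb:
        return None

    allocation: Dict[str, int] = {}
    remaining = need_gb
    # Taking the largest nodes first minimizes the number of machines,
    # since the k largest nodes offer the most memory of any k nodes.
    for node, free in sorted(usable.items(), key=lambda kv: (-kv[1], kv[0])):
        take = min(free, remaining)
        allocation[node] = take
        remaining -= take
        if remaining == 0:
            break
    return allocation
```

With free memory of 64, 96, 32 and 48 GB on four nodes, a 128 GB ask would be
served by the 96 GB node plus 32 GB from the 64 GB node, i.e. two machines
rather than three or four. A real scheduler would also have to weigh network
placement and fairness, which this sketch leaves out.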
While I appreciate that the level of urgency of what Markus describes is not
comparable to that of Storm, I see ML as an important future use case for YARN.
And gang scheduling seems to be one of those features that will determine
whether people build on YARN or on something like Mesos.
> Support gang scheduling in the AM RM protocol
> ---------------------------------------------
>
> Key: YARN-624
> URL: https://issues.apache.org/jira/browse/YARN-624
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: api, scheduler
> Affects Versions: 2.0.4-alpha
> Reporter: Sandy Ryza
> Assignee: Sandy Ryza
>
> Per discussion on YARN-392 and elsewhere, gang scheduling, in which a
> scheduler runs a set of tasks when they can all be run at the same time,
> would be a useful feature for YARN schedulers to support.
> Currently, AMs can approximate this by holding on to containers until they
> get all the ones they need. However, this lends itself to deadlocks when
> different AMs are waiting on the same containers.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira