Hi Sandeep,

If you haven't already seen it, you might find it helpful to look at the documentation on writing a YARN application. Since MapReduce is implemented as a YARN application, it may help to study the simpler example in that documentation before attempting to understand MapReduce.
http://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html

As for task spawning, the responsibilities are split across several classes in the hadoop-mapreduce-client-app sub-project. Here are a few classes that I think are relevant:

RMContainerRequestor is responsible for obtaining the YARN containers for running MapReduce tasks. This is where you'll find the code that uses AllocateRequest to ask the ResourceManager for containers.

ContainerLauncherImpl is responsible for actually launching the containers. This is where you'll find the code that uses StartContainerRequest to ask a NodeManager to run a task.

TaskAttemptImpl is responsible for configuring exactly what gets run in one of the containers. This is where you'll find the code that uses ContainerLaunchContext to set up the exact commands to run for the task (either map or reduce).

MapReduceChildJVM is also significant as a helper class. TaskAttemptImpl calls this to do things like setting up the task's environment variables and the exact launch command.

Hope this helps!

Chris Nauroth
Hortonworks
http://hortonworks.com/

On Thu, Feb 20, 2014 at 9:47 PM, Sandeep Kandula <[email protected]> wrote:

> Hi,
> I am new to Hadoop and *I am interested in finding the code where the
> reduce and map tasks are spawn*.
> Towards this goal I have been going through the MapReduce, YARN source code
> for the past few days. I have started from the NodeManager class and found
> it launches containers on the corresponding node. MRAppMaster class is run
> by the launch_container.sh script downloaded on each of the nodes. I have
> observed that statemachines are used for the transition of a job, task and
> each of these transitions affect the state of the object.
> But I haven't really found a specific location in the code base where the
> map and reduce tasks are spawn.
> Any help in this regard is much appreciated.
>
> Thanks,
> Sandeep
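
To make the division of labour among the classes named in the reply concrete, here is a minimal plain-Java sketch of the hand-off. It is only an illustration and uses no Hadoop dependencies. TaskLaunchSketch, LaunchSpec, buildChildCommand, buildLaunchSpec, describeLaunch and the SKETCH_TASK_TYPE variable are hypothetical names invented for this example, and the host, port and attempt ID values are made up. The child main class org.apache.hadoop.mapred.YarnChild and its argument order are assumptions that should be checked against MapReduceChildJVM in the source tree you are reading.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the hand-off; none of these types are Hadoop classes.
class TaskLaunchSketch {

    /** Hypothetical stand-in for the data a ContainerLaunchContext carries. */
    static final class LaunchSpec {
        final Map<String, String> environment;
        final List<String> commands;

        LaunchSpec(Map<String, String> environment, List<String> commands) {
            this.environment = Collections.unmodifiableMap(environment);
            this.commands = Collections.unmodifiableList(commands);
        }
    }

    /** Helper role (compare MapReduceChildJVM): assemble the child JVM command line. */
    static String buildChildCommand(String amHost, int amPort, String attemptId, long jvmId) {
        return String.join(" ",
                "$JAVA_HOME/bin/java",
                "org.apache.hadoop.mapred.YarnChild", // assumed child main class
                amHost,                                // where the task reports back (assumed)
                Integer.toString(amPort),
                attemptId,
                Long.toString(jvmId));
    }

    /** Configuration role (compare TaskAttemptImpl): decide what runs in the container. */
    static LaunchSpec buildLaunchSpec(boolean isMap, String amHost, int amPort,
                                      String attemptId, long jvmId) {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("SKETCH_TASK_TYPE", isMap ? "map" : "reduce"); // hypothetical variable
        List<String> commands =
                Collections.singletonList(buildChildCommand(amHost, amPort, attemptId, jvmId));
        return new LaunchSpec(env, commands);
    }

    /** Launch role (compare ContainerLauncherImpl): here we only describe the request. */
    static String describeLaunch(String containerId, LaunchSpec spec) {
        return "start " + containerId + " env=" + spec.environment + " cmd=" + spec.commands;
    }

    public static void main(String[] args) {
        // Made-up values; a real application master gets these from the ResourceManager.
        LaunchSpec spec = buildLaunchSpec(true, "am-host.example", 40000,
                "attempt_0000000000000_0001_m_000000_0", 2L);
        System.out.println(describeLaunch("container_example_01", spec));
    }
}
```

In the real code base the three roles run as separate event-driven components inside the application master, so the direct method calls above stand in for events passed between them.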
