Hello John,

To add to Chris' email:

Do take a look at http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
 - This is probably a bit out of date.
 - The actual source code of distributed-shell in the source tree would be the best guideline to follow after taking a brief look at the link above.

Compatibility
 - 0.23 and 2.0 are similar to a large extent, but there are differences. Not sure if it is possible to code for compatibility.
 - A lot of changes have gone in since 2.0.4 was released, to get the APIs into a relatively stable state.

Task output files
 - The files are served by an auxiliary service (the MapReduce shuffle service) running within the NodeManager.
 - The NM needs to be configured to tell it which aux services to start up.
 - The protocols support some level of information passing via the service data constructs.
 - The service is notified when an application completes, so that it can be used to delete data if needed.

-- Hitesh

On May 23, 2013, at 3:45 PM, John Lilley wrote:

> I am getting started with development of a custom ApplicationMaster and I
> didn't think that the user@ list was quite the right place for it. Apologies
> if this list isn't the right place either. Some of my questions are really
> newbie, like:
>
> * Is there an FAQ for non-MR YARN development?
>
> * Is there an FAQ for configuring/building/running Hadoop from
> source, preferably in Eclipse?
>
> * What is the recommended configuration/environment for development
> of a YARN app? I would like to use Eclipse under Windows if that even makes
> any sense.
>
> * Would you start with a Hadoop release or build from version control?
>
> * Is it possible to code for compatibility between 2.0 and 0.23?
>
> * Is there an ApplicationMaster example that can be used as a
> starting point?
>
> I also have some more in-depth questions:
>
> * When a MapReduce task creates its output files and makes them
> available over HTTP, is it the NodeManager that serves them up? If my YARN
> task wants to do something similar, how does it tell the NodeManager? How
> are the files removed later?
>
> * Is it possible to install objects or services that run as peers of
> the NodeManager as opposed to tasks? Are there any recommended per-node
> patterns as opposed to per-task patterns?
>
> Thanks
> John
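
P.S. On the point that the NM needs to be told which aux services to start: a minimal sketch of what the yarn-site.xml entries might look like for the stock MapReduce shuffle service. The service name here is an assumption tied to the release. The 2.0.x alpha line uses mapreduce.shuffle, and later 2.x releases rename it to mapreduce_shuffle, so check yarn-default.xml for the version you build against. A custom service would presumably be registered the same way, under its own name and class.

```xml
<!-- yarn-site.xml fragment (sketch; names assume a 2.0.x alpha release) -->
<configuration>
  <!-- Comma-separated list of aux services the NodeManager should start -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
  <!-- Implementation class for the service named above -->
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
```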