Bikas,

Thank you for your response. I've submitted a JIRA ticket with steps to reproduce (https://issues.apache.org/jira/browse/YARN-466). Unfortunately, I don't have the bandwidth to work on a patch.

Cheers,

Roger

On Mon, Mar 11, 2013 at 10:58 AM, Bikas Saha <[email protected]> wrote:

> Thanks for the investigation. Could you please open a JIRA to track this?
> It would be great if you could add repro steps. Of course, a patch to fix
> it is most welcome.
>
> Bikas
>
> -----Original Message-----
> From: Roger Hoover [mailto:[email protected]]
> Sent: Friday, March 08, 2013 5:12 PM
> To: [email protected]
> Subject: ResourceManager not matching host names
>
> Hi,
>
> I'm having an issue with matching hostnames in the scheduler for slave
> nodes whose hostnames do not match their fully qualified domain names.
>
> The problem is that the ResourceManager learns the hostname of the node
> when the NodeManager registers itself, and it seems the NodeManager is
> getting the hostname by asking the OS. When a job is submitted, I think
> the ApplicationMaster learns the hostname by doing a reverse DNS lookup
> based on the slaves file.
>
> Therefore, the ApplicationMaster submits requests for containers using the
> fully qualified domain name (foo.bar.com), but the scheduler uses the OS
> hostname (foo) when checking to see if any requests are node-local.
>
> What's the recommended solution?
> a) Always make sure you configure your clusters such that hostnames match
> reverse DNS?
> b) Create a way for the ApplicationMaster and NodeManager to agree on
> hostnames?
>
> Thanks,
>
> Roger
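
The mismatch described in the quoted message can be illustrated with a small sketch. This is not YARN code: the function names (short_name, exact_match, lenient_match) are hypothetical, and the lenient comparison is only one possible reading of option (b), assuming that two names refer to the same node when their first DNS labels agree and at least one of them is unqualified.

```python
import socket


def short_name(host: str) -> str:
    """Return the first DNS label, lower-cased (foo.bar.com -> foo)."""
    return host.strip().rstrip(".").split(".")[0].lower()


def exact_match(requested: str, registered: str) -> bool:
    """Plain string equality, the kind of check the message describes."""
    return requested == registered


def lenient_match(requested: str, registered: str) -> bool:
    """Hypothetical comparison that tolerates an unqualified name.

    If both names are fully qualified they must be identical; otherwise
    only the first label is compared. This is an assumption made for
    illustration, not the scheduler's actual behaviour.
    """
    a = requested.strip().rstrip(".").lower()
    b = registered.strip().rstrip(".").lower()
    if "." in a and "." in b:
        return a == b
    return short_name(a) == short_name(b)


if __name__ == "__main__":
    # What the OS reports versus what name resolution reports on this
    # machine; on a node like the one described, these two would differ.
    print("OS hostname:", socket.gethostname())
    print("FQDN:       ", socket.getfqdn())

    # The request uses the FQDN, the registration uses the OS hostname.
    print(exact_match("foo.bar.com", "foo"))
    print(lenient_match("foo.bar.com", "foo"))
```

Running the script on a slave node is a quick way to check whether option (a) already holds there: if the two printed names differ, requests made with the fully qualified name would not equal the name the node registered with.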
