Try ssh -vv to see what is really happening. And of course make sure id_rsa is only readable by the user executing ssh.
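A minimal sketch of both steps (the `<public dns>` hostname is a placeholder from earlier in the thread, so that line is commented out; the throwaway file stands in for ~/.ssh/id_rsa to keep the snippet self-contained):

```shell
# ssh silently ignores a private key that is readable by group/other,
# so lock it down first:
KEY=$(mktemp)            # stand-in for ~/.ssh/id_rsa
chmod 600 "$KEY"         # owner read/write only
stat -c '%a' "$KEY"      # prints 600

# With the real key locked down, debug the actual login (placeholder host):
# ssh -vv -i ~/.ssh/id_rsa ubuntu@<public dns>
rm -f "$KEY"
```

The `-vv` output shows which keys the client offers and whether the server rejects them before or after the key exchange.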

The whirr.log appears in the current directory where the whirr command was run, not on the spawned nodes.


Paul

On 2011-08-24 21:01, Joris Poort wrote:
I'm just using the defaults.  But you may be onto the problem here:
when I try to ssh into the node using:
ssh -i ~/.ssh/id_rsa ubuntu@<public dns>

I get a "Permission denied" error.

Any idea how to fix this?  Should I create my own set of SSH key pairs?

Thanks again,

Joris

On Wed, Aug 24, 2011 at 8:42 PM, Andrei Savu <savu.and...@gmail.com> wrote:
Are you specifying a new set of SSH key pairs in the Whirr properties file?

If not, by default Whirr will use the keys found in ~/.ssh: id_rsa & id_rsa.pub.
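A sketch of overriding that default (the key location and properties file name are illustrative; `whirr.private-key-file` and `whirr.public-key-file` are the standard Whirr property names):

```shell
# Generate a dedicated, passphrase-less key pair for Whirr. KEYDIR stands
# in for ~/.ssh so the example is self-contained.
KEYDIR=$(mktemp -d)
ssh-keygen -q -t rsa -N '' -f "$KEYDIR/whirr_id_rsa"

# Point the Whirr properties file at the new pair:
cat >> "$KEYDIR/hadoop.properties" <<EOF
whirr.private-key-file=$KEYDIR/whirr_id_rsa
whirr.public-key-file=$KEYDIR/whirr_id_rsa.pub
EOF
```

Using a dedicated pair also rules out interference from a passphrase-protected ~/.ssh/id_rsa.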

-- Andrei Savu / andreisavu.ro

On Wed, Aug 24, 2011 at 8:02 PM, Joris Poort <gpo...@gmail.com> wrote:
I think you're probably right that it's an auth issue - although I was
expecting a more direct/clear error message if the keypair wasn't
working.

I created the AMI by taking an EBS snapshot and then converting it to
instance-store.  I've tried both the EBS-backed AMI and the
instance-store one, with the same results.  My understanding is that
the keypair used to create the AMI is generally one of the accepted
keys, in addition to the key pair used to launch the instance created
by jclouds.  I'm not sure how to confirm this - is the jclouds keypair
stored anywhere that can be used to test this?
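One way to check which keys a node actually accepts, sketched here (the remote hostname is the `<public dns>` placeholder from earlier in the thread, so that line is commented out):

```shell
# Dump what the node accepts, using any key that currently works:
# ssh -i ~/.ssh/id_rsa ubuntu@<public dns> 'cat ~/.ssh/authorized_keys'

# Then compare fingerprints against the local public key. Demonstrated on
# a throwaway pair to keep the snippet self-contained:
D=$(mktemp -d)
ssh-keygen -q -t rsa -N '' -f "$D/test_key"
ssh-keygen -l -f "$D/test_key.pub"   # bit length, fingerprint, comment, type
```

If the fingerprint of ~/.ssh/id_rsa.pub doesn't appear among the node's authorized keys, the launch keypair and the local keys have diverged.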

Thanks again for your help,

Joris

On Wed, Aug 24, 2011 at 7:49 PM, Andrei Savu <savu.and...@gmail.com> wrote:
I'm not sure, but it looks like an auth issue to me. Whirr creates its
own key pair using the local SSH keys specified in the properties
file.

You've created the custom AMI by taking an EBS snapshot? Can you use
that custom AMI with a different key pair?

-- Andrei Savu / andreisavu.ro


On Wed, Aug 24, 2011 at 7:31 PM, Joris Poort <gpo...@gmail.com> wrote:
Andrei - thanks for the response!

I logged into the custom AMI using ssh and a key pair on my local
machine (I'm executing Whirr via an Ubuntu virtual machine).  I've
tried both spot instances and regular instances and am getting the
same behavior.

The full terminal output looks like this (the lines between "Starting 1
node" and "Dying because" are not always there):
Starting 1 node(s) with roles [hadoop-datanode, hadoop-tasktracker]
Starting 1 node(s) with roles [hadoop-namenode, hadoop-jobtracker]
Dying because - net.schmizz.sshj.transport.TransportException: Broken
transport; encountered EOF
Dying because - net.schmizz.sshj.transport.TransportException: Broken
transport; encountered EOF
<<kex done>>  woke to: net.schmizz.sshj.transport.TransportException:
Broken transport; encountered EOF
<<  (root@174.129.128.120:22) error acquiring
SSHClient(root@174.129.128.120:22): Broken transport; encountered EOF
net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF
        at net.schmizz.sshj.transport.Reader.run(Reader.java:70)
Dying because - java.net.SocketTimeoutException: Read timed out
Dying because - java.net.SocketTimeoutException: Read timed out
Dying because - java.net.SocketTimeoutException: Read timed out
Dying because - java.net.SocketTimeoutException: Read timed out

Last few entries on whirr.log:
2011-08-24 19:20:05,428 DEBUG [jclouds.compute] (pool-3-thread-2)>>
requesting 1 spot instances region(us-east-1) price(0.250000)
spec([instanceType=m1.large, imageId=ami-d1e525b8, kernelId=null,
ramdiskId=null, availabilityZone=null,
keyName=jclouds#hadoop_custom_spot_1#us-east-1#45,
securityGroupIdToNames={}, blockDeviceMappings=[],
securityGroupIds=[],
securityGroupNames=[jclouds#hadoop_custom_spot_1#us-east-1],
monitoringEnabled=null, userData=null]) options([formParameters={}])
2011-08-24 19:20:05,642 DEBUG [jclouds.compute] (pool-3-thread-4)<<
started instances([region=us-east-1, name=sir-4f589c11])
2011-08-24 19:20:05,682 DEBUG [jclouds.compute] (pool-3-thread-2)<<
started instances([region=us-east-1, name=sir-59cec612])
2011-08-24 19:20:05,864 DEBUG [jclouds.compute] (pool-3-thread-4)<<
present instances([region=us-east-1, name=sir-4f589c11])
2011-08-24 19:20:05,917 DEBUG [jclouds.compute] (pool-3-thread-2)<<
present instances([region=us-east-1, name=sir-59cec612])
2011-08-24 19:27:18,150 DEBUG [jclouds.compute] (user thread 8)>>
blocking on socket [address=50.17.135.8, port=22] for 600000 seconds
2011-08-24 19:27:21,132 DEBUG [jclouds.compute] (user thread 7)>>
blocking on socket [address=174.129.128.120, port=22] for 600000
seconds
2011-08-24 19:27:24,222 DEBUG [jclouds.compute] (user thread 7)<<
socket [address=174.129.128.120, port=22] opened
2011-08-24 19:27:32,255 DEBUG [jclouds.compute] (user thread 8)<<
socket [address=50.17.135.8, port=22] opened

After SSHing onto the node, I didn't find any logs in /tmp.

Thanks again for any help on this!

Joris

On Wed, Aug 24, 2011 at 7:12 PM, Andrei Savu <savu.and...@gmail.com> wrote:
I suspect this is an authentication issue. How do you login to the custom AMI?

Also check whirr.log for more details and on the remote machines look
in /tmp for jclouds script execution logs.
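A quick way to scan a node for those logs (the exact directory layout varies by jclouds version, so this just searches /tmp broadly):

```shell
# Most-recently-modified entries first; jclouds-era bootstrap scripts
# typically leave a per-script directory with captured output files:
ls -lt /tmp/ | head
find /tmp -maxdepth 2 \( -name 'stdout*' -o -name 'stderr*' \) 2>/dev/null
```

An empty result is itself informative - it suggests the bootstrap script was never delivered to the node, which points back at the SSH layer rather than the script contents.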

I know from IRC that you are using spot instances. Are you seeing the
same behavior with regular ones?

-- Andrei Savu / andreisavu.ro


On Wed, Aug 24, 2011 at 7:07 PM, Joris Poort <gpo...@gmail.com> wrote:
Hi,

I'm new to Whirr and I'm running a custom AMI configuration (my
application installed on a working Canonical image).  Executing with
Whirr 0.6.0, everything runs fine until I get the following error:
"Dying because - java.net.SocketTimeoutException: Read timed out"

The instances are running fine and I can ssh into them, but the Whirr
code stalls, I get the above error (2x the number of instances), and
no proxy shell is created.  If I run the exact same code with vanilla
Canonical images I don't have any issues.

Does anyone have ideas on how to test, debug, or work around this?
I'd really appreciate it!

Cheers,

Joris

