Are you specifying a new set of SSH key pairs in the Whirr properties file?
If not, by default Whirr will use the keys found in ~/.ssh (id_rsa and id_rsa.pub).

-- Andrei Savu / andreisavu.ro

On Wed, Aug 24, 2011 at 8:02 PM, Joris Poort <gpo...@gmail.com> wrote:
> I think you're probably right that it's an auth issue - although I was
> expecting a more direct/clear error message if the key pair wasn't
> working.
>
> I created the AMI by taking an EBS snapshot and then converting to
> instance-store. I've tried both the EBS-backed AMI and the
> instance-store AMI with the same results. My understanding is that the
> key pair used to create the AMI is generally one of the accepted keys,
> in addition to the key pair used to launch the instance created by
> jclouds. I'm not sure how to confirm this - is the jclouds key pair
> stored anywhere that can be used to test this?
>
> Thanks again for your help,
>
> Joris
>
> On Wed, Aug 24, 2011 at 7:49 PM, Andrei Savu <savu.and...@gmail.com> wrote:
>> I'm not sure, but it looks like an auth issue to me. Whirr creates its
>> own key pair using the local SSH keys as specified in the properties
>> file.
>>
>> You've created the custom AMI by taking an EBS snapshot? Can you use
>> that custom AMI with a different key pair?
>>
>> -- Andrei Savu / andreisavu.ro
>>
>>
>> On Wed, Aug 24, 2011 at 7:31 PM, Joris Poort <gpo...@gmail.com> wrote:
>>> Andrei - thanks for the response!
>>>
>>> I logged into the custom AMI using ssh and a key pair on my local
>>> machine (I'm executing Whirr from an Ubuntu virtual machine). I've
>>> tried both spot instances and regular instances and am getting the
>>> same behavior.
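For reference, the defaults mentioned above can be overridden in the Whirr cluster properties file. A minimal fragment (the key path is an example; the pair must be a passwordless RSA key pair):

```properties
# Point Whirr at a dedicated key pair instead of the default
# ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa_whirr
whirr.public-key-file=${whirr.private-key-file}.pub
```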
>>>
>>> Full output on the terminal looks like this (the lines between
>>> "Starting 1 node" and "Dying because" are not always there):
>>>
>>> Starting 1 node(s) with roles [hadoop-datanode, hadoop-tasktracker]
>>> Starting 1 node(s) with roles [hadoop-namenode, hadoop-jobtracker]
>>> Dying because - net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF
>>> Dying because - net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF
>>> <<kex done>> woke to: net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF
>>> << (root@174.129.128.120:22) error acquiring SSHClient(root@174.129.128.120:22): Broken transport; encountered EOF
>>> net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF
>>>         at net.schmizz.sshj.transport.Reader.run(Reader.java:70)
>>> Dying because - java.net.SocketTimeoutException: Read timed out
>>> Dying because - java.net.SocketTimeoutException: Read timed out
>>> Dying because - java.net.SocketTimeoutException: Read timed out
>>> Dying because - java.net.SocketTimeoutException: Read timed out
>>>
>>> Last few entries in whirr.log:
>>>
>>> 2011-08-24 19:20:05,428 DEBUG [jclouds.compute] (pool-3-thread-2) >> requesting 1 spot instances region(us-east-1) price(0.250000) spec([instanceType=m1.large, imageId=ami-d1e525b8, kernelId=null, ramdiskId=null, availabilityZone=null, keyName=jclouds#hadoop_custom_spot_1#us-east-1#45, securityGroupIdToNames={}, blockDeviceMappings=[], securityGroupIds=[], securityGroupNames=[jclouds#hadoop_custom_spot_1#us-east-1], monitoringEnabled=null, userData=null]) options([formParameters={}])
>>> 2011-08-24 19:20:05,642 DEBUG [jclouds.compute] (pool-3-thread-4) << started instances([region=us-east-1, name=sir-4f589c11])
>>> 2011-08-24 19:20:05,682 DEBUG [jclouds.compute] (pool-3-thread-2) << started instances([region=us-east-1, name=sir-59cec612])
>>> 2011-08-24 19:20:05,864 DEBUG [jclouds.compute] (pool-3-thread-4) << present instances([region=us-east-1, name=sir-4f589c11])
>>> 2011-08-24 19:20:05,917 DEBUG [jclouds.compute] (pool-3-thread-2) << present instances([region=us-east-1, name=sir-59cec612])
>>> 2011-08-24 19:27:18,150 DEBUG [jclouds.compute] (user thread 8) >> blocking on socket [address=50.17.135.8, port=22] for 600000 seconds
>>> 2011-08-24 19:27:21,132 DEBUG [jclouds.compute] (user thread 7) >> blocking on socket [address=174.129.128.120, port=22] for 600000 seconds
>>> 2011-08-24 19:27:24,222 DEBUG [jclouds.compute] (user thread 7) << socket [address=174.129.128.120, port=22] opened
>>> 2011-08-24 19:27:32,255 DEBUG [jclouds.compute] (user thread 8) << socket [address=50.17.135.8, port=22] opened
>>>
>>> After ssh-ing onto a node, I didn't find any logs in /tmp.
>>>
>>> Thanks again for any help on this!
>>>
>>> Joris
>>>
>>> On Wed, Aug 24, 2011 at 7:12 PM, Andrei Savu <savu.and...@gmail.com> wrote:
>>>> I suspect this is an authentication issue. How do you log in to the
>>>> custom AMI?
>>>>
>>>> Also check whirr.log for more details, and on the remote machines
>>>> look in /tmp for the jclouds script execution logs.
>>>>
>>>> I know from IRC that you are using spot instances. Are you seeing
>>>> the same behavior with regular ones?
>>>>
>>>> -- Andrei Savu / andreisavu.ro
>>>>
>>>>
>>>> On Wed, Aug 24, 2011 at 7:07 PM, Joris Poort <gpo...@gmail.com> wrote:
>>>>> Hi,
>>>>>
>>>>> I'm new to Whirr and I'm running a custom AMI configuration (an
>>>>> application installed on a working Canonical image). Executing with
>>>>> Whirr 0.6.0, everything runs fine until I get the following error:
>>>>>
>>>>> "Dying because - java.net.SocketTimeoutException: Read timed out"
>>>>>
>>>>> The instances are running fine and I can ssh into them, but the
>>>>> Whirr code stalls, I get the above error (2x the number of
>>>>> instances), and no proxy shell is created.
>>>>> If I run the exact same code with vanilla Canonical images I don't
>>>>> have any issues.
>>>>>
>>>>> Does anyone have ideas on things to test or debug, or a way to work
>>>>> around this? I would really appreciate it!
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Joris
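One way to rule out a corrupted or mismatched local key pair of the kind discussed above is to re-derive the public key from the private key and compare it to the stored .pub file. A sketch, assuming ssh-keygen is available; the /tmp path and throwaway key are stand-ins for whatever pair Whirr is actually configured to use:

```shell
# Generate a throwaway passwordless RSA key pair (stand-in for the
# pair referenced from the Whirr properties file)
ssh-keygen -q -t rsa -N '' -f /tmp/whirr_test_key

# Re-derive the public key from the private key, then compare the key
# type and base64 material (columns 1-2, ignoring the comment) against
# the stored .pub file; a mismatch would explain silent auth failures
ssh-keygen -y -f /tmp/whirr_test_key | awk '{print $1, $2}' > /tmp/derived.pub
awk '{print $1, $2}' /tmp/whirr_test_key.pub > /tmp/stored.pub
cmp -s /tmp/derived.pub /tmp/stored.pub && echo "key pair OK" || echo "key pair MISMATCH"
```

Running the same comparison against the pair named in the properties file would confirm whether Whirr is injecting the public key you think it is.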