[ 
https://issues.apache.org/jira/browse/WHIRR-189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018974#comment-13018974
 ] 

Tom White commented on WHIRR-189:
---------------------------------

On EC2, I noticed that for an m1.large instance jclouds reports that there are 
two local volumes, /dev/sdb and /dev/sdc (for EBS-backed images), even though 
/dev/sdc is not present on the instance. This is explained by the second note 
on 
http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/index.html?instance-storage-concepts.html.
 We can use EC2TemplateOptions to map all the ephemeral devices (but it's odd 
that jclouds reports a device that isn't actually mapped). This will require 
EC2-specific code that knows about the ephemeral devices on each instance size 
- I wonder if jclouds abstracts this?
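
In the meantime, one way to guard against a reported-but-unmapped device is to check on the instance itself before touching it. This is only a rough sketch: present_devices is a made-up helper name (not part of Whirr or jclouds), and the candidate list is assumed to come from whatever jclouds reports for the node.

```shell
# Print only those candidate devices that actually exist as block devices
# on this instance, one per line. Candidates that are missing are skipped.
present_devices() {
  local dev
  for dev in "$@"; do
    if [ -b "$dev" ]; then
      echo "$dev"
    fi
  done
}

# Hypothetical usage: on the m1.large described above this would be
# expected to print /dev/sdb only, since /dev/sdc is not present.
present_devices /dev/sdb /dev/sdc
```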

Once the above is sorted out, I imagine the implementation would include all 
the non-boot-device volumes for its storage. HadoopConfigurationBuilder would 
set dfs.data.dir, dfs.name.dir, and mapred.local.dir to use all the volumes. 
And the volumes would need mounting/symlinking (and possibly formatting in the 
case of EC2) using scripts like (this code is based on code from the Python 
scripts):

{code}
# TODO: make sure that mkfs.xfs is installed
function prep_disk() {
  mount=$1
  device=$2
  automount=${3:-false}

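  # mountpoint -q prints nothing, so look the device up in /proc/mounts instead
  if grep -q "^$device " /proc/mounts; then
    echo "$device is mounted"
    if [ ! -d "$mount" ]; then
      echo "No mount"
      # symlink the requested path to wherever the device is already mounted
      ln -s "$(awk -v dev="$device" '$1 == dev {print $2; exit}' /proc/mounts)" "$mount"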
    fi
  else
    echo "warning: ERASING CONTENTS OF $device"
    mkfs.xfs -f "$device"
    if [ ! -e "$mount" ]; then
      mkdir "$mount"
    fi
    mount -o defaults,noatime "$device" "$mount"
    if $automount ; then
      echo "$device $mount xfs defaults,noatime 0 0" >> /etc/fstab
    fi
  fi
}

prep_disk /data1 /dev/sdb true
prep_disk /data2 /dev/sdc true
{code}
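
For the configuration side, dfs.data.dir, dfs.name.dir, and mapred.local.dir each take a comma-separated list of directories, so the values could be built from the mount points along these lines. This is a sketch only: dir_list is a hypothetical helper, and the subdirectory name is an illustrative assumption rather than what HadoopConfigurationBuilder actually uses.

```shell
# Join "<mount>/<subdir>" for every mount point into a comma-separated list.
# Usage: dir_list <subdir> <mount> [<mount> ...]
dir_list() {
  local subdir=$1
  shift
  local list=""
  local mount
  for mount in "$@"; do
    # ${list:+$list,} adds the separating comma only after the first entry
    list="${list:+$list,}$mount/$subdir"
  done
  echo "$list"
}

# Illustrative subdirectory name (an assumption, not taken from Whirr).
dir_list hadoop/hdfs/data /data1 /data2
# prints /data1/hadoop/hdfs/data,/data2/hadoop/hdfs/data
```

The same call with a different subdirectory would give the values for the other two properties.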

> Hadoop on EC2 should use all available local storage
> ----------------------------------------------------
>
>                 Key: WHIRR-189
>                 URL: https://issues.apache.org/jira/browse/WHIRR-189
>             Project: Whirr
>          Issue Type: Improvement
>          Components: service/hadoop
>            Reporter: Tom White
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
