Until the resource providers work is done, there is no fix for this.
Under the current architecture this is a Won't Fix.
** Changed in: nova
Status: New => Won't Fix
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1633990
Title:
resize or migrate instance will fail if two compute hosts are
set in different rbd pools
Status in OpenStack Compute (nova):
Won't Fix
Bug description:
We are facing a nova operations issue when setting a different ceph rbd
pool on each nova compute node within one availability zone. For
instance:
(1) compute-node-1 in az1 and set images_rbd_pool=pool1
(2) compute-node-2 in az1 and set images_rbd_pool=pool2
This setting can normally work fine.
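For reference, this per-node setting lives in the [libvirt] section of nova.conf; a sketch following the pool names in the example above:

```ini
# /etc/nova/nova.conf on compute-node-1
[libvirt]
images_type = rbd
images_rbd_pool = pool1

# /etc/nova/nova.conf on compute-node-2
[libvirt]
images_type = rbd
images_rbd_pool = pool2
```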
The problem appears when resizing or migrating an instance. For example,
when resizing instance-1, which originally runs on compute-node-1, nova runs
its scheduling procedure; assume nova-scheduler chooses compute-node-2 as the
destination. Nova then fails with the following error:
http://paste.openstack.org/show/585540/.
The exception occurs because nova on compute-node-2 cannot find the
instance's disk in pool1. So is there a way nova can handle this? Cinder
handles a similar case: a cinder volume's host attribute encodes the backend
and pool, e.g. host_name@ceph#pool_name.
We use this setup because we want to avoid the impact of ceph rebalancing
while expanding storage capacity.
One workaround we found is the AggregateInstanceExtraSpecsFilter, which
matches flavor extra specs against host aggregate metadata.
We try to create Host Aggregates like:
az1-pool1 with hosts compute-node-1, and metadata {ceph_pool: pool1};
az1-pool2 with hosts compute-node-2, and metadata {ceph_pool: pool2};
and create flavors like:
flavor1-pool1 with metadata {ceph_pool: pool1};
flavor2-pool1 with metadata {ceph_pool: pool1};
flavor1-pool2 with metadata {ceph_pool: pool2};
flavor2-pool2 with metadata {ceph_pool: pool2};
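The aggregate and flavor setup above could be created along these lines (a sketch using the names from the example; note that the AggregateInstanceExtraSpecsFilter conventionally reads the scoped aggregate_instance_extra_specs: prefix on flavor extra specs):

```shell
# Aggregates, one per rbd pool
openstack aggregate create --zone az1 az1-pool1
openstack aggregate set --property ceph_pool=pool1 az1-pool1
openstack aggregate add host az1-pool1 compute-node-1

openstack aggregate create --zone az1 az1-pool2
openstack aggregate set --property ceph_pool=pool2 az1-pool2
openstack aggregate add host az1-pool2 compute-node-2

# A flavor pinned to pool1 via the scoped extra-spec key
openstack flavor create --ram 4096 --vcpus 2 --disk 40 flavor1-pool1
openstack flavor set \
    --property aggregate_instance_extra_specs:ceph_pool=pool1 flavor1-pool1
```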
But this introduces a new issue at instance-creation time: which flavor
should be used? The business/application layer seems to need its own flavor
selection logic. It can also cause a compute capacity issue: if a flavor is
chosen for the resize, the scheduler will use the
AggregateInstanceExtraSpecsFilter to limit the destination hosts to those in
the same rbd pool. What if no compute host is available there, or none has
enough memory or CPU? So this is not the best solution.
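For clarity, the host-elimination behaviour described above boils down to a metadata match; a minimal sketch with hypothetical helper names, not nova's actual implementation:

```python
def filter_hosts(hosts, flavor_extra_specs,
                 scope="aggregate_instance_extra_specs:"):
    """Keep only hosts whose aggregate metadata satisfies every scoped
    extra spec on the flavor (hypothetical sketch, not nova code)."""
    wanted = {
        key[len(scope):]: value
        for key, value in flavor_extra_specs.items()
        if key.startswith(scope)
    }
    return [
        host for host, metadata in hosts.items()
        if all(metadata.get(k) == v for k, v in wanted.items())
    ]

# Aggregate metadata per host, as in the example above.
hosts = {
    "compute-node-1": {"ceph_pool": "pool1"},
    "compute-node-2": {"ceph_pool": "pool2"},
}
specs = {"aggregate_instance_extra_specs:ceph_pool": "pool1"}
print(filter_hosts(hosts, specs))  # → ['compute-node-1']
```

This illustrates the capacity concern: once the flavor pins ceph_pool=pool1, every host in other pools is eliminated before memory/CPU filtering even runs.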
So, finally, I want to ask whether there is a best practice for
using multiple ceph rbd pools in one availability zone.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1633990/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp