** Changed in: nova
Status: Fix Committed => Fix Released
https://bugs.launchpad.net/bugs/1214943
Title:
Live migration should use the same memory over subscription logic as
instance boot
Status in OpenStack Compute (nova):
Fix Released
Bug description:
I encountered an issue when live migrating an instance to a specified
target host. I expected the operation to succeed, but it failed with
the following error:
MigrationPreCheckError: Migration pre-check error: Unable to migrate
a34f9b88-1e07-4798-af46-ca3b3dbaceda to hchenos2: Lack of
memory(host:336 <= instance:512)
1. My OpenStack cluster information:
1). There are two compute nodes in my cluster, and I created 4
instances (1 vCPU / 512 MB memory each) on these hosts
-----------
mysql> select hypervisor_hostname,vcpus,vcpus_used,running_vms,memory_mb,memory_mb_used,free_ram_mb,deleted from compute_nodes where deleted=0;
+----------------------------------+-------+------------+-------------+-----------+----------------+-------------+---------+
| hypervisor_hostname              | vcpus | vcpus_used | running_vms | memory_mb | memory_mb_used | free_ram_mb | deleted |
+----------------------------------+-------+------------+-------------+-----------+----------------+-------------+---------+
| hchenos1.eng.platformlab.ibm.com |     2 |          2 |           2 |      1872 |           1536 |         336 |       0 |
| hchenos2.eng.platformlab.ibm.com |     2 |          2 |           2 |      1872 |           1536 |         336 |       0 |
+----------------------------------+-------+------------+-------------+-----------+----------------+-------------+---------+
2 rows in set (0.00 sec)
mysql>
------------------------
[root@hchenos ~]# nova list
+--------------------------------------+------+--------+----------+
| ID                                   | Name | Status | Networks |
+--------------------------------------+------+--------+----------+
| a34f9b88-1e07-4798-af46-ca3b3dbaceda | vm1  | ACTIVE |          |  >>>> on host 'hchenos1'
| f6aaeff9-2220-4693-8e5a-710f4c52b774 | vm2  | ACTIVE |          |  >>>> on host 'hchenos2'
| bbee57a2-81cd-4933-a943-1c2272f7f550 | vm4  | ACTIVE |          |  >>>> on host 'hchenos1'
| 74fe26ec-919c-4fa7-890f-f59abe09ef4f | vm5  | ACTIVE |          |  >>>> on host 'hchenos2'
+--------------------------------------+------+--------+----------+
[root@hchenos ~]#
2). I also enabled the ComputeFilter, RamFilter and CoreFilter in
nova.conf, but did not configure an overcommit ratio for either vCPU or
memory, so the default ratios are used.
2. Under these conditions, live migrating instance vm1 to hchenos2
fails:
[root@hchenos ~]# nova live-migration vm1 hchenos2
ERROR: Live migration of instance a34f9b88-1e07-4798-af46-ca3b3dbaceda to
host hchenos2 failed (HTTP 400) (Request-ID:
req-68244b99-e438-4000-8bdb-cc43b275c018)
conductor log:
...
ckages/nova/conductor/tasks/live_migrate.py", line 87, in
_check_requested_destination\n
self._check_destination_has_enough_memory()\n\n File
"/usr/lib/python2.6/site-packages/nova/conductor/tasks/live_migrate.py", line
108, in _check_destination_has_enough_memory\n
mem_inst=mem_inst))\n\nMigrationPreCheckError: Migration pre-check error:
Unable to migrate a34f9b88-1e07-4798-af46-ca3b3dbaceda to hchenos2: Lack of
memory(host:336 <= instance:512)\n\n']
I think the reason is as follows:
the free_ram_mb for 'hchenos2' is 336 MB and the requested memory is
512 MB, so the operation fails.
free_ram_mb = memory_mb (1872) - 512 (reserved_host_memory_mb) -
2*512 (consumed by instances) = 336
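
To make the failing comparison concrete, here is a minimal sketch of the
kind of check the error message implies. It is not the nova source: the
function and exception below are stand-ins written for illustration, and
only the numbers and the message text come from the report above.

```python
class MigrationPreCheckError(Exception):
    """Stand-in for nova's exception of the same name (illustration only)."""


def check_destination_has_enough_memory(avail_mb, mem_inst_mb):
    # The check fails unless free RAM is strictly greater than the request.
    # No allocation ratio is applied anywhere in this comparison.
    if avail_mb <= mem_inst_mb:
        raise MigrationPreCheckError(
            "Lack of memory(host:%d <= instance:%d)" % (avail_mb, mem_inst_mb))


# free_ram_mb on hchenos2: total - reserved - two 512 MB instances
avail = 1872 - 512 - 2 * 512

try:
    check_destination_has_enough_memory(avail, 512)
except MigrationPreCheckError as exc:
    print(exc)
```

With the values from this cluster, avail is 336, so the 512 MB request is
rejected.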
3. However, booting an instance on 'hchenos2' succeeds:
[root@hchenos ~]# nova boot --image cirros-0.3.0-x86_64 --flavor 1 --availability-zone nova:hchenos2 xhu
[root@hchenos ~]# nova list
+--------------------------------------+------+--------+----------+
| ID                                   | Name | Status | Networks |
+--------------------------------------+------+--------+----------+
| a34f9b88-1e07-4798-af46-ca3b3dbaceda | vm1  | ACTIVE |          |
| f6aaeff9-2220-4693-8e5a-710f4c52b774 | vm2  | ACTIVE |          |
| bbee57a2-81cd-4933-a943-1c2272f7f550 | vm4  | ACTIVE |          |
| 74fe26ec-919c-4fa7-890f-f59abe09ef4f | vm5  | ACTIVE |          |
| 364d1a01-67ed-4966-bbfd-d21b6bc3067c | xhu  | ACTIVE |          |  >>>> is active
+--------------------------------------+------+--------+----------+
[root@hchenos ~]#
mysql> select hypervisor_hostname,vcpus,vcpus_used,running_vms,memory_mb,memory_mb_used,free_ram_mb,deleted from compute_nodes where deleted=0;
+----------------------------------+-------+------------+-------------+-----------+----------------+-------------+---------+
| hypervisor_hostname              | vcpus | vcpus_used | running_vms | memory_mb | memory_mb_used | free_ram_mb | deleted |
+----------------------------------+-------+------------+-------------+-----------+----------------+-------------+---------+
| hchenos1.eng.platformlab.ibm.com |     2 |          2 |           2 |      1872 |           1536 |         336 |       0 |
| hchenos2.eng.platformlab.ibm.com |     2 |          3 |           3 |      1872 |           2048 |        -176 |       0 |
+----------------------------------+-------+------------+-------------+-----------+----------------+-------------+---------+
2 rows in set (0.00 sec)
mysql>
So I am confused by this test result: why does booting an instance on
'hchenos2' succeed, while live migrating an instance to the same host
fails due to "not enough memory"?
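
The two compute_nodes snapshots can be reproduced with simple
arithmetic, which shows that the boot path pushed free_ram_mb below
zero. This is a sketch only: the helper name node_usage is hypothetical
and not part of nova, and it assumes reserved_host_memory_mb is 512 and
every instance uses 512 MB, as in this cluster.

```python
RESERVED_HOST_MEMORY_MB = 512  # reserved_host_memory_mb from the report


def node_usage(memory_mb, instance_mem_mb):
    """Recompute memory_mb_used and free_ram_mb for one compute node."""
    used = RESERVED_HOST_MEMORY_MB + sum(instance_mem_mb)
    return {"memory_mb_used": used, "free_ram_mb": memory_mb - used}


# hchenos2 before and after booting the extra instance 'xhu'
before = node_usage(1872, [512, 512])
after = node_usage(1872, [512, 512, 512])

print(before)  # {'memory_mb_used': 1536, 'free_ram_mb': 336}
print(after)   # {'memory_mb_used': 2048, 'free_ram_mb': -176}
```

A negative free_ram_mb is only possible if the boot path allows memory
to be overcommitted, while the migration pre-check compared the request
against the raw 336 MB.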
After carefully going through the nova source code (live_migrate.py:
execute()), I think the following points cause this issue:
1). The function '_check_destination_has_enough_memory' doesn't
consider the RAM allocation ratio (default value is 1.5) when calculating
host free memory ('free_ram_mb'), which is inconsistent with the memory
check that 'RamFilter' performs when booting an instance.
I think the free memory of host 'hchenos2' should be:
free_ram_mb = memory_mb (1872) * ram_allocation_ratio (1.5) -
memory_mb_used (1536) = 1272
2) Why is only memory checked on the live migration target host, and
not vCPUs as well?
live_migrate.py: execute
    self._check_instance_is_running()
    self._check_host_is_up(self.source)
    if not self.destination:
        self.destination = self._find_destination()
    else:
        self._check_requested_destination()    >>>>

def _check_requested_destination(self):
    self._check_destination_is_not_source()
    self._check_host_is_up(self.destination)
    self._check_destination_has_enough_memory()    >>>> Only checks memory, why not check vCPUs too?
    self._check_compatible_with_source_hypervisor(self.destination)
    self._call_livem_checks_on_host(self.destination)
3) The VM state needs to be considered as well. For example, if the
instance is powered off, it no longer consumes compute node resources on
the KVM platform (this is different from IBM PowerVM), but in
resource_tracker.py:_update_usage_from_instances(), only the instance
'deleted' flag is taken into account when calculating resource usage.
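
Points 1) and 2) could be addressed by having the pre-check apply
allocation ratios in the same spirit as the scheduler filters. The
following is a minimal sketch, not nova code: both function names are
hypothetical, the RAM ratio of 1.5 is the default mentioned in point 1),
and the CPU ratio of 16.0 in the usage lines is only an illustrative
value that this report does not state.

```python
def has_enough_memory(memory_mb, memory_mb_used, requested_mb,
                      ram_allocation_ratio=1.5):
    """Ratio-aware memory check (hypothetical helper)."""
    usable_mb = memory_mb * ram_allocation_ratio - memory_mb_used
    return usable_mb >= requested_mb


def has_enough_vcpus(vcpus, vcpus_used, requested_vcpus,
                     cpu_allocation_ratio):
    """Ratio-aware vCPU check (hypothetical helper)."""
    usable_vcpus = vcpus * cpu_allocation_ratio - vcpus_used
    return usable_vcpus >= requested_vcpus


# hchenos2 before the extra boot: 1872 * 1.5 - 1536 = 1272 MB usable
print(has_enough_memory(1872, 1536, 512))   # True

# Without overcommit the host has no spare vCPU for a 1-vCPU instance
print(has_enough_vcpus(2, 2, 1, 1.0))       # False
# With an assumed ratio of 16.0 the same request fits
print(has_enough_vcpus(2, 2, 1, 16.0))      # True
```

With such checks, migrating vm1 to hchenos2 would be accepted under the
same conditions in which the boot request was accepted.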