Public bug reported:

Description
===========

Using devstack Rocky with an NVIDIA Tesla M10 + GRID driver on RHEL 7.5.
Profile used in nova: nvidia-35 (num_heads=2, frl_config=45, framebuffer=512M, 
max_resolution=2560x1600, max_instance=16)

I can launch instances one by one without any issue.
I cannot use a --max parameter greater than 1: with --max 2, the first
instance becomes ACTIVE but the second one ends up in ERROR.

Expected result
===============

Be able to use --max parameter with vGPU

Steps to reproduce
==================

[root@host2 ~]# openstack server list
+--------------------------------------+-----------+--------+---------------------------------------------------------------------+--------+--------+
| ID                                   | Name      | Status | Networks                                                            | Image  | Flavor |
+--------------------------------------+-----------+--------+---------------------------------------------------------------------+--------+--------+
| 56aeda96-f193-49fc-914d-8b507674eb16 | instance0 | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fef2:8e20, 10.0.0.12, 172.24.4.2 | rhel75 | vgpu   |
+--------------------------------------+-----------+--------+---------------------------------------------------------------------+--------+--------+

[root@host2 ~]# openstack server create --flavor vgpu --image rhel75 --key-name myself --max 2 instance
+-------------------------------------+-----------------------------------------------+
| Field                               | Value                                         |
+-------------------------------------+-----------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                        |
| OS-EXT-AZ:availability_zone         |                                               |
| OS-EXT-SRV-ATTR:host                | None                                          |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None                                          |
| OS-EXT-SRV-ATTR:instance_name       |                                               |
| OS-EXT-STS:power_state              | NOSTATE                                       |
| OS-EXT-STS:task_state               | scheduling                                    |
| OS-EXT-STS:vm_state                 | building                                      |
| OS-SRV-USG:launched_at              | None                                          |
| OS-SRV-USG:terminated_at            | None                                          |
| accessIPv4                          |                                               |
| accessIPv6                          |                                               |
| addresses                           |                                               |
| adminPass                           | iNiFmD6kNszw                                  |
| config_drive                        |                                               |
| created                             | 2018-07-05T09:19:25Z                          |
| flavor                              | vgpu (vgpu1)                                  |
| hostId                              |                                               |
| id                                  | 5a8691a8-a18c-4c71-8541-be00f224fd82          |
| image                               | rhel75 (e63a49a8-4568-4b57-9d12-1eb1ede28438) |
| key_name                            | myself                                        |
| name                                | instance-1                                    |
| progress                            | 0                                             |
| project_id                          | fdea2c781db74ae593c5e9501e9290cc              |
| properties                          |                                               |
| security_groups                     | name='default'                                |
| status                              | BUILD                                         |
| updated                             | 2018-07-05T09:19:25Z                          |
| user_id                             | 130a646fc362418f8b62ac11f1154942              |
| volumes_attached                    |                                               |
+-------------------------------------+-----------------------------------------------+

[root@host2 ~]# openstack server list
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
| ID                                   | Name       | Status | Networks                                                            | Image  | Flavor |
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
| 515f0d21-6ab8-406e-9889-177718c79e61 | instance-2 | ERROR  |                                                                     | rhel75 | vgpu   |
| 5a8691a8-a18c-4c71-8541-be00f224fd82 | instance-1 | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fe1f:d7a, 10.0.0.11              | rhel75 | vgpu   |
| 56aeda96-f193-49fc-914d-8b507674eb16 | instance0  | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fef2:8e20, 10.0.0.12, 172.24.4.2 | rhel75 | vgpu   |
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+

[root@host2 ~]# openstack server create --flavor vgpu --image rhel75 --key-name myself --max 1 instance
+-------------------------------------+-----------------------------------------------+
| Field                               | Value                                         |
+-------------------------------------+-----------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                        |
| OS-EXT-AZ:availability_zone         |                                               |
| OS-EXT-SRV-ATTR:host                | None                                          |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None                                          |
| OS-EXT-SRV-ATTR:instance_name       |                                               |
| OS-EXT-STS:power_state              | NOSTATE                                       |
| OS-EXT-STS:task_state               | scheduling                                    |
| OS-EXT-STS:vm_state                 | building                                      |
| OS-SRV-USG:launched_at              | None                                          |
| OS-SRV-USG:terminated_at            | None                                          |
| accessIPv4                          |                                               |
| accessIPv6                          |                                               |
| addresses                           |                                               |
| adminPass                           | MGxmntECb22S                                  |
| config_drive                        |                                               |
| created                             | 2018-07-05T09:19:45Z                          |
| flavor                              | vgpu (vgpu1)                                  |
| hostId                              |                                               |
| id                                  | 24df940f-500b-44db-88e2-a6fd1fe915c0          |
| image                               | rhel75 (e63a49a8-4568-4b57-9d12-1eb1ede28438) |
| key_name                            | myself                                        |
| name                                | instance                                      |
| progress                            | 0                                             |
| project_id                          | fdea2c781db74ae593c5e9501e9290cc              |
| properties                          |                                               |
| security_groups                     | name='default'                                |
| status                              | BUILD                                         |
| updated                             | 2018-07-05T09:19:45Z                          |
| user_id                             | 130a646fc362418f8b62ac11f1154942              |
| volumes_attached                    |                                               |
+-------------------------------------+-----------------------------------------------+

[root@host2 ~]# openstack server list
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
| ID                                   | Name       | Status | Networks                                                            | Image  | Flavor |
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
| 24df940f-500b-44db-88e2-a6fd1fe915c0 | instance   | BUILD  | private=fda2:f16f:605e:0:f816:3eff:fefd:8796, 10.0.0.7              | rhel75 | vgpu   |
| 515f0d21-6ab8-406e-9889-177718c79e61 | instance-2 | ERROR  |                                                                     | rhel75 | vgpu   |
| 5a8691a8-a18c-4c71-8541-be00f224fd82 | instance-1 | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fe1f:d7a, 10.0.0.11              | rhel75 | vgpu   |
| 56aeda96-f193-49fc-914d-8b507674eb16 | instance0  | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fef2:8e20, 10.0.0.12, 172.24.4.2 | rhel75 | vgpu   |
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+

[root@host2 ~]# openstack server list
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
| ID                                   | Name       | Status | Networks                                                            | Image  | Flavor |
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
| 24df940f-500b-44db-88e2-a6fd1fe915c0 | instance   | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fefd:8796, 10.0.0.7              | rhel75 | vgpu   |
| 515f0d21-6ab8-406e-9889-177718c79e61 | instance-2 | ERROR  |                                                                     | rhel75 | vgpu   |
| 5a8691a8-a18c-4c71-8541-be00f224fd82 | instance-1 | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fe1f:d7a, 10.0.0.11              | rhel75 | vgpu   |
| 56aeda96-f193-49fc-914d-8b507674eb16 | instance0  | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fef2:8e20, 10.0.0.12, 172.24.4.2 | rhel75 | vgpu   |
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+

[root@host2 ~]# openstack server create --flavor vgpu --image rhel75 --key-name myself --max 1 instance
+-------------------------------------+-----------------------------------------------+
| Field                               | Value                                         |
+-------------------------------------+-----------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                        |
| OS-EXT-AZ:availability_zone         |                                               |
| OS-EXT-SRV-ATTR:host                | None                                          |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None                                          |
| OS-EXT-SRV-ATTR:instance_name       |                                               |
| OS-EXT-STS:power_state              | NOSTATE                                       |
| OS-EXT-STS:task_state               | scheduling                                    |
| OS-EXT-STS:vm_state                 | building                                      |
| OS-SRV-USG:launched_at              | None                                          |
| OS-SRV-USG:terminated_at            | None                                          |
| accessIPv4                          |                                               |
| accessIPv6                          |                                               |
| addresses                           |                                               |
| adminPass                           | 69crZEFxBT9j                                  |
| config_drive                        |                                               |
| created                             | 2018-07-05T09:21:43Z                          |
| flavor                              | vgpu (vgpu1)                                  |
| hostId                              |                                               |
| id                                  | 4a172549-91c2-46cc-8895-cd2fcbb19430          |
| image                               | rhel75 (e63a49a8-4568-4b57-9d12-1eb1ede28438) |
| key_name                            | myself                                        |
| name                                | instance                                      |
| progress                            | 0                                             |
| project_id                          | fdea2c781db74ae593c5e9501e9290cc              |
| properties                          |                                               |
| security_groups                     | name='default'                                |
| status                              | BUILD                                         |
| updated                             | 2018-07-05T09:21:43Z                          |
| user_id                             | 130a646fc362418f8b62ac11f1154942              |
| volumes_attached                    |                                               |
+-------------------------------------+-----------------------------------------------+

[root@host2 ~]# openstack server list
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
| ID                                   | Name       | Status | Networks                                                            | Image  | Flavor |
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
| 4a172549-91c2-46cc-8895-cd2fcbb19430 | instance   | BUILD  |                                                                     | rhel75 | vgpu   |
| 24df940f-500b-44db-88e2-a6fd1fe915c0 | instance   | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fefd:8796, 10.0.0.7              | rhel75 | vgpu   |
| 515f0d21-6ab8-406e-9889-177718c79e61 | instance-2 | ERROR  |                                                                     | rhel75 | vgpu   |
| 5a8691a8-a18c-4c71-8541-be00f224fd82 | instance-1 | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fe1f:d7a, 10.0.0.11              | rhel75 | vgpu   |
| 56aeda96-f193-49fc-914d-8b507674eb16 | instance0  | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fef2:8e20, 10.0.0.12, 172.24.4.2 | rhel75 | vgpu   |
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+

[root@host2 ~]# openstack server list
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
| ID                                   | Name       | Status | Networks                                                            | Image  | Flavor |
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
| 4a172549-91c2-46cc-8895-cd2fcbb19430 | instance   | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fe7d:a6d8, 10.0.0.4              | rhel75 | vgpu   |
| 24df940f-500b-44db-88e2-a6fd1fe915c0 | instance   | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fefd:8796, 10.0.0.7              | rhel75 | vgpu   |
| 515f0d21-6ab8-406e-9889-177718c79e61 | instance-2 | ERROR  |                                                                     | rhel75 | vgpu   |
| 5a8691a8-a18c-4c71-8541-be00f224fd82 | instance-1 | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fe1f:d7a, 10.0.0.11              | rhel75 | vgpu   |
| 56aeda96-f193-49fc-914d-8b507674eb16 | instance0  | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fef2:8e20, 10.0.0.12, 172.24.4.2 | rhel75 | vgpu   |
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+

- Nova error:
{u'message': u'Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance de2a5078-6acd-4ffd-9895-d664adb42296.', u'code': 500, u'details': u'  File "/opt/stack/nova/nova/conductor/manager.py", line 579, in build_instances\n    raise exception.MaxRetriesExceeded(reason=msg)\n', u'created': u'2018-07-05T07:32:52Z'}

- Libvirt error:
messages:Jul  5 03:32:51 host2 nova-compute: libvirtError: Requested operation is not valid: mediated device /sys/bus/mdev/devices/25f56195-9719-4380-a90b-084d64307e06 is in use by driver QEMU, domain instance-00000019
messages:Jul  5 03:32:51 host2 nova-compute: ERROR nova.virt.libvirt.driver [None req-e04582ed-de22-4bfa-9253-92e687328a4c service nova] [instance: de2a5078-6acd-4ffd-9895-d664adb42296] Failed to start libvirt guest: libvirtError: Requested operation is not valid: mediated device /sys/bus/mdev/devices/25f56195-9719-4380-a90b-084d64307e06 is in use by driver QEMU, domain instance-00000019
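
For triage, it can help to pull the contested mediated device UUID and the
domain that already holds it out of log lines like the ones above. The
following is only a minimal sketch that assumes the exact message wording
quoted in this report; the names MDEV_IN_USE and find_mdev_conflicts are
hypothetical helpers and are not part of nova or libvirt.

```python
import re

# Matches the libvirt message quoted in this report. The wording is taken
# from the log excerpt above; other libvirt versions may phrase it differently.
MDEV_IN_USE = re.compile(
    r"mediated device /sys/bus/mdev/devices/(?P<mdev>[0-9a-f-]{36}) "
    r"is in use by driver (?P<driver>\w+), domain (?P<domain>[\w-]+)"
)


def find_mdev_conflicts(log_text):
    """Return unique (mdev_uuid, driver, domain) tuples found in log_text."""
    # Pasted or mailed logs are often hard-wrapped, so collapse every run of
    # whitespace (including newlines) to a single space before matching.
    flat = " ".join(log_text.split())
    conflicts = []
    for match in MDEV_IN_USE.finditer(flat):
        hit = (match.group("mdev"), match.group("driver"), match.group("domain"))
        if hit not in conflicts:
            conflicts.append(hit)
    return conflicts
```

Feeding the two log lines above to find_mdev_conflicts() should yield a single
entry naming mdev 25f56195-9719-4380-a90b-084d64307e06 as held by domain
instance-00000019, which is the device the failed instance tried to attach.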

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: vgpu

** Description changed:

- Be able to use --max parameter
+ Be able to use --max parameter with vGPU
  
  Steps to reproduce
  ==================
  
  [root@host2 ~]# openstack server list
  
+--------------------------------------+-----------+--------+---------------------------------------------------------------------+--------+--------+
  | ID                                   | Name      | Status | Networks        
                                                    | Image  | Flavor |
  
+--------------------------------------+-----------+--------+---------------------------------------------------------------------+--------+--------+
  | 56aeda96-f193-49fc-914d-8b507674eb16 | instance0 | ACTIVE | 
private=fda2:f16f:605e:0:f816:3eff:fef2:8e20, 10.0.0.12, 172.24.4.2 | rhel75 | 
vgpu   |
  
+--------------------------------------+-----------+--------+---------------------------------------------------------------------+--------+--------+
  
  [root@host2 ~]# openstack server create --flavor vgpu --image rhel75 
--key-name myself --max 2 instance
  
+-------------------------------------+-----------------------------------------------+
  | Field                               | Value                                 
        |
  
+-------------------------------------+-----------------------------------------------+
  | OS-DCF:diskConfig                   | MANUAL                                
        |
  | OS-EXT-AZ:availability_zone         |                                       
        |
  | OS-EXT-SRV-ATTR:host                | None                                  
        |
  | OS-EXT-SRV-ATTR:hypervisor_hostname | None                                  
        |
  | OS-EXT-SRV-ATTR:instance_name       |                                       
        |
  | OS-EXT-STS:power_state              | NOSTATE                               
        |
  | OS-EXT-STS:task_state               | scheduling                            
        |
  | OS-EXT-STS:vm_state                 | building                              
        |
  | OS-SRV-USG:launched_at              | None                                  
        |
  | OS-SRV-USG:terminated_at            | None                                  
        |
  | accessIPv4                          |                                       
        |
  | accessIPv6                          |                                       
        |
  | addresses                           |                                       
        |
  | adminPass                           | iNiFmD6kNszw                          
        |
  | config_drive                        |                                       
        |
  | created                             | 2018-07-05T09:19:25Z                  
        |
  | flavor                              | vgpu (vgpu1)                          
        |
  | hostId                              |                                       
        |
  | id                                  | 5a8691a8-a18c-4c71-8541-be00f224fd82  
        |
  | image                               | rhel75 
(e63a49a8-4568-4b57-9d12-1eb1ede28438) |
  | key_name                            | myself                                
        |
  | name                                | instance-1                            
        |
  | progress                            | 0                                     
        |
  | project_id                          | fdea2c781db74ae593c5e9501e9290cc      
        |
  | properties                          |                                       
        |
  | security_groups                     | name='default'                        
        |
  | status                              | BUILD                                 
        |
  | updated                             | 2018-07-05T09:19:25Z                  
        |
  | user_id                             | 130a646fc362418f8b62ac11f1154942      
        |
  | volumes_attached                    |                                       
        |
  
+-------------------------------------+-----------------------------------------------+
  
  [root@host2 ~]# openstack server list
  
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
  | ID                                   | Name       | Status | Networks       
                                                     | Image  | Flavor |
  
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
  | 515f0d21-6ab8-406e-9889-177718c79e61 | instance-2 | ERROR  |                
                                                     | rhel75 | vgpu   |
  | 5a8691a8-a18c-4c71-8541-be00f224fd82 | instance-1 | ACTIVE | 
private=fda2:f16f:605e:0:f816:3eff:fe1f:d7a, 10.0.0.11              | rhel75 | 
vgpu   |
  | 56aeda96-f193-49fc-914d-8b507674eb16 | instance0  | ACTIVE | 
private=fda2:f16f:605e:0:f816:3eff:fef2:8e20, 10.0.0.12, 172.24.4.2 | rhel75 | 
vgpu   |
  
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
  
  [root@host2 ~]# openstack server create --flavor vgpu --image rhel75 
--key-name myself --max 1 instance
  
+-------------------------------------+-----------------------------------------------+
  | Field                               | Value                                 
        |
  
+-------------------------------------+-----------------------------------------------+
  | OS-DCF:diskConfig                   | MANUAL                                
        |
  | OS-EXT-AZ:availability_zone         |                                       
        |
  | OS-EXT-SRV-ATTR:host                | None                                  
        |
  | OS-EXT-SRV-ATTR:hypervisor_hostname | None                                  
        |
  | OS-EXT-SRV-ATTR:instance_name       |                                       
        |
  | OS-EXT-STS:power_state              | NOSTATE                               
        |
  | OS-EXT-STS:task_state               | scheduling                            
        |
  | OS-EXT-STS:vm_state                 | building                              
        |
  | OS-SRV-USG:launched_at              | None                                  
        |
  | OS-SRV-USG:terminated_at            | None                                  
        |
  | accessIPv4                          |                                       
        |
  | accessIPv6                          |                                       
        |
  | addresses                           |                                       
        |
  | adminPass                           | MGxmntECb22S                          
        |
  | config_drive                        |                                       
        |
  | created                             | 2018-07-05T09:19:45Z                  
        |
  | flavor                              | vgpu (vgpu1)                          
        |
  | hostId                              |                                       
        |
  | id                                  | 24df940f-500b-44db-88e2-a6fd1fe915c0  
        |
  | image                               | rhel75 
(e63a49a8-4568-4b57-9d12-1eb1ede28438) |
  | key_name                            | myself                                
        |
  | name                                | instance                              
        |
  | progress                            | 0                                     
        |
  | project_id                          | fdea2c781db74ae593c5e9501e9290cc      
        |
  | properties                          |                                       
        |
  | security_groups                     | name='default'                        
        |
  | status                              | BUILD                                 
        |
  | updated                             | 2018-07-05T09:19:45Z                  
        |
  | user_id                             | 130a646fc362418f8b62ac11f1154942      
        |
  | volumes_attached                    |                                       
        |
  
+-------------------------------------+-----------------------------------------------+
  
  [root@host2 ~]# openstack server list
  
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
  | ID                                   | Name       | Status | Networks       
                                                     | Image  | Flavor |
  
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
  | 24df940f-500b-44db-88e2-a6fd1fe915c0 | instance   | BUILD  | 
private=fda2:f16f:605e:0:f816:3eff:fefd:8796, 10.0.0.7              | rhel75 | 
vgpu   |
  | 515f0d21-6ab8-406e-9889-177718c79e61 | instance-2 | ERROR  |                
                                                     | rhel75 | vgpu   |
  | 5a8691a8-a18c-4c71-8541-be00f224fd82 | instance-1 | ACTIVE | 
private=fda2:f16f:605e:0:f816:3eff:fe1f:d7a, 10.0.0.11              | rhel75 | 
vgpu   |
  | 56aeda96-f193-49fc-914d-8b507674eb16 | instance0  | ACTIVE | 
private=fda2:f16f:605e:0:f816:3eff:fef2:8e20, 10.0.0.12, 172.24.4.2 | rhel75 | 
vgpu   |
  
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
  
  [root@host2 ~]# openstack server list
  
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
  | ID                                   | Name       | Status | Networks       
                                                     | Image  | Flavor |
  
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
  | 24df940f-500b-44db-88e2-a6fd1fe915c0 | instance   | ACTIVE | 
private=fda2:f16f:605e:0:f816:3eff:fefd:8796, 10.0.0.7              | rhel75 | 
vgpu   |
  | 515f0d21-6ab8-406e-9889-177718c79e61 | instance-2 | ERROR  |                
                                                     | rhel75 | vgpu   |
  | 5a8691a8-a18c-4c71-8541-be00f224fd82 | instance-1 | ACTIVE | 
private=fda2:f16f:605e:0:f816:3eff:fe1f:d7a, 10.0.0.11              | rhel75 | 
vgpu   |
  | 56aeda96-f193-49fc-914d-8b507674eb16 | instance0  | ACTIVE | 
private=fda2:f16f:605e:0:f816:3eff:fef2:8e20, 10.0.0.12, 172.24.4.2 | rhel75 | 
vgpu   |
  
+--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
  
  [root@host2 ~]# openstack server create --flavor vgpu --image rhel75 
--key-name myself --max 1 instance
  
+-------------------------------------+-----------------------------------------------+
  | Field                               | Value                                 
        |
  
  +-------------------------------------+-----------------------------------------------+
  | OS-DCF:diskConfig                   | MANUAL                                        |
  | OS-EXT-AZ:availability_zone         |                                               |
  | OS-EXT-SRV-ATTR:host                | None                                          |
  | OS-EXT-SRV-ATTR:hypervisor_hostname | None                                          |
  | OS-EXT-SRV-ATTR:instance_name       |                                               |
  | OS-EXT-STS:power_state              | NOSTATE                                       |
  | OS-EXT-STS:task_state               | scheduling                                    |
  | OS-EXT-STS:vm_state                 | building                                      |
  | OS-SRV-USG:launched_at              | None                                          |
  | OS-SRV-USG:terminated_at            | None                                          |
  | accessIPv4                          |                                               |
  | accessIPv6                          |                                               |
  | addresses                           |                                               |
  | adminPass                           | 69crZEFxBT9j                                  |
  | config_drive                        |                                               |
  | created                             | 2018-07-05T09:21:43Z                          |
  | flavor                              | vgpu (vgpu1)                                  |
  | hostId                              |                                               |
  | id                                  | 4a172549-91c2-46cc-8895-cd2fcbb19430          |
  | image                               | rhel75 (e63a49a8-4568-4b57-9d12-1eb1ede28438) |
  | key_name                            | myself                                        |
  | name                                | instance                                      |
  | progress                            | 0                                             |
  | project_id                          | fdea2c781db74ae593c5e9501e9290cc              |
  | properties                          |                                               |
  | security_groups                     | name='default'                                |
  | status                              | BUILD                                         |
  | updated                             | 2018-07-05T09:21:43Z                          |
  | user_id                             | 130a646fc362418f8b62ac11f1154942              |
  | volumes_attached                    |                                               |
  +-------------------------------------+-----------------------------------------------+
  
  [root@host2 ~]# openstack server list
  +--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
  | ID                                   | Name       | Status | Networks                                                            | Image  | Flavor |
  +--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
  | 4a172549-91c2-46cc-8895-cd2fcbb19430 | instance   | BUILD  |                                                                     | rhel75 | vgpu   |
  | 24df940f-500b-44db-88e2-a6fd1fe915c0 | instance   | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fefd:8796, 10.0.0.7              | rhel75 | vgpu   |
  | 515f0d21-6ab8-406e-9889-177718c79e61 | instance-2 | ERROR  |                                                                     | rhel75 | vgpu   |
  | 5a8691a8-a18c-4c71-8541-be00f224fd82 | instance-1 | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fe1f:d7a, 10.0.0.11              | rhel75 | vgpu   |
  | 56aeda96-f193-49fc-914d-8b507674eb16 | instance0  | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fef2:8e20, 10.0.0.12, 172.24.4.2 | rhel75 | vgpu   |
  +--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
  
  [root@host2 ~]# openstack server list
  +--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
  | ID                                   | Name       | Status | Networks                                                            | Image  | Flavor |
  +--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
  | 4a172549-91c2-46cc-8895-cd2fcbb19430 | instance   | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fe7d:a6d8, 10.0.0.4              | rhel75 | vgpu   |
  | 24df940f-500b-44db-88e2-a6fd1fe915c0 | instance   | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fefd:8796, 10.0.0.7              | rhel75 | vgpu   |
  | 515f0d21-6ab8-406e-9889-177718c79e61 | instance-2 | ERROR  |                                                                     | rhel75 | vgpu   |
  | 5a8691a8-a18c-4c71-8541-be00f224fd82 | instance-1 | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fe1f:d7a, 10.0.0.11              | rhel75 | vgpu   |
  | 56aeda96-f193-49fc-914d-8b507674eb16 | instance0  | ACTIVE | private=fda2:f16f:605e:0:f816:3eff:fef2:8e20, 10.0.0.12, 172.24.4.2 | rhel75 | vgpu   |
  +--------------------------------------+------------+--------+---------------------------------------------------------------------+--------+--------+
  
  - Nova error:
  {u'message': u'Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance de2a5078-6acd-4ffd-9895-d664adb42296.', u'code': 500, u'details': u'  File "/opt/stack/nova/nova/conductor/manager.py", line 579, in build_instances\n    raise exception.MaxRetriesExceeded(reason=msg)\n', u'created': u'2018-07-05T07:32:52Z'}
  
  - Libvirt error:
  messages:Jul  5 03:32:51 host2 nova-compute: : libvirtError: Requested operation is not valid: mediated device /sys/bus/mdev/devices/25f56195-9719-4380-a90b-084d64307e06 is in use by driver QEMU, domain instance-00000019
  messages:Jul  5 03:32:51 host2 nova-compute: ERROR nova.virt.libvirt.driver [None req-e04582ed-de22-4bfa-9253-92e687328a4c service nova] [instance: de2a5078-6acd-4ffd-9895-d664adb42296] Failed to start libvirt guest: libvirtError: Requested operation is not valid: mediated device /sys/bus/mdev/devices/25f56195-9719-4380-a90b-084d64307e06 is in use by driver QEMU, domain instance-00000019
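
When triaging errors like the one above, it can help to pull out which mediated device was refused and which domain already holds it. The following is only a rough sketch: the helper name find_mdev_conflicts and both regular expressions are hypothetical, and the message wording is assumed from the log quoted in this report rather than from any stable libvirt interface. It also tolerates syslog-escaped colour sequences such as "#033[01;31m", which raw devstack logs may carry.

```python
import re

# Syslog-escaped ANSI colour sequences, e.g. "#033[01;31m" (ESC as octal 033).
ESCAPED_COLOUR = re.compile(r"#033\[[0-9;]*m")

# Wording copied from the libvirtError quoted in this report; it is matched as
# plain text and is an assumption, not a documented libvirt contract.
MDEV_IN_USE = re.compile(
    r"mediated\s+device\s+/sys/bus/mdev/devices/(?P<mdev>[0-9a-f-]{36})\s+"
    r"is\s+in\s+use\s+by\s+driver\s+(?P<driver>\w+),\s+domain\s+(?P<domain>[\w-]+)"
)


def find_mdev_conflicts(log_text):
    """Return sorted, de-duplicated (mdev_uuid, driver, domain) tuples."""
    cleaned = ESCAPED_COLOUR.sub("", log_text)
    hits = {
        (m.group("mdev"), m.group("driver"), m.group("domain"))
        for m in MDEV_IN_USE.finditer(cleaned)
    }
    return sorted(hits)


# Sample text taken from the first log line in this report (line-wrapped).
sample = """\
messages:Jul  5 03:32:51 host2 nova-compute: #033[00m: libvirtError:
Requested operation is not valid: mediated device
/sys/bus/mdev/devices/25f56195-9719-4380-a90b-084d64307e06 is in use by driver
QEMU, domain instance-00000019
"""

for mdev, driver, domain in find_mdev_conflicts(sample):
    print(f"{mdev} held by {driver} domain {domain}")
```

Whitespace is matched with \s+ throughout so that the pattern still works on line-wrapped copies of the log, such as those pasted into a bug report.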

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1780225

Title:
  Libvirt error when using --max > 1 with vGPU

Status in OpenStack Compute (nova):
  New
