I don't think this is something we can deal with in OpenStack. This
is likely a kernel/LVM issue.

** Changed in: nova
       Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1097905

Title:
  Poor VM disk performance on host using LVM Mirroring

Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  When running an OpenStack VM on a host machine that uses LVM mirroring
  for the filesystem hosting /var/lib/nova, VM disk performance can drop
  to roughly 1/10th of native hard drive speed due to a latency issue
  with LVM mirroring.

  To reproduce the problem:

  1. Install an OpenStack controller/compute node on a single machine.
  Ensure that the root filesystem, which hosts /var/lib/nova/, is backed
  by an LVM mirror. The setup I had was 4 drives, 1 master and 3 mirrors.
  2. On the host, run
       dd if=/dev/zero of=/tmp/test.dat bs=1G count=1 oflag=direct
  and confirm near-native disk write speeds. My test showed ~124MB/s.
  3. Start an OpenStack VM and run the same dd command inside it; write
  speeds should be dramatically worse. My test showed ~13MB/s.

  To solve the problem:

  1. On the host machine, run lvconvert -m0 on the logical volume backing
  the root filesystem to remove the mirror. Confirm near-native disk
  write speeds by running the dd command above.
  2. In the VM, run the dd command above. Disk write speeds should now be
  at least 50% of the host's native speed.
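  The mirror removal in step 1 can be sketched as below. This is an
  illustrative fragment, not something to paste blindly: rootvg/rootlv
  is a hypothetical VG/LV name, so substitute the volume backing your
  own root filesystem, and note that these commands change the volume
  layout (and drop the redundancy the mirror provided).

```shell
# Inspect the current mirror state (copy_percent is set for mirrored LVs).
lvs -o lv_name,lv_attr,copy_percent rootvg

# Drop all mirror legs, leaving a plain linear LV (the workaround above).
lvconvert -m0 rootvg/rootlv

# If desired later, re-add a single mirror leg.
lvconvert -m1 rootvg/rootlv
```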

  This is most likely a libvirt or LVM2 issue, but it only surfaced when
  using OpenStack and LVM2 Mirroring together.

  Other important configuration details:

  Host OS:  Ubuntu 12.04.1 LTS (GNU/Linux 3.2.0-35-generic x86_64)

   # dpkg -l "*nova*" | grep nova
  ii  nova-ajax-console-proxy  2012.1.3+stable-20120827-4d2a4afe-0ubuntu1  OpenStack Compute - AJAX console proxy - transitional package
  ii  nova-api                 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1  OpenStack Compute - API frontend
  ii  nova-cert                2012.1.3+stable-20120827-4d2a4afe-0ubuntu1  OpenStack Compute - certificate management
  ii  nova-common              2012.1.3+stable-20120827-4d2a4afe-0ubuntu1  OpenStack Compute - common files
  ii  nova-compute             2012.1.3+stable-20120827-4d2a4afe-0ubuntu1  OpenStack Compute - compute node
  ii  nova-compute-kvm         2012.1.3+stable-20120827-4d2a4afe-0ubuntu1  OpenStack Compute - compute node (KVM)
  ii  nova-consoleauth         2012.1.3+stable-20120827-4d2a4afe-0ubuntu1  OpenStack Compute - Console Authenticator
  ii  nova-doc                 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1  OpenStack Compute - documentation
  ii  nova-network             2012.1.3+stable-20120827-4d2a4afe-0ubuntu1  OpenStack Compute - Network manager
  ii  nova-scheduler           2012.1.3+stable-20120827-4d2a4afe-0ubuntu1  OpenStack Compute - virtual machine scheduler
  ii  nova-volume              2012.1.3+stable-20120827-4d2a4afe-0ubuntu1  OpenStack Compute - storage
  ii  python-nova              2012.1.3+stable-20120827-4d2a4afe-0ubuntu1  OpenStack Compute Python libraries
  ii  python-novaclient        2012.1-0ubuntu1                             client library for OpenStack Compute API

  # dpkg -l "*lvm*"
  ii  lvm2            2.02.66-4ubuntu7.1  The Linux Logical Volume Manager

  # dpkg -l "*virt*" | grep libvirt
  ii  libvirt-bin     0.9.8-2ubuntu17.4   programs for the libvirt library
  ii  libvirt0        0.9.8-2ubuntu17.4   library for interfacing with different virtualization systems
  ii  python-libvirt  0.9.8-2ubuntu17.4   libvirt Python bindings

  Here's the question I asked on the #openstack IRC channel, and nobody
  seemed to know the answer to it:

  """
  I'm having virtio disk read/write slowness issues, and I'm trying to debug if 
libvirt is setup correctly.
  We are using libvirt via OpenStack. We have two OpenStack setups that are 
showing the same poor read performance.
  We're running Ubuntu 12.04 for the hosts and the VMs, OpenStack Essex. From 
what I can tell, the VMs are using virtio. Host is using ext4, VMs are using 
ext4. Raw disk speed for the host machine is 120MB/s, but VMs top out at 
10-20MB/s.
  The command I'm using to benchmark is dd if=/dev/zero of=/tmp/test.dat bs=1G 
count=1 oflag=direct
  I have double-checked the libvirt.xml files to ensure that they have the 
appropriate entries for <driver type='qcow2' cache='none'/> and <target 
dev='vda' bus='virtio'/>
  The VM kernel log says "Booting paravirtualized kernel on KVM", and has the 
following VirtIO drivers via lspci: 00:03.0 Ethernet controller: Red Hat, Inc 
Virtio network device, 00:04.0 SCSI storage controller: Red Hat, Inc Virtio 
block device, 00:05.0 RAM memory: Red Hat, Inc Virtio memory balloon
  I have also booted just a plain 'ol cirros image via 'kvm -m 1024 -drive 
file=cirros.img,if=virtio,index=0 -boot c -net nic -net user -nographic -vnc 
:0' and gotten dismal read/write speeds  1-2MB/s
  So, this leads me to believe that libvirt may be setup incorrectly, but I 
don't know how where to start looking for issues... anyone on here have any 
pointers?
  """

  Further testing showed the host system could do 120MB/s of throughput,
  but the disk latencies were quite high (via bonnie++):

  (LVM Mirrored)
  Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
  Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
  Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
  production-1    31G  1004  88 120741  23 56172  18  3479  62 141812  20  72.7   2
  Latency             22025us   10328ms   14211ms     157ms     233ms    1104ms   <-- !!!HIGH LATENCIES!!!
  Version  1.96       ------Sequential Create------ --------Random Create--------
  production-1     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
                files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                   16  5570   5 +++++ +++ 11761   9 20993  16 +++++ +++ 18943  13
  Latency               528us    1127us     232ms     522us      59us     735us

  1.96,1.96,production-1,1,1357612517,31G,,1004,88,120741,23,56172,18,3479,62,141812,20,72.7,2,16,,,,,5570,5,+++++,+++,11761,9,20993,16,+++++,+++,18943,13,22025us,10328ms,14211ms,157ms,233ms,1104ms,528us,1127us,232ms,522us,59us,735us

  So, I moved /var/lib/nova to a ramdisk and that helped performance
  tremendously (540MB/s throughput from inside the VMs). I then mounted
  a simple disk with ext3 on /var/lib/nova and that showed good
  throughput as well (128MB/s on the host, 65MB/s on the VM). I then
  tested drive + LVM + ext3 (same good performance). That left LVM
  mirroring on the main OpenStack VM host as the only culprit. I removed
  LVM mirroring via lvconvert -m0 and disk throughput for all of the VMs
  jumped from 13MB/s up to 65-85MB/s. Note that I only had one VM
  running at a time to ensure that this wasn't disk contention between
  multiple VMs.
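  That isolation procedure can be sketched as a small loop: run the same
  direct-I/O benchmark against each candidate backing store and compare
  dd's reported throughput. The mount points below are hypothetical
  stand-ins for the ramdisk, plain-ext3, LVM+ext3, and mirrored-LVM
  filesystems tested above.

```shell
#!/bin/sh
# bench DIR: direct-I/O write benchmark into DIR, printing dd's summary
# line. The report used bs=1G count=1; 16 MB keeps this sketch cheap.
bench() {
    out="$1/bench.dat"
    dd if=/dev/zero of="$out" bs=1M count=16 oflag=direct 2>&1 | tail -n 1
    rm -f "$out"
}

# Hypothetical mount points standing in for the backends tested above;
# only those that actually exist are benchmarked.
for dir in /mnt/ramdisk /mnt/ext3-plain /mnt/ext3-lvm /var/lib/nova; do
    if [ -d "$dir" ]; then
        printf '%s: ' "$dir"
        bench "$dir"
    fi
done
```

  A large throughput gap between the mirrored filesystem and the others
  points at the mirroring layer rather than at the VM stack.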

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1097905/+subscriptions
