** Also affects: nova/icehouse
Importance: Undecided
Status: New
** Changed in: nova/icehouse
Milestone: None => 2014.1.2
** Changed in: nova/icehouse
Status: New => Fix Committed
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1326901
Title:
ServiceBinaryExists - binary for nova-conductor already exists
Status in OpenStack Compute (Nova):
Fix Released
Status in OpenStack Compute (nova) icehouse series:
Fix Committed
Status in “nova” package in Ubuntu:
Fix Released
Status in “nova” source package in Trusty:
Triaged
Status in “nova” source package in Utopic:
Fix Released
Bug description:
We're hitting an intermittent issue where ServiceBinaryExists is
raised for nova-conductor on deployment.
From nova-conductor's upstart log ( /var/log/upstart/nova-
conductor.log ):
2014-05-15 12:02:25.206 34494 INFO nova.openstack.common.periodic_task [-]
Skipping periodic task _periodic_update_dns because its interval is negative
2014-05-15 12:02:25.241 34494 INFO nova.openstack.common.service [-] Starting
8 workers
2014-05-15 12:02:25.242 34494 INFO nova.openstack.common.service [-] Started
child 34501
2014-05-15 12:02:25.244 34494 INFO nova.openstack.common.service [-] Started
child 34502
2014-05-15 12:02:25.246 34494 INFO nova.openstack.common.service [-] Started
child 34503
2014-05-15 12:02:25.246 34501 AUDIT nova.service [-] Starting conductor node
(version 2014.1)
2014-05-15 12:02:25.247 34502 AUDIT nova.service [-] Starting conductor node
(version 2014.1)
2014-05-15 12:02:25.247 34494 INFO nova.openstack.common.service [-] Started
child 34504
2014-05-15 12:02:25.249 34503 AUDIT nova.service [-] Starting conductor node
(version 2014.1)
2014-05-15 12:02:25.251 34504 AUDIT nova.service [-] Starting conductor node
(version 2014.1)
2014-05-15 12:02:25.254 34505 AUDIT nova.service [-] Starting conductor node
(version 2014.1)
2014-05-15 12:02:25.250 34494 INFO nova.openstack.common.service [-] Started
child 34505
2014-05-15 12:02:25.261 34494 INFO nova.openstack.common.service [-] Started
child 34506
2014-05-15 12:02:25.263 34494 INFO nova.openstack.common.service [-] Started
child 34507
2014-05-15 12:02:25.266 34494 INFO nova.openstack.common.service [-] Started
child 34508
2014-05-15 12:02:25.267 34507 AUDIT nova.service [-] Starting conductor node
(version 2014.1)
2014-05-15 12:02:25.268 34506 AUDIT nova.service [-] Starting conductor node
(version 2014.1)
2014-05-15 12:02:25.271 34508 AUDIT nova.service [-] Starting conductor node
(version 2014.1)
/usr/lib/python2.7/dist-packages/nova/openstack/common/db/sqlalchemy/session.py:379:
DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
match = pattern.match(integrity_error.message)
/usr/lib/python2.7/dist-packages/nova/openstack/common/db/sqlalchemy/session.py:379:
DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
match = pattern.match(integrity_error.message)
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 346, in
fire_timers
timer()
File "/usr/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 56, in
__call__
cb(*args, **kw)
File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 194,
in main
2014-05-15 12:02:25.862 34502 ERROR oslo.messaging._drivers.impl_rabbit [-]
AMQP server on localhost:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying
again in 1 seconds.
result = function(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/openstack/common/service.py",
line 480, in run_service
service.start()
File "/usr/lib/python2.7/dist-packages/nova/service.py", line 172, in start
self.service_ref = self._create_service_ref(ctxt)
File "/usr/lib/python2.7/dist-packages/nova/service.py", line 224, in
_create_service_ref
service = self.conductor_api.service_create(context, svc_values)
File "/usr/lib/python2.7/dist-packages/nova/conductor/api.py", line 202, in
service_create
return self._manager.service_create(context, values)
File "/usr/lib/python2.7/dist-packages/nova/utils.py", line 966, in wrapper
return func(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line
461, in service_create
svc = self.db.service_create(context, values)
File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 139, in
service_create
return IMPL.service_create(context, values)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line
146, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line
521, in service_create
binary=values.get('binary'))
ServiceBinaryExists: Service with host glover binary nova-conductor exists.
2014-05-15 12:02:25.864 34503 ERROR nova.openstack.common.threadgroup [-]
Service with host glover binary nova-conductor exists.
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup
Traceback (most recent call last):
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File
"/usr/lib/python2.7/dist-packages/nova/openstack/common/threadgroup.py", line
117, in wait
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup
x.wait()
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File
"/usr/lib/python2.7/dist-packages/nova/openstack/common/threadgroup.py", line
49, in wait
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup
return self.thread.wait()
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File
"/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 168, in wait
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup
return self._exit_event.wait()
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File
"/usr/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup
return hubs.get_hub().switch()
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File
"/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup
return self.greenlet.switch()
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File
"/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 194, in main
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup
result = function(*args, **kwargs)
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File
"/usr/lib/python2.7/dist-packages/nova/openstack/common/service.py", line 480,
in run_service
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup
service.start()
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File
"/usr/lib/python2.7/dist-packages/nova/service.py", line 172, in start
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup
self.service_ref = self._create_service_ref(ctxt)
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File
"/usr/lib/python2.7/dist-packages/nova/service.py", line 224, in
_create_service_ref
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup
service = self.conductor_api.service_create(context, svc_values)
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File
"/usr/lib/python2.7/dist-packages/nova/conductor/api.py", line 202, in
service_create
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup
return self._manager.service_create(context, values)
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File
"/usr/lib/python2.7/dist-packages/nova/utils.py", line 966, in wrapper
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup
return func(*args, **kwargs)
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File
"/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 461, in
service_create
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup svc
= self.db.service_create(context, values)
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File
"/usr/lib/python2.7/dist-packages/nova/db/api.py", line 139, in service_create
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup
return IMPL.service_create(context, values)
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File
"/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 146, in
wrapper
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup
return f(*args, **kwargs)
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File
"/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 521, in
service_create
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup
binary=values.get('binary'))
2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup
ServiceBinaryExists: Service with host glover binary nova-conductor exists.
After looking into the traceback, I came across this piece of code
which starts a service. Of note is the 2nd except path. This
exception was introduced in January to handle races when creating a
record. It seems as if we are racing similarly in our situation and
might need to handle ServiceBinaryExists the same way
ServiceTopicExists is being handled. I'll submit a patch for this.
From nova/service.py:
158 def start(self):
159 verstr = version.version_string_with_package()
160 LOG.audit(_('Starting %(topic)s node (version %(version)s)'),
161 {'topic': self.topic, 'version': verstr})
162 self.basic_config_check()
163 self.manager.init_host()
164 self.model_disconnected = False
165 ctxt = context.get_admin_context()
166 try:
167 self.service_ref =
self.conductor_api.service_get_by_args(ctxt,
168 self.host, self.binary)
169 self.service_id = self.service_ref['id']
170 except exception.NotFound:
171 try:
172 self.service_ref = self._create_service_ref(ctxt)
173 except exception.ServiceTopicExists:
174 # NOTE(danms): If we race to create a record with a
sibling
175 # worker, don't fail here.
176 self.service_ref =
self.conductor_api.service_get_by_args(ctxt,
177 self.host, self.binary)
$ git blame -L 158,178 nova/service.py
b3f5aba0 (Andy Smith 2010-12-09 15:25:14 -0800 158) def
start(self):
481d6ff1 (Daniel P. Berrange 2012-12-17 12:17:59 +0000 159) verstr =
version.version_string_with_package()
481d6ff1 (Daniel P. Berrange 2012-12-17 12:17:59 +0000 160)
LOG.audit(_('Starting %(topic)s node (version %(versi
481d6ff1 (Daniel P. Berrange 2012-12-17 12:17:59 +0000 161)
{'topic': self.topic, 'version': verstr})
e8386a27 (Rafi Khardalian 2013-01-23 01:55:09 +0000 162)
self.basic_config_check()
065257fb (Vishvananda Ishaya 2010-09-23 12:43:41 -0700 163)
self.manager.init_host()
9c98cfb4 (Vishvananda Ishaya 2010-08-30 00:55:19 -0700 164)
self.model_disconnected = False
5e3da586 (Vishvananda Ishaya 2010-10-01 05:57:17 -0700 165) ctxt =
context.get_admin_context()
57a103b3 (Vishvananda Ishaya 2010-09-02 14:13:22 -0700 166) try:
e9d34263 (Russell Bryant 2013-01-14 17:39:29 -0500 167)
self.service_ref = self.conductor_api.service_get
e9d34263 (Russell Bryant 2013-01-14 17:39:29 -0500 168)
self.host, self.binary)
e34bc343 (Wenhao Xu 2013-01-21 19:07:34 +0800 169)
self.service_id = self.service_ref['id']
57a103b3 (Vishvananda Ishaya 2010-09-02 14:13:22 -0700 170) except
exception.NotFound:
f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 171) try:
f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 172)
self.service_ref = self._create_service_ref(c
f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 173)
except exception.ServiceTopicExists:
f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 174) #
NOTE(danms): If we race to create a record
f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 175) #
worker, don't fail here.
f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 176)
self.service_ref = self.conductor_api.service
f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 177)
self.host, self.binary)
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1326901/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp