Reviewed:  https://review.openstack.org/302792
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=e532ee3fccd0820f9ab0efc417ee787fb8c870e9
Submitter: Jenkins
Branch:    master

commit e532ee3fccd0820f9ab0efc417ee787fb8c870e9
Author: Oleg Bondarev <[email protected]>
Date:   Thu Apr 7 16:45:52 2016 +0300

    Notify resource_versions from agents only when needed

    resource_versions were included into agent state reports recently
    to support rolling upgrades
    (commit 97a272a892fcf488949eeec4959156618caccae8)

    The downside is that it brought additional processing when handling
    state reports on server side: update of local resources versions
    cache and more seriously rpc casts to all other servers to do the
    same.

    All this led to a visible performance degradation at scale with
    hundreds of agents constantly sending reports. Under load (rally
    test) agents may start "blinking" which makes cluster very unstable.

    In fact there is no need to send and update resource_versions in
    each state report. I see two cases when it should be done:
     1) agent was restarted (after it was upgraded);
     2) agent revived - which means that server was not receiving or
        being able to process state reports for some time
        (agent_down_time). During that time agent might be upgraded and
        restarted.

    So this patch makes agents include resource_versions info only on
    startup. After agent revival server itself will update
    version_manager with resource_versions taken from agent DB record -
    this is to avoid version_manager being outdated.

    Closes-Bug: #1567497
    Change-Id: I47a9869801f4e8f8af2a656749166b6fb49bcd3b

** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1567497

Title:
  resource_versions in agents state reports led to performance
  degradation

Status in neutron:
  Fix Released

Bug description:
  resource_versions were included into agent state reports recently to
  support rolling upgrades
  (commit 97a272a892fcf488949eeec4959156618caccae8)

  The downside is that it brought additional processing when handling
  state reports on server side: update of local resources versions cache
  and more seriously rpc casts to all other servers to do the same.

  All this led to a visible performance degradation at scale with
  hundreds of agents constantly sending reports. Under load (rally test)
  agents may start "blinking" which makes cluster very unstable.

  Need to optimize agents notifications about resource_versions.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1567497/+subscriptions
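
For readers who want the shape of the fix described in the commit message, here is a minimal sketch in Python. It is not the neutron code: the class names (Agent, Server, AgentRecord), the dictionary keys, the plain-dict version_manager, and the AGENT_DOWN_TIME value are all assumptions made up for illustration. It only models the two cases the commit lists: an agent sends resource_versions with its first report after startup, and the server refreshes its version cache from the stored agent record when an agent revives after a gap longer than agent_down_time.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict

AGENT_DOWN_TIME = 75.0  # seconds; illustrative value, assumed for this sketch


class Agent:
    """Hypothetical agent: sends resource_versions only in its first report."""

    def __init__(self, host: str, resource_versions: Dict[str, str]):
        self._state = {
            "host": host,
            "start_flag": True,
            "resource_versions": dict(resource_versions),
        }

    def report_state(self, send: Callable[[dict], None]) -> None:
        send(dict(self._state))
        # Reached only if send() did not raise: later reports stay small.
        self._state.pop("start_flag", None)
        self._state.pop("resource_versions", None)


@dataclass
class AgentRecord:
    """Stand-in for the agent DB record kept by the server."""

    last_heartbeat: float
    resource_versions: Dict[str, str] = field(default_factory=dict)


class Server:
    """Hypothetical server: touches the version cache only when needed."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._agents: Dict[str, AgentRecord] = {}
        self.version_manager: Dict[str, Dict[str, str]] = {}
        self.version_updates = 0  # counts the expensive update path

    def handle_report(self, report: dict) -> None:
        now = self._clock()
        host = report["host"]
        record = self._agents.get(host)
        revived = (
            record is not None
            and now - record.last_heartbeat > AGENT_DOWN_TIME
        )

        versions = report.get("resource_versions")
        if versions is None and revived:
            # Case 2: agent revived, fall back to the stored record.
            versions = record.resource_versions

        if record is None:
            record = AgentRecord(last_heartbeat=now)
            self._agents[host] = record

        if versions is not None:
            # Case 1 (startup report) or case 2 (revival).
            record.resource_versions = dict(versions)
            self.version_manager[host] = dict(versions)
            self.version_updates += 1  # real code would also notify peers

        record.last_heartbeat = now
```

In this sketch a steady stream of reports from a healthy agent reaches the update path once, at startup, instead of on every report. A report that arrives after a silence longer than AGENT_DOWN_TIME triggers one more update, using the versions already stored for that agent.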

