Public bug reported:

ENV: stable/victoria

We have enabled DVR and have some huge virtual routers with around
60 router interfaces scheduled on around 800 compute nodes.

In a large scale cloud deployment, when restart neutron agent,
especially l3 agents, neutron server side will trigger too many slow DB
query. And this will cause the agent restart time to be too long to
operate.

Error log of l3-agent restart:

```
ymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during 
query') [SQL: 'SELECT ml2_port_binding_levels.port_id AS 
ml2_port_binding_levels_port_id, ml2_port_binding_levels.host AS 
ml2_port_binding_levels_host, ml2_port_binding_levels.level AS 
ml2_port_binding_levels_level, ml2_port_binding_levels.driver AS 
ml2_port_binding_levels_driver, ml2_port_binding_levels.segment_id AS 
ml2_port_binding_levels_segment_id, ports_1.id AS ports_1_id \nFROM (SELECT 
routers.id AS routers_id \nFROM routers LEFT OUTER JOIN (SELECT 
routerl3agentbindings.router_id AS router_id, 
count(routerl3agentbindings.router_id) AS count \nFROM routerl3agentbindings 
INNER JOIN router_extra_attributes ON routerl3agentbindings.router_id = 
router_extra_attributes.router_id INNER JOIN routers ON routers.id = 
router_extra_attributes.router_id GROUP BY routerl3agentbindings.router_id) AS 
anon_2 ON routers.id = anon_2.router_id) AS anon_1 INNER JOIN routerports AS 
routerports_1 ON anon_1.routers_id = routerports_1.router_id INNER JOIN ports 
AS ports_1 ON ports_1.id = routerports_1.port_id INNER JOIN 
ml2_port_binding_levels ON ports_1.id = ml2_port_binding_levels.port_id ORDER 
BY ports_1.id'] (Background on this error at: http://sqlalche.me/e/e3q8)
```
as well as

```console
                   
SELECT ml2_port_binding_levels.port_id AS ml2_port_binding_levels_port_id, 
ml2_port_binding_levels.host AS ml2_port_binding_levels_host, 
ml2_port_binding_levels.level AS ml2_port_binding_levels_level, 
ml2_port_binding_levels.driver AS ml2_port_binding_levels_driver, 
ml2_port_binding_levels.segment_id AS ml2_port_binding_levels_segment_id, 
ports_1.id AS ports_1_id  FROM (SELECT routers.id AS routers_id  FROM routers 
LEFT OUTER JOIN (SELECT routerl3agentbindings.router_id AS router_id, 
count(routerl3agentbindings.router_id) AS count  FROM routerl3agentbindings 
INNER JOIN router_extra_attributes ON routerl3agentbindings.router_id = 
router_extra_attributes.router_id INNER JOIN routers ON routers.id = 
router_extra_attributes.router_id GROUP BY routerl3agentbindings.router_id) AS 
anon_2 ON routers.id = anon_2.router_id) AS anon_1 INNER JOIN routerports AS 
routerports_1 ON anon_1.routers_id = routerports_1.router_id INNER JOIN ports 
AS ports_1 ON ports_1.id = routerports_1.port_id INNER JOIN 
ml2_port_binding_levels ON ports_1.id = ml2_port_binding_levels.port_id;


SELECT ml2_port_binding_levels.port_id AS ml2_port_binding_levels_port_id, 
ml2_port_binding_levels.host AS ml2_port_binding_levels_host, 
ml2_port_binding_levels.level AS ml2_port_binding_levels_level, 
ml2_port_binding_levels.driver AS ml2_port_binding_levels_driver, 
ml2_port_binding_levels.segment_id AS ml2_port_binding_levels_segment_id, 
ports_1.id AS ports_1_id
FROM (SELECT DISTINCT routerports.port_id AS routerports_port_id
FROM routerports
WHERE routerports.router_id IN ('6e4ed0f5-e1b0-4cf1-931d-b30c93433719') AND 
routerports.port_type IN ('network:router_interface', 
'network:ha_router_replicated_interface', 
'network:router_interface_distributed')) AS anon_1 INNER JOIN ports AS ports_1 
ON ports_1.id = anon_1.routerports_port_id INNER JOIN ml2_port_binding_levels 
ON ports_1.id = ml2_port_binding_levels.port_id ORDER BY ports_1.id;


```

from show processlist. and we saw excessive amounts of slow queries for
ml2_port_binding_levels which is weird because it looks like not
necessary.

** Affects: neutron
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1976355

Title:
  remove eager subquery load for PortBindingLevel

Status in neutron:
  New

Bug description:
  ENV: stable/victoria

  We have enabled DVR and have some huge virtual routers with around
  60 router interfaces scheduled on around 800 compute nodes.

  In a large scale cloud deployment, when restart neutron agent,
  especially l3 agents, neutron server side will trigger too many slow
  DB query. And this will cause the agent restart time to be too long to
  operate.

  Error log of l3-agent restart:

  ```
  ymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during 
query') [SQL: 'SELECT ml2_port_binding_levels.port_id AS 
ml2_port_binding_levels_port_id, ml2_port_binding_levels.host AS 
ml2_port_binding_levels_host, ml2_port_binding_levels.level AS 
ml2_port_binding_levels_level, ml2_port_binding_levels.driver AS 
ml2_port_binding_levels_driver, ml2_port_binding_levels.segment_id AS 
ml2_port_binding_levels_segment_id, ports_1.id AS ports_1_id \nFROM (SELECT 
routers.id AS routers_id \nFROM routers LEFT OUTER JOIN (SELECT 
routerl3agentbindings.router_id AS router_id, 
count(routerl3agentbindings.router_id) AS count \nFROM routerl3agentbindings 
INNER JOIN router_extra_attributes ON routerl3agentbindings.router_id = 
router_extra_attributes.router_id INNER JOIN routers ON routers.id = 
router_extra_attributes.router_id GROUP BY routerl3agentbindings.router_id) AS 
anon_2 ON routers.id = anon_2.router_id) AS anon_1 INNER JOIN routerports AS 
routerports_1 ON anon_1.routers_id = routerports_1.router_id INNER JOIN ports 
AS ports_1 ON ports_1.id = routerports_1.port_id INNER JOIN 
ml2_port_binding_levels ON ports_1.id = ml2_port_binding_levels.port_id ORDER 
BY ports_1.id'] (Background on this error at: http://sqlalche.me/e/e3q8)
  ```
  as well as

  ```console
                     
  SELECT ml2_port_binding_levels.port_id AS ml2_port_binding_levels_port_id, 
ml2_port_binding_levels.host AS ml2_port_binding_levels_host, 
ml2_port_binding_levels.level AS ml2_port_binding_levels_level, 
ml2_port_binding_levels.driver AS ml2_port_binding_levels_driver, 
ml2_port_binding_levels.segment_id AS ml2_port_binding_levels_segment_id, 
ports_1.id AS ports_1_id  FROM (SELECT routers.id AS routers_id  FROM routers 
LEFT OUTER JOIN (SELECT routerl3agentbindings.router_id AS router_id, 
count(routerl3agentbindings.router_id) AS count  FROM routerl3agentbindings 
INNER JOIN router_extra_attributes ON routerl3agentbindings.router_id = 
router_extra_attributes.router_id INNER JOIN routers ON routers.id = 
router_extra_attributes.router_id GROUP BY routerl3agentbindings.router_id) AS 
anon_2 ON routers.id = anon_2.router_id) AS anon_1 INNER JOIN routerports AS 
routerports_1 ON anon_1.routers_id = routerports_1.router_id INNER JOIN ports 
AS ports_1 ON ports_1.id = routerports_1.port_id INNER JOIN 
ml2_port_binding_levels ON ports_1.id = ml2_port_binding_levels.port_id;

  
  SELECT ml2_port_binding_levels.port_id AS ml2_port_binding_levels_port_id, 
ml2_port_binding_levels.host AS ml2_port_binding_levels_host, 
ml2_port_binding_levels.level AS ml2_port_binding_levels_level, 
ml2_port_binding_levels.driver AS ml2_port_binding_levels_driver, 
ml2_port_binding_levels.segment_id AS ml2_port_binding_levels_segment_id, 
ports_1.id AS ports_1_id
  FROM (SELECT DISTINCT routerports.port_id AS routerports_port_id
  FROM routerports
  WHERE routerports.router_id IN ('6e4ed0f5-e1b0-4cf1-931d-b30c93433719') AND 
routerports.port_type IN ('network:router_interface', 
'network:ha_router_replicated_interface', 
'network:router_interface_distributed')) AS anon_1 INNER JOIN ports AS ports_1 
ON ports_1.id = anon_1.routerports_port_id INNER JOIN ml2_port_binding_levels 
ON ports_1.id = ml2_port_binding_levels.port_id ORDER BY ports_1.id;

  
  ```

  from show processlist. and we saw excessive amounts of slow queries
  for ml2_port_binding_levels which is weird because it looks like not
  necessary.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1976355/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp

Reply via email to