Public bug reported:
For one of our compute machines I'm seeing two network agents that
appear unhealthy:
```
$ os network agent list | fgrep "register deleted"
| compute1 | OVN Controller agent | ("Chassis" register deleted) | | XXX | UP | ovn-controller |
| c085d57a-3a2b-4f97-8250-23d3f914b078 | OVN Metadata agent | ("Chassis" register deleted) | | XXX | UP | neutron-ovn-metadata-agent |
```
The ("Chassis" register deleted) message appears to come from the fix
for this: https://bugs.launchpad.net/neutron/+bug/1951149
Searching for that external ID, I can find this Chassis_Private record, and its
chassis column does indeed seem to be empty:
```
$ sudo ovn-sbctl find chassis-private | grep -A 5 'e621e0fb-83d3-4a18-82b3-c842996548ed'
_uuid : e621e0fb-83d3-4a18-82b3-c842996548ed
chassis : []
external_ids : {"neutron:liveness_check_at"="2022-06-17T08:43:33.393639+00:00",
"neutron:metadata_liveness_check_at"="2022-06-17T02:27:21.309718+00:00",
"neutron:ovn-metadata-id"="c085d57a-3a2b-4f97-8250-23d3f914b078",
"neutron:ovn-metadata-sb-cfg"="150397"}
name : compute1
nb_cfg : 150397
nb_cfg_timestamp : 1657729945956
```
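As a side check, the `nb_cfg_timestamp` values can be decoded to see how old each record is. This is only a sketch: it assumes the column holds milliseconds since the Unix epoch, and the helper name `ms_to_utc` is made up for illustration. The second value is taken from the other Chassis_Private row in this report.

```python
from datetime import datetime, timezone


def ms_to_utc(ms: int) -> str:
    """Convert a millisecond epoch timestamp to an ISO 8601 UTC string.

    Assumption: nb_cfg_timestamp is expressed in milliseconds since the epoch.
    """
    # Integer division drops the millisecond part and avoids float rounding.
    return datetime.fromtimestamp(ms // 1000, tz=timezone.utc).isoformat()


# Row with "chassis : []" (name: compute1)
print(ms_to_utc(1657729945956))  # 2022-07-13T16:32:25+00:00

# Row that references a chassis (name: compute1.stack)
print(ms_to_utc(1679042105359))  # 2023-03-17T08:35:05+00:00
```

If the assumption about the unit holds, the row with the empty chassis was last updated roughly eight months before the other one, which would fit it being a leftover record.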
But there's also:
```
$ sudo ovn-sbctl find chassis hostname=compute1.stack
_uuid : 164cb56b-1a3c-4401-bc52-6fa5e58d8f2a
encaps : [c442312a-9dfa-4ffe-9db7-afe5f9055962]
external_ids : {datapath-type=system,
iface-types="bareudp,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan",
is-interconn="false", "neutron:ovn-metadata-sb-cfg"="250161",
ovn-bridge-mappings="", ovn-chassis-mac-mappings="", ovn-cms-options="",
ovn-enable-lflow-cache="true", ovn-limit-lflow-cache="",
ovn-memlimit-lflow-cache-kb="", ovn-monitor-all="false",
ovn-trim-limit-lflow-cache="", ovn-trim-timeout-ms="",
ovn-trim-wmark-perc-lflow-cache="", port-up-notif="true"}
hostname : compute1.stack
name : compute1.stack
nb_cfg : 0
other_config : {datapath-type=system,
iface-types="bareudp,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan",
is-interconn="false", ovn-bridge-mappings="", ovn-chassis-mac-mappings="",
ovn-cms-options="", ovn-enable-lflow-cache="true", ovn-limit-lflow-cache="",
ovn-memlimit-lflow-cache-kb="", ovn-monitor-all="false",
ovn-trim-limit-lflow-cache="", ovn-trim-timeout-ms="",
ovn-trim-wmark-perc-lflow-cache="", port-up-notif="true"}
transport_zones : []
vtep_logical_switches: []
$ sudo ovn-sbctl find chassis-private chassis=164cb56b-1a3c-4401-bc52-6fa5e58d8f2a
_uuid : cbec617d-19dc-481c-ba99-b4132244773c
chassis : 164cb56b-1a3c-4401-bc52-6fa5e58d8f2a
external_ids :
{"neutron:ovn-metadata-id"="3328a0c7-081b-58a9-9e91-baf5c8c259cd",
"neutron:ovn-metadata-sb-cfg"="312321"}
name : compute1.stack
nb_cfg : 312321
nb_cfg_timestamp : 1679042105359
```
This looks like the correct entry. Shouldn't Neutron pick up this entry
rather than the one with "chassis : []"?
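To make the distinction concrete, here is a minimal sketch of how the two Chassis_Private rows could be told apart from the text output of `ovn-sbctl find`. It is not Neutron's agent-listing logic. The function names (`parse_records`, `rows_without_chassis`) are hypothetical, and it assumes the usual layout of one `column : value` pair per line with a blank line between records. The sample is abbreviated from the output above.

```python
def parse_records(text):
    """Split `ovn-sbctl find`-style text into one dict per record.

    Assumes one "column : value" pair per line and a blank line between
    records (wrapped values are not handled in this sketch).
    """
    records = []
    for block in text.strip().split("\n\n"):
        record = {}
        for line in block.splitlines():
            # Column names contain no colon, so split on the first one only.
            column, sep, value = line.partition(":")
            if sep:
                record[column.strip()] = value.strip()
        if record:
            records.append(record)
    return records


def rows_without_chassis(records):
    """Return the records whose `chassis` reference is empty."""
    return [r for r in records if r.get("chassis") == "[]"]


# Abbreviated from the two Chassis_Private rows in this report.
SAMPLE = """\
_uuid               : e621e0fb-83d3-4a18-82b3-c842996548ed
chassis             : []
name                : compute1

_uuid               : cbec617d-19dc-481c-ba99-b4132244773c
chassis             : 164cb56b-1a3c-4401-bc52-6fa5e58d8f2a
name                : compute1.stack
"""

stale = rows_without_chassis(parse_records(SAMPLE))
for row in stale:
    print(row["name"], row["_uuid"])
# compute1 e621e0fb-83d3-4a18-82b3-c842996548ed
```

Only the `compute1` row is flagged; the `compute1.stack` row still points at an existing chassis.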
Software versions:
ii neutron-server 2:20.2.0-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - server
ii ovn-central 22.03.0-0ubuntu1~cloud0 amd64 OVN central components
Distributor ID: Ubuntu
Description: Ubuntu 20.04.4 LTS
Release: 20.04
Codename: focal
Please let me know if I can provide more diagnostics.
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2012104
Title:
Neutron picking incorrect ovn records
Status in neutron:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2012104/+subscriptions