Hello Donald, On 25/03/2022 18:28, Donald Sharp wrote:
Why do we need to handle NLM_F_DUMP_INTR in normal operations? There is no dump commands being given except on startup
NLM_F_DUMP_INTR is not an error condition, it's a _normal_ condition that happens in real life when underlying structures in the kernel (such as list of addresses or routes) change between the kernel produces separate netlink messages that collectively form a response to a dump request. In such case userspace is advised to consume and drop any pending messages in its netlink socket and re-request dump. This situation is relatively easy to reproduce too, although it is not common. It happens to us though, from time to time, as we have a very dynamic virtual network. When we encounter it, it manifests as an incomplete view of requested configuration (not all interfaces, not all addresses etc.) ======================= About the problem that prompted my message: in a test case, I have four routes, two in their respective VRFs, and two pointing to veths each that lead to the respective VRF This is what I see in the kernel: XXX.YYY.172.88 via 169.254.0.249 dev dwnsnk04050608 proto static XXX.YYY.172.89 via 169.254.0.249 dev dwnsnk04050609 proto static XXX.YYY.172.88 dev i04050608000000 proto static scope link #dvrf04050608 XXX.YYY.172.89 dev i04050609000000 proto static scope link #dvrf04050609 But FRR is missing one of the four routes: ps402a-6# sh ip route XXX.YYY.172.88 Routing entry for XXX.YYY.172.88/32 Known via "kernel", distance 0, metric 0, best Last update 00:18:01 ago * 169.254.0.249, via dwnsnk04050608 ps402a-6# sh ip route XXX.YYY.172.89 Routing entry for 0.0.0.0/0 Known via "bgp", distance 20, metric 0, best Last update 02:01:52 ago * <** our default routes are listed here **> ps402a-6# sh ip route vrf dvrf04050608 XXX.YYY.172.88 Routing entry for XXX.YYY.172.88/32 Known via "kernel", distance 0, metric 0, vrf dvrf04050608, best Last update 00:18:12 ago * directly connected, i04050608000000(vrf rtvrf04050608) ps402a-6# sh ip route vrf dvrf04050609 XXX.YYY.172.89 Routing entry for XXX.YYY.172.89/32 Known via "kernel", distance 0, metric 0, vrf dvrf04050609, best Last update 00:18:15 ago * directly connected, i04050609000000(vrf rtvrf04050609) In the debug log, I see that all NEWROUTE events were processed: 2022/03/25 17:27:29 ZEBRA: [SKNFJ-G938V] RTM_NEWROUTE ipv4 unicast proto static NS 0 2022/03/25 17:27:29 ZEBRA: [K2A4T-TS83H] RTM_NEWROUTE XXX.YYY.172.88/32 vrf dvrf04050608(1581) table_id: 800000001 metric: 0 Admin Distance: 0 2022/03/25 17:27:29 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWROUTE(24), len=60, seq=395, pid=54641 2022/03/25 17:27:29 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-listen (NS 0) type RTM_NEWROUTE(24), len=60, seq=395, pid=54641 2022/03/25 17:27:29 ZEBRA: [SKNFJ-G938V] RTM_NEWROUTE ipv4 unicast proto static NS 0 2022/03/25 17:27:29 ZEBRA: [K2A4T-TS83H] RTM_NEWROUTE XXX.YYY.172.88/32 vrf default(0) table_id: 254 metric: 0 Admin Distance: 0 2022/03/25 17:27:29 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWROUTE(24), len=52, seq=399, pid=54641 2022/03/25 17:27:29 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-listen (NS 0) type RTM_NEWROUTE(24), len=52, seq=399, pid=54641 2022/03/25 17:27:29 ZEBRA: [SKNFJ-G938V] RTM_NEWROUTE ipv4 unicast proto static NS 0 2022/03/25 17:27:29 ZEBRA: [K2A4T-TS83H] RTM_NEWROUTE XXX.YYY.172.89/32 vrf dvrf04050609(1584) table_id: 800000002 metric: 0 Admin Distance: 0 2022/03/25 17:27:29 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWROUTE(24), len=60, seq=401, pid=54641 2022/03/25 17:27:29 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-listen (NS 0) type RTM_NEWROUTE(24), len=60, seq=401, pid=54641 2022/03/25 17:27:29 ZEBRA: [SKNFJ-G938V] RTM_NEWROUTE ipv4 unicast proto static NS 0 2022/03/25 17:27:29 ZEBRA: [K2A4T-TS83H] RTM_NEWROUTE XXX.YYY.172.89/32 vrf default(0) table_id: 254 metric: 0 Admin Distance: 0 2022/03/25 17:27:29 ZEBRA: [YXCJP-0WZWV] netlink_nexthop_msg_encode: ID (4107): 169.254.0.249, via dwnsnk04050608(1582) vrf default(0) 2022/03/25 17:27:29 ZEBRA: [R43C6-KYHWT] netlink_nexthop_msg_encode: RTM_NEWNEXTHOP, id=4107 2022/03/25 17:27:29 ZEBRA: [HYEHE-CQZ9G] nl_batch_send: netlink-dp (NS 0), batch size=48, msg cnt=1 2022/03/25 17:27:29 ZEBRA: [YXCJP-0WZWV] netlink_nexthop_msg_encode: ID (4108): 169.254.0.249, via dwnsnk04050609(1585) vrf default(0) 2022/03/25 17:27:29 ZEBRA: [R43C6-KYHWT] netlink_nexthop_msg_encode: RTM_NEWNEXTHOP, id=4108 2022/03/25 17:27:29 ZEBRA: [HYEHE-CQZ9G] nl_batch_send: netlink-dp (NS 0), batch size=48, msg cnt=1 2022/03/25 17:27:29 ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-dp-in (NS 0) type RTM_NEWLINK(16), len=1508, seq=0, pid=0 Regards, Eugene