routes periodically disappear from OS RT

Volodymyr Litovka doka at funlab.cc
Wed Aug 9 20:52:18 UTC 2023


Hi colleagues,

Could you please suggest where to look for a solution to the following
issue? The system is Ubuntu 22.04 with FRR 8.5.2; the router
configuration is below.

After startup everything is fine for some time, but then, for reasons I
cannot identify, the following happens (output reduced). It can be
fixed with 'clear ip ospf proc'.

The FRR RT is ok:

> # sh ip route
> [ ... ]
> O   0.0.0.0/0 [110/10] via 100.64.98.3, blue, weight 1, 01:34:12
>                        via 100.64.99.3, orange, weight 1, 01:34:12
> B>* 0.0.0.0/0 [20/0] via 212.113.51.33, wan, weight 1, 1d00h30m
> C>* 100.64.0.0/23 is directly connected, br-int, 1d00h30m
> O>* 100.64.2.0/23 [110/20] via 100.64.98.2, blue, weight 1, 01:34:13
>   *                        via 100.64.99.2, orange, weight 1, 01:34:13
> O>* 100.64.4.0/23 [110/20] via 100.64.98.3, blue, weight 1, 01:34:13
>   *                        via 100.64.99.3, orange, weight 1, 01:34:13
> O   100.64.97.0/24 [110/10] is directly connected, wg0, weight 1, 01:34:24
> C>* 100.64.97.0/24 is directly connected, wg0, 1d00h30m

But the Linux RT lacks 100.64.2.0/23 and 100.64.4.0/23:

> # ip route
> default nhid 229 via 212.113.51.33 dev wan proto bgp metric 20
> 100.64.0.0/23 dev br-int proto kernel scope link src 100.64.0.1
> 100.64.97.0/24 dev wg0 proto kernel scope link src 100.64.97.1

Issuing 'clear ip ospf proc' solves the issue: these routes appear in
the Linux RT again, until the next time.
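
To catch the moment when the two tables diverge, the comparison above
can be scripted. This is only a minimal sketch: it assumes the IPv4
output formats quoted in this message, the function names are
hypothetical, and it treats only routes flagged ">*" in FRR as the ones
expected in the kernel.

```python
import re

# FRR "show ip route" lines whose route is selected and FIB-installed (">*").
FRR_FIB_RE = re.compile(r"^\S>\*\s+(\d+\.\d+\.\d+\.\d+/\d+)\s")
# Leading prefix of an "ip route" line; "default" stands for 0.0.0.0/0.
KERNEL_RE = re.compile(r"^(default|\d+\.\d+\.\d+\.\d+(?:/\d+)?)\s")


def frr_fib_prefixes(show_ip_route: str) -> set:
    """Prefixes that zebra marks as installed in the kernel."""
    prefixes = set()
    for line in show_ip_route.splitlines():
        m = FRR_FIB_RE.match(line.strip())
        if m:
            prefixes.add(m.group(1))
    return prefixes


def kernel_prefixes(ip_route: str) -> set:
    """Prefixes actually present in the "ip route" output."""
    prefixes = set()
    for line in ip_route.splitlines():
        m = KERNEL_RE.match(line.strip())
        if not m:
            continue
        prefix = m.group(1)
        if prefix == "default":
            prefix = "0.0.0.0/0"
        elif "/" not in prefix:
            prefix += "/32"  # host routes are printed without a length
        prefixes.add(prefix)
    return prefixes


def missing_from_kernel(show_ip_route: str, ip_route: str) -> list:
    """Prefixes FRR believes are installed but the kernel does not have."""
    return sorted(frr_fib_prefixes(show_ip_route) - kernel_prefixes(ip_route))
```

The two inputs could be captured with `vtysh -c 'show ip route'` and
`ip route`; run from cron or a loop, a non-empty result gives a
timestamp to correlate with the logs.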

The OSPF adjacencies are always fine: hello packets reach all
neighbors (tcpdump confirms this :-) ):

> vishnu.utc.mygaru# sh ip ospf neigh
> Neighbor ID     Pri State           Up Time         Dead Time Address         Interface                        RXmtL RqstL DBsmL
> 100.64.2.1        1 Full/Backup     17m52s          3.823s    100.64.98.2     blue:100.64.98.1                     0     0     0
> 100.64.4.1        1 Full/DR         22m59s          3.403s    100.64.98.3     blue:100.64.98.1                     0     0     0
> 100.64.2.1        1 Full/Backup     17m52s          3.823s    100.64.99.2     orange:100.64.99.1                   0     0     0
> 100.64.4.1        1 Full/DR         22m59s          3.403s    100.64.99.3     orange:100.64.99.1                   0     0     0

The following daemons are running, with these arguments:

> root@vishnu:~# ps ax | grep frr
>  816287 ?        S<s    0:18 /usr/lib/frr/watchfrr -d -F traditional zebra bgpd ospfd staticd
>  816303 ?        S<sl   0:05 /usr/lib/frr/zebra -d -F traditional -A 127.0.0.1 -r
>  816308 ?        S<sl   0:06 /usr/lib/frr/bgpd -d -F traditional -A 127.0.0.1
>  816315 ?        S<s    1:19 /usr/lib/frr/ospfd -d -F traditional -A 127.0.0.1
>  816318 ?        S<s    0:03 /usr/lib/frr/staticd -d -F traditional -A 127.0.0.1

The only information I see in the log is the following:

This is from a restart of the OSPF process:

> Aug  9 18:45:03 vishnu ospfd[816315]: [T08NC-EWX63][EC 134217741] Link State Acknowledgment: Unknown Neighbor 100.64.4.1
> Aug  9 18:45:03 vishnu ospfd[816315]: [T08NC-EWX63][EC 134217741] Link State Acknowledgment: Unknown Neighbor 100.64.2.1
> Aug  9 18:45:03 vishnu ospfd[816315]: [X7SPE-Y4BTR][EC 134217741] Link State Update: Unknown Neighbor 100.64.2.1 on int: orange:100.64.99.1
> Aug  9 18:45:03 vishnu ospfd[816315]: [X7SPE-Y4BTR][EC 134217741] Link State Update: Unknown Neighbor 100.64.2.1 on int: blue:100.64.98.1
> Aug  9 18:45:03 vishnu ospfd[816315]: [X7SPE-Y4BTR][EC 134217741] Link State Update: Unknown Neighbor 100.64.2.1 on int: blue:100.64.98.1
> Aug  9 18:45:03 vishnu ospfd[816315]: [X7SPE-Y4BTR][EC 134217741] Link State Update: Unknown Neighbor 100.64.2.1 on int: orange:100.64.99.1

A bit later the following messages appear, but I don't know whether
they relate to the issue or whether they coincide with it in time:

> Aug  9 18:57:41 vishnu zebra[816303]: [RG2NH-FTSDH][EC 4043309102] Kernel deleted a nexthop group with ID (300[253/254]) that we are still using for a route, sending it back down
> Aug  9 18:57:41 vishnu zebra[816303]: [RG2NH-FTSDH][EC 4043309102] Kernel deleted a nexthop group with ID (280[234/235]) that we are still using for a route, sending it back down
> Aug  9 18:57:41 vishnu zebra[816303]: [RG2NH-FTSDH][EC 4043309102] Kernel deleted a nexthop group with ID (276[]) that we are still using for a route, sending it back down
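
To check whether these messages line up with the route loss, the
timestamps and group IDs can be pulled out of syslog and compared
against the kernel's view (for example `ip nexthop show`). The sketch
below is only an illustration: the function name is hypothetical, it
expects each log entry on a single unwrapped line, and reading the
bracketed numbers as the IDs of the member nexthops is my assumption.

```python
import re

# Syslog timestamp at the start of the line, then zebra's message with
# "ID (<group>[<member>/<member>...])"; the member list may be empty.
NHG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]{8}).*"
    r"Kernel deleted a nexthop group with ID "
    r"\((?P<id>\d+)\[(?P<members>[\d/]*)\]\)"
)


def deleted_nexthop_groups(log_text: str) -> list:
    """Return (timestamp, group_id, member_ids) for each matching line."""
    events = []
    for line in log_text.splitlines():
        m = NHG_RE.search(line)
        if not m:
            continue
        # "253/254" -> [253, 254]; an empty "[]" yields an empty list.
        members = [int(x) for x in m.group("members").split("/") if x]
        events.append((m.group("ts"), int(m.group("id")), members))
    return events
```

Feeding it the system log (or the journal output for zebra) gives a
list of events whose timestamps can be compared with the time the
routes went missing from the Linux RT.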

My question: any suggestions on where to look for the solution to this
problem? What additional information should I gather that could help
solve it? Maybe I'm missing something in the OS networking/kernel
configuration? (I haven't actually tweaked the configuration, apart
from ip.forward=1.) I will appreciate any recommendations on this.

Thank you.

~~~~~~~~~~~
frr version 8.5.2
frr defaults traditional
hostname vishnu
log syslog informational
no ipv6 forwarding
service integrated-vtysh-config
!
interface blue
  description === Blue infra ===
  ip ospf dead-interval 4
  ip ospf hello-interval 1
  no ip ospf passive
exit
!
interface orange
  description === Orange infra ===
  ip ospf dead-interval 4
  ip ospf hello-interval 1
  no ip ospf passive
exit

! this is eBGP to ISP
router bgp NNNN
[ ... ]
exit

!this is eBGP to ISP
router bgp 64512 view erspan
[ ... ]
exit

!
router ospf
  ospf router-id 100.64.0.1
  passive-interface default
  network 100.64.0.0/23 area 0
  network 100.64.97.0/24 area 0
  network 100.64.98.0/24 area 0
  network 100.64.99.0/24 area 0
  network 192.x.x.x/24 area 0
  default-information originate
exit

! a few prefix lists used for BGP
ip prefix-list [ ... ]
!
end

-- 
Volodymyr Litovka
   "Vision without Execution is Hallucination." -- Thomas Edison