[FROG] kernel default route inactive when installed after FRR 7.5 starts

Andrew J. Schorr aschorr at telemetry-investments.com
Sun Nov 14 15:48:51 UTC 2021


Hi,

I have attached two gzipped log files. The first shows the buggy case,
where the system boots and FRR starts before DHCP has installed the default
route. The journal gives an idea of the timing:

[root@ti14 frr]# journalctl -b | egrep -i 'frr|dhcp'
Nov 14 10:32:55 ti14 systemd[1]: Starting FRRouting...
Nov 14 10:32:56 ti14 watchfrr[904]: watchfrr 7.5 starting: vty@0
Nov 14 10:32:56 ti14 watchfrr[904]: zebra state -> down : initial connection attempt failed
Nov 14 10:32:56 ti14 watchfrr[904]: ospfd state -> down : initial connection attempt failed
Nov 14 10:32:56 ti14 watchfrr[904]: staticd state -> down : initial connection attempt failed
Nov 14 10:32:56 ti14 watchfrr[904]: Forked background command [pid 905]: /usr/lib/frr/watchfrr.sh restart all
Nov 14 10:32:56 ti14 watchfrr.sh[913]: Cannot stop staticd: pid file not found
Nov 14 10:32:56 ti14 watchfrr.sh[915]: Cannot stop ospfd: pid file not found
Nov 14 10:32:56 ti14 watchfrr.sh[917]: Cannot stop zebra: pid file not found
Nov 14 10:32:56 ti14 watchfrr[904]: zebra state -> up : connect succeeded
Nov 14 10:32:56 ti14 watchfrr[904]: ospfd state -> up : connect succeeded
Nov 14 10:32:56 ti14 watchfrr[904]: staticd state -> up : connect succeeded
Nov 14 10:32:56 ti14 watchfrr[904]: all daemons up, doing startup-complete notify
Nov 14 10:32:56 ti14 frrinit.sh[707]: Started watchfrr
Nov 14 10:32:56 ti14 systemd[1]: Started FRRouting.
Nov 14 10:33:00 ti14 dhclient[1823]: DHCPREQUEST on lan1 to 255.255.255.255 port 67 (xid=0x7fabd46c)
Nov 14 10:33:07 ti14 dhclient[1823]: DHCPREQUEST on lan1 to 255.255.255.255 port 67 (xid=0x7fabd46c)
Nov 14 10:33:21 ti14 dhclient[1823]: DHCPDISCOVER on lan1 to 255.255.255.255 port 67 interval 8 (xid=0xe1a98663)
Nov 14 10:33:29 ti14 dhclient[1823]: DHCPDISCOVER on lan1 to 255.255.255.255 port 67 interval 10 (xid=0xe1a98663)
Nov 14 10:33:39 ti14 dhclient[1823]: DHCPDISCOVER on lan1 to 255.255.255.255 port 67 interval 19 (xid=0xe1a98663)
Nov 14 10:33:39 ti14 dhclient[1823]: DHCPREQUEST on lan1 to 255.255.255.255 port 67 (xid=0xe1a98663)
Nov 14 10:33:39 ti14 dhclient[1823]: DHCPOFFER from 10.22.200.1
Nov 14 10:33:39 ti14 dhclient[1823]: DHCPACK from 10.22.200.1 (xid=0xe1a98663)
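
To put a number on that gap, here is a minimal Python sketch that reads the
two relevant journal lines from the excerpt above and computes the time
between "Started FRRouting" and the DHCPACK. The function names are
hypothetical, and the year is an assumption taken from the date of this
message, since syslog-style timestamps omit it.

```python
from datetime import datetime

# Two lines copied verbatim from the journalctl excerpt above.
JOURNAL = """\
Nov 14 10:32:56 ti14 systemd[1]: Started FRRouting.
Nov 14 10:33:39 ti14 dhclient[1823]: DHCPACK from 10.22.200.1 (xid=0xe1a98663)
"""


def first_timestamp(journal: str, needle: str, year: int = 2021) -> datetime:
    """Return the timestamp of the first journal line containing `needle`.

    The year is assumed (default 2021) because the log lines do not carry one.
    """
    for line in journal.splitlines():
        if needle in line:
            # The first 15 characters are "Mon DD HH:MM:SS".
            return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    raise ValueError(f"no line contains {needle!r}")


def frr_to_dhcp_gap_seconds(journal: str) -> float:
    """Seconds between FRR finishing startup and the first DHCPACK."""
    started = first_timestamp(journal, "Started FRRouting")
    acked = first_timestamp(journal, "DHCPACK")
    return (acked - started).total_seconds()


print(frr_to_dhcp_gap_seconds(JOURNAL))  # prints 43.0
```

So in this boot FRR was fully up about 43 seconds before the DHCP lease
was acknowledged.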

The second log file is from an FRR restart, where the kernel default route
was already installed before FRR started. In that case, everything works
properly.
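
For telling the two states apart without reading the output by eye, the
following is a rough sketch that scans `show ip route 0.0.0.0/0` text (as in
the earlier message quoted below) for a kernel entry whose nexthop is flagged
"inactive". It is plain text matching against the FRR 7.5 output format shown
in that message, not an FRR API, and the function name is hypothetical.

```python
# Output copied from the earlier message quoted below (quote prefixes removed).
BUGGY = """\
Routing entry for 0.0.0.0/0
  Known via "ospf", distance 110, metric 103, best
  Last update 00:04:42 ago
  * 192.168.39.5, via lan0.9, weight 1

Routing entry for 0.0.0.0/0
  Known via "kernel", distance 0, metric 0
  Last update 00:05:42 ago
  * 207.237.112.1, via lan1 inactive
"""

WORKING = """\
Routing entry for 0.0.0.0/0
  Known via "ospf", distance 110, metric 103
  Last update 00:00:01 ago
    192.168.39.5, via lan0.9, weight 1

Routing entry for 0.0.0.0/0
  Known via "kernel", distance 0, metric 0, best
  Last update 00:00:08 ago
  * 207.237.112.1, via lan1
"""


def kernel_route_inactive(show_output: str) -> bool:
    """Return True if any nexthop of a "kernel" entry ends with "inactive"."""
    in_kernel_entry = False
    for line in show_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Known via"):
            # Each routing entry announces its source protocol on this line.
            in_kernel_entry = '"kernel"' in stripped
        elif in_kernel_entry and stripped.endswith("inactive"):
            return True
    return False


print(kernel_route_inactive(BUGGY))    # prints True
print(kernel_route_inactive(WORKING))  # prints False
```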

Would it be better to open a bug for this issue? Perhaps it's fixed in newer
code. I tried the frr-8.1-02.el8.x86_64.rpm from your repo, but frankly it
didn't work at all: the OSPF config was somehow not loaded. I didn't spend any
time investigating; I guess there must be some major changes in the
configuration language. It didn't seem worth much effort, given that quagga
works properly.

Regards,
Andy

On Sun, Nov 14, 2021 at 07:49:09AM -0500, Donald Sharp wrote:
> Can you add `debug zebra rib detail` to the top of your log file and recreate
> this issue?  We should have special code that always allows the kernel route
> received over netlink.  I would be interested in understanding what is going
> wrong.
> 
> donald
> 
> On Sat, Nov 13, 2021 at 6:04 PM Andrew J. Schorr <aschorr at telemetry-investments.com> wrote:
> 
>     Hi,
> 
>     I'm upgrading a bunch of Linux routers from CentOS 7 to Rocky 8, and as
>     part of the upgrade, quagga seems to have been replaced by frr.  For the
>     most part, everything works fine, but I've encountered one problem. I've
>     got a router that picks up a default route via DHCP from a cable modem.
>     With quagga, this default route was accepted and redistributed via OSPF.
>     But with FRR, it sometimes says that the route is "inactive", which horks
>     my routing.
> 
>     I built the Fedora 34 quagga package and ran that and saw these results
>     using quagga-1.2.4-17.el8.x86_64:
> 
>     Hello, this is Quagga (version 1.2.4).
>     Copyright 1996-2005 Kunihiro Ishiguro, et al.
> 
>     ti14# show ip route 0.0.0.0/0
>     Routing entry for 0.0.0.0/0
>       Known via "ospf", distance 110, metric 103, tag 0, vrf 0
>       Last update 00:00:13 ago
>       >  192.168.39.5, via lan0.9
> 
>     Routing entry for 0.0.0.0/0
>       Known via "kernel", distance 0, metric 0, tag 0, vrf 0, best, fib
>       >* 207.237.112.1, via lan1
> 
>     But with the standard frr-7.5-4.el8.x86_64.rpm, it sometimes marks
>     the kernel route as inactive when it starts, and uses the ospf route
>     instead:
> 
>     Hello, this is FRRouting (version 7.5).
>     Copyright 1996-2005 Kunihiro Ishiguro, et al.
> 
>     ti14# show ip route 0.0.0.0/0
>     Routing entry for 0.0.0.0/0
>       Known via "ospf", distance 110, metric 103, best
>       Last update 00:04:42 ago
>       * 192.168.39.5, via lan0.9, weight 1
> 
>     Routing entry for 0.0.0.0/0
>       Known via "kernel", distance 0, metric 0
>       Last update 00:05:42 ago
>       * 207.237.112.1, via lan1 inactive
> 
>     When it's working properly, typically after a restart, I see:
> 
>     Hello, this is FRRouting (version 7.5).
>     Copyright 1996-2005 Kunihiro Ishiguro, et al.
> 
>     ti14# show ip route 0.0.0.0/0
>     Routing entry for 0.0.0.0/0
>       Known via "ospf", distance 110, metric 103
>       Last update 00:00:01 ago
>         192.168.39.5, via lan0.9, weight 1
> 
>     Routing entry for 0.0.0.0/0
>       Known via "kernel", distance 0, metric 0, best
>       Last update 00:00:08 ago
>       * 207.237.112.1, via lan1
> 
>     My best guess is that there's some kind of timing issue here. When
>     the system boots up with FRR, the FRR daemons start before DHCP
>     installs the default route. That seems to lead to its being marked
>     inactive. If I then restart FRR, it accepts the kernel default route.
> 
>     Is this perhaps fixed in a newer version of FRR? Or am I doing something
>     stupid? Is there a patch for this? If not, I'm going to need to revert
>     to quagga.
> 
>     Thanks,
>     Andy
> 
>     _______________________________________________
>     frog mailing list
>     frog at lists.frrouting.org
>     https://lists.frrouting.org/listinfo/frog
> 

-- 
Andrew Schorr                      e-mail: aschorr at telemetry-investments.com
Telemetry Investments, L.L.C.      phone:  917-305-1748
152 W 36th St, #402                fax:    212-425-5550
New York, NY 10018-8765
-------------- next part --------------
A non-text attachment was scrubbed...
Name: zebra.bootup-bug.log.gz
Type: application/gzip
Size: 33925 bytes
Desc: not available
URL: <http://lists.frrouting.org/pipermail/frog/attachments/20211114/61217aa0/attachment-0002.gz>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: zebra.restart.log.gz
Type: application/gzip
Size: 7491 bytes
Desc: not available
URL: <http://lists.frrouting.org/pipermail/frog/attachments/20211114/61217aa0/attachment-0003.gz>

