[dev] FRR Packaging: existing guidelines and future plans

David Lamparter david at opensourcerouting.org
Tue Jun 6 12:56:37 EDT 2017


Hi Scott!


first of all - a big Thank You for looking at this!  I hope you might
have the time to pick up FRR for Debian; even if not, your comments and
input are strongly appreciated!

On Sat, Jun 03, 2017 at 10:23:12PM +1000, Scott Leggett wrote:
> I see that on your wiki[0] and in at least one recent merge[1] that you
> are moving towards a single system service controlling multiple daemons.

Yes.  There are several reasons for this; the first one is actually
this:

[quote moved]
> On a totally different topic, I thought I saw somewhere that frr is
> planning to deprecate individual service config files in favour of the
> integrated config. Is that the case?

Yes, we're moving toward integrated-config mode.  This means that
daemons do not use individual config files anymore, rather they receive
their config from "vtysh -b".

(This is still marked "TBA" in the wiki page.)

In this mode, if the system service manager blindly restarts a daemon,
that daemon will sit and do nothing.  This obviously can be worked
around by scripting, but this just leads to the next problem;  FRR
features "online" configuration editing, which doesn't interact all that
nicely with sudden restarts.  Stops are even worse;  a stop + config
write will currently destroy user configuration.
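
For illustration only, here is a minimal Python sketch of the kind of
scripted workaround meant above: restart a daemon, then re-run "vtysh -b"
so it receives its configuration again.  The restart command is whatever
the local service manager uses and is a placeholder here; the function
name and the injectable runner are assumptions made for this example, not
part of FRR.  It does nothing about the online-editing problem.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the exit status.
Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.run(list(argv)).returncode


def restart_with_config(restart_cmd: Sequence[str],
                        run: Runner = run_command) -> bool:
    """Restart a daemon, then re-apply the integrated config.

    `restart_cmd` is a placeholder for the service manager's restart
    command; it is not something FRR defines.
    """
    if run(restart_cmd) != 0:
        return False
    # In integrated-config mode a freshly restarted daemon has no
    # configuration until "vtysh -b" pushes it.
    return run(["vtysh", "-b"]) == 0
```

Passing the runner in as a parameter keeps the sketch testable without a
real service manager or a vtysh binary.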

I should note here that integrated-config mode does not work correctly
without watchfrr running, since watchfrr is the entity that will
actually write the configuration when the user requests this from the
CLI.  The functionality is actually in vtysh, but the vtysh process
under a user's session does not have the correct credentials to update
the configuration file.

My (plan|prediction) here is that watchfrr will at some point become the
mandatory configuration authority across FRR.

(btw, we're not doing this out of a random caprice - a frequent user
complaint we have lies in the inconsistency when users change common,
shared knobs like prefix lists or route maps, and we then fail to have
that take effect everywhere.  This can lead to anything from an outage
up to a bad security incident if we fail to apply a critical
prefix-list.)


The second reason for the move towards a single system service is that
we're increasingly picking up features that are cross-dependent between
daemons.  The best example for this is Graceful Restart - this is
coordinated between zebra and ospfd/isisd/bgpd.  We don't have a
full-featured implementation yet, so I don't know what level of control
we really need.  I just see it coming...

> I recently removed this feature from the Debian quagga package and
> replaced it with individual services because I think that the single
> service controlling multiple daemons is a mistake.

This might be as much an issue of perspective -- these are not multiple
independent daemons.  It's increasingly becoming much more like a single
"daemon" that uses process separation to isolate crashes and scale a bit
better.  Compare with BIRD, where the functionality of the various
daemons we have is all integrated in one binary (which runs twice, for
IPv4 and IPv6 each).

It's already the case that all daemons depend on zebra[*], and if the
user sets up LDP-signalled MPLS there is also a huge dependency on ldpd
(though only in a functional way, not in a restart-affecting way).  BFD
will be the same.
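
As a rough illustration of the zebra dependency, the following Python
sketch derives a start order from a dependency map.  The map itself is an
assumption made for the example (it only encodes "everything depends on
zebra"); the functional dependency on ldpd is deliberately left out since
it does not affect restarts.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each daemon lists what must run first.
DEPENDS_ON = {
    "zebra": [],
    "ospfd": ["zebra"],
    "isisd": ["zebra"],
    "bgpd": ["zebra"],
    "ldpd": ["zebra"],
}


def start_order(depends_on):
    """Return daemon names with every dependency before its dependants."""
    return list(TopologicalSorter(depends_on).static_order())


order = start_order(DEPENDS_ON)
# zebra always comes first; the relative order of the others is not fixed.
print(order)
```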

A comparison would be sshd launching the sftp-server as needed, or
pulseaudio using gconf-helper, or ibus with its subprocesses.

> There are some subtle bugs in the old debian service management scripts
> (e.g. [2],[3]), and having a special new mechanism to learn to control
> the frr daemons, which is different to the way that every other service
> works, is quite annoying.

The idea is the opposite - the mechanism to control FRR daemons would be
to control watchfrr, which can be started, stopped, restarted, and
reloaded.  Unfortunately, we're not fully there yet and probably made
things worse in the meantime because right now it's an amalgamation of
non-fitting puzzle pieces...

> From the viewpoint of a distro maintainer, it is inconsistent with the
> unified distribution experience. And finally, disallowing the system
> service manager (e.g. systemd) from having proper oversight of
> individual services leads to a significant loss in functionality (e.g.
> all the features of systemd.exec[4]).

As mentioned above, we're moving away from these being in fact
individual services.  I've scrolled through the man page of systemd.exec
and see few pieces that might be beneficial on the component-daemon
level (mostly scheduling controls and MAC/RBAC-related pieces).  On the
other hand, several features of systemd.exec actively break either FRR
as a whole or the cross-daemon functionality.

> In [0] you describe the main motivation for the single system service
> being that watchfrr is going to start and stop daemons in response to
> config or vtysh commands. Have you considered using dbus to command the
> service manager to start and stop daemons instead[5]?

This is essentially an enhancement that could be made at some point,
possibly even as a plugin (which would make it a startup-time option).
However, since both BSD systems and OpenWRT are systems that we
explicitly support, it can't be the main/default code path.

NB: the current watchfrr setup used in Cumulus Linux does exactly that,
though not via dbus; it uses a shell line that calls systemctl to
restart a daemon.  Whether we can retain this mode of operation depends
on whether we can exert sufficient control through this (I'd estimate a
probability of this remaining workable at 85% for 1 year and 70% for 2
years).
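
To make the "shell line that calls systemctl" concrete, here is a small
Python sketch that builds such a line.  The unit naming (daemon name plus
".service") and the function name are assumptions for the example; the
actual Cumulus Linux setup may name its units differently.

```python
import shlex


def restart_command(daemon: str, unit_suffix: str = ".service") -> str:
    """Build a shell line asking systemd to restart one daemon's unit.

    The unit name is assumed to be the daemon name plus `unit_suffix`.
    """
    if not daemon.isalnum():
        raise ValueError(f"unexpected daemon name: {daemon!r}")
    return shlex.join(["systemctl", "restart", daemon + unit_suffix])


print(restart_command("ospfd"))
```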

Cheers,

-David


P.S.: the PAM functionality in GNU Zebra never did anything useful, was
never improved in Quagga, and still isn't useful in FRR.  I strongly
recommend disabling it :)  [yes, maybe we should hard-disable it in the
code until someone makes it useful.]


