[frr] dst-src design/architecture intro

Tue Jan 31 06:56:15 EST 2017

Hi all,

as requested (with some delay, sorry), here's a design overview of what
dst-src (aka sourcedest) routing is supposed to do.
Disclosure:  I've been working on this in an IETF context:
https://datatracker.ietf.org/doc/draft-ietf-rtgwg-dst-src-routing/

# Problem

The problem this tries to solve is making multiple uplinks available in
an IPv6 customer network that does not have its own address space in the
BGP DFZ.  Meaning, each of the uplinks assigns some fragment of that
uplink's address space.  Now since BCP38 filtering is reasonably
commonplace (yay!), each uplink will drop packets it receives from the
customer network if their source address belongs to another ISP.

NAT is an obvious solution that everyone hates with a passion,
especially at the IETF.  Alternate solutions basically started out with,
well, we need to deliver the packets to the correct exit.  This meant
either tunneling/encapsulation or a change in routing lookups.

# Solution

So, the idea is to make it possible to route packets not only on
destination, but also source address.  So far, that's just policy
routing.  However, there is a major difference - the angle of approach
is "reversed":

Classic PBR divides routing lookups into separate tables based on source
address (or more complicated, e.g. firewall mark) lookups.  Tables can
be chained together on Linux, but left to their own they have no overlap
and act fully independently.  Essentially, there are fully independent
networks with their own routing.  Most importantly, source/firewall
markers are looked up /first/.

dst-src does this the other way around.  There is only one table with
destination prefixes as usual.  However, for a particular destination -
e.g. the IPv6 default route ::/0 - there can now be more specific
"(D,S)" [Destination,Source] routes.  The source prefix is looked up
/second/ (but with "fallthrough" semantics).  This implies a model of
one single common network with specific D,S routes for a few select
targets.
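
To make the "destination first, source second" order concrete, here is
a minimal Python sketch of the two-uplink case.  It is not FRR or kernel
code.  The prefixes, the uplink names and the pick_exit() helper are all
made up for illustration, and fallthrough to less specific destinations
is left out to keep it short.

```python
import ipaddress

# Hypothetical setup: ISP A delegated 2001:db8:a::/48, ISP B 2001:db8:b::/48.
# One table keyed by destination; each destination holds its source prefixes.
TABLE = {
    ipaddress.ip_network("::/0"): {
        ipaddress.ip_network("2001:db8:a::/48"): "uplink-A",
        ipaddress.ip_network("2001:db8:b::/48"): "uplink-B",
    },
}


def pick_exit(dst, src):
    """Return the exit for a packet, or None if no (D,S) route matches."""
    dst = ipaddress.ip_address(dst)
    src = ipaddress.ip_address(src)

    # Step 1: longest-prefix match on the destination, as usual.
    dests = [d for d in TABLE if dst in d]
    if not dests:
        return None
    best_dst = max(dests, key=lambda n: n.prefixlen)

    # Step 2: longest-prefix match on the source, within that destination.
    srcs = [s for s in TABLE[best_dst] if src in s]
    if not srcs:
        return None
    return TABLE[best_dst][max(srcs, key=lambda n: n.prefixlen)]
```

With this table, a packet sourced from 2001:db8:a::1 leaves via
"uplink-A" and one sourced from 2001:db8:b::1 via "uplink-B", so each
packet reaches the ISP whose BCP38 filter will accept it.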

On top of that, unlike with policy routing which is more or less beyond
the scope of dynamic routing protocols (the user somehow has to
configure a policy and do something like multitopology routing or
filtering between tables), dst-src is explicitly in-scope of dynamic
routing.  IS-IS gets a new sub-TLV to carry source prefix information,
and there is no policy to configure so D,S routes can be pushed through
the network like any other route.

(Compatibility and mixed deployment are a bit complicated, but solved.)

So, ultimately this was designed to solve one problem and solve it
reasonably well.  People have found a few other use cases, for example
using different assigned prefixes for different classes of service, and
doing distinct routing for these classes.  But all in all the primary
use case is "multiple uplinks with explicitly associated prefixes".

# FRR impact

The changes to FRR simply add the infrastructure to pass D,S routes
along just like D-only routes.  It's essentially an additional attribute
on the route;  but it needs to support D,S1 + D,S2 with S1 != S2 at the
same time (so it's really an extension to the "route table key").

It's implemented by hanging a second route_table off the destination
entry.  If there are no D,S routes, this table will never even be
created.  It's only present for destination prefixes that have one or
more D,S routes.
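
As a rough illustration of that layout (in Python rather than FRR's C,
and with hypothetical class and field names), the nested table can be
sketched like this.  The only point it tries to show is that the
per-destination source table is created lazily.

```python
import ipaddress


class DestEntry:
    """One destination prefix in the main table (hypothetical layout)."""

    def __init__(self):
        self.nexthop = None    # plain destination-only route, if any
        self.src_table = None  # second table; stays None without D,S routes


class RouteTable:
    def __init__(self):
        self.entries = {}      # destination prefix -> DestEntry

    def add(self, dst, nexthop, src=None):
        entry = self.entries.setdefault(ipaddress.ip_network(dst), DestEntry())
        if src is None:
            # Ordinary D-only route: the source table is not touched.
            entry.nexthop = nexthop
            return
        if entry.src_table is None:
            # First D,S route for this destination: create the table now.
            entry.src_table = {}
        # D,S1 and D,S2 coexist because the source prefix is part of the key.
        entry.src_table[ipaddress.ip_network(src)] = nexthop
```

Adding only D-only routes leaves every src_table at None, so a setup
without D,S routes pays nothing beyond one unused field per entry.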

The biggest impact is really that it makes the route_table code a bit
more complicated.

Forwarding semantics and such are all up to the kernel;  FRR doesn't
care.  Linux implemented this back in the 3.x kernels;  it had some bugs
with the route cache though, so AFAIR it needs something around 3.14 or
so to work properly.

# Caveats

- all of this is IPv6-only, both in the Linux kernel and FRR

- the core document specifies fallthrough lookup, e.g.
  packet to 2001:db8:2345::1  from 2001:db8:1234::1
  route  #1 2001:db8:2345::/48 src 2001:db8:eeee::/48
  route  #2 2001:db8::/32      src 2001:db8::/32
  needs to match route #2, i.e. after finding no source match under
  destination 2001:db8:2345::/48 the lookup needs to continue at less
  specific destination prefixes. (There is no implicit stop.)

- it's not policy routing, and it can't be abused to do "real" policy
  routing.  E.g. the draft for the IS-IS transport expects and assumes
  that the routing works as described in the core document.  Crowbarring
  it as a transport mechanism for other policy routing will make hell
  break loose.  It can replace PBR in /some/ applications, but it's
  essentially a subset.
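
The fallthrough example from the caveats above can be replayed with a
small Python sketch.  This is only a model of the lookup order under my
reading of the example, not the kernel's or FRR's implementation;
ROUTES and lookup() are names made up for this sketch.

```python
import ipaddress

# (destination prefix, source prefix, label), taken from the example above.
ROUTES = [
    ("2001:db8:2345::/48", "2001:db8:eeee::/48", "route #1"),
    ("2001:db8::/32", "2001:db8::/32", "route #2"),
]


def lookup(dst, src, routes=ROUTES):
    """Return the label of the matching (D,S) route, or None."""
    dst = ipaddress.ip_address(dst)
    src = ipaddress.ip_address(src)
    parsed = [
        (ipaddress.ip_network(d), ipaddress.ip_network(s), label)
        for d, s, label in routes
    ]

    # All destination prefixes containing dst, most specific first.
    dests = sorted(
        {d for d, _, _ in parsed if dst in d},
        key=lambda n: n.prefixlen,
        reverse=True,
    )
    for d in dests:
        matches = [(s, label) for dd, s, label in parsed if dd == d and src in s]
        if matches:
            # Longest source match under this destination wins.
            return max(matches, key=lambda m: m[0].prefixlen)[1]
        # No source match here: fall through to the next less specific
        # destination instead of stopping.
    return None
```

lookup("2001:db8:2345::1", "2001:db8:1234::1") returns "route #2":  the
/48 destination matches, but its source prefix does not, so the lookup
continues at the /32.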

Hope this addresses open questions,

-David