[FROG] Multipod evpn deployment over eBGP MPLS.

Hendrik Meyburgh hendrikdm at gmail.com
Wed Aug 1 08:14:05 EDT 2018


Hi.

Could someone please help me? I am a bit stuck at the moment and not
making progress.

The diagram below shows the lab I am testing. PE1, P1, P2 and PE2 are Junos
devices. The leaves and spines run Cumulus on Mellanox switches.

[Topology diagram. The ASCII art was broken by line wrapping in the
archive; the recoverable content is summarised here.]

Core:  PE1 <--> P1 <--> P2 <--> PE2

PODA:  PE1 (z.z.z.z; y.y.y.y on the leaf-facing side)
       Spine1, Spine2, each connected to Leaf1 and Leaf2
       Leaf1 (b.b.b.b) and Leaf2 (c.c.c.c), each also connected to PE1

PODB:  PE2 (g.g.g.g; a.a.a.a on the link to Leaf1)
       Spine1, connected to Leaf1
       Leaf1 (x.x.x.x), connected to PE2

Testing Device:  Port1 --> PODA Leaf1
                 Port2 --> PODA Leaf2
                 Port3 --> PODB Leaf1





Internally, both pods use an L3 architecture with eBGP over unnumbered
interfaces, advertising connected routes. In PODA, PE1 is connected to both
leaves over separate links, with labeled-unicast enabled and implicit-null
in use. The testing device is a Juniper SRX; each of its interfaces is set
up in its own virtual router, but all of them are part of the same subnet.
On the switch side, the port facing the testing device is in a bridge, with
a VNI configured and the loopback address as the local tunnel endpoint. I
am also using an SVI on the same subnet. Each port can reach its local SVI,
and testing between Port1 and Port2 is successful via the uplinks to PE1,
so label switching seems to be working correctly. There is reachability
between the loopbacks of all the leaves as well. There seems to be a
problem with the route-map since I switched to labeled-unicast routes after
testing with OSPF and LDP. I still need to confirm that before I can test
reachability over the leaf/spine network, which was working previously.

I am running into an issue when testing between Port1/Port2 and Port3. All
routes seem to be present, but with the eBGP architecture the standard
behaviour is to change the next hop on routes advertised to external peers,
and this changes the remote VTEP as well. The MAC/IP from Port3 is
advertised by PODB Leaf1 with x.x.x.x as the VTEP; however, when displayed
on PODA Leaf1 it is reported as y.y.y.y, where y.y.y.y is the address of
PE1 on the link to the leaf.
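
To make the symptom concrete, here is a minimal Python sketch (a toy
model, not FRR or Junos code) of what happens to the BGP next hop when an
eBGP speaker re-advertises a route. The Route class, the advertise()
function and the preserve_next_hop flag are hypothetical names used only
for illustration, and the addresses are the placeholders from this post.
It assumes that the receiving leaf takes the BGP next hop of an EVPN route
as the remote VTEP, which is what the output below suggests.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Route:
    prefix: str    # EVPN route key, e.g. a type-2 MAC route
    next_hop: str  # BGP next hop; assumed to be used as the remote VTEP


def advertise(route: Route, speaker_addr: str, preserve_next_hop: bool) -> Route:
    """Model one eBGP speaker re-advertising a route to an external peer."""
    if preserve_next_hop:
        # Next hop is passed through untouched.
        return route
    # Default eBGP behaviour: the speaker sets itself as the next hop.
    return replace(route, next_hop=speaker_addr)


# Route as originated by PODB Leaf1 (VTEP x.x.x.x).
origin = Route("[2]:[0]:[0]:[48]:[54:4b:8c:51:1c:ad]", next_hop="x.x.x.x")

# PE1 (y.y.y.y) rewriting the next hop, as seen on PODA Leaf1.
rewritten = advertise(origin, "y.y.y.y", preserve_next_hop=False)

# The same hop with the next hop preserved: the originating VTEP survives.
preserved = advertise(origin, "y.y.y.y", preserve_next_hop=True)

print(rewritten.next_hop)
print(preserved.next_hop)
```

In this model the overlay only works end to end if every eBGP hop between
the two leaves leaves the EVPN next hop unchanged.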

root@poda-leaf1:~# net show evpn mac vni 1001
Number of MACs (local and remote) known for this VNI: 3
MAC               Type   Intf/Remote VTEP      VLAN
54:4b:8c:51:1c:a9 local  swp13                 1001
54:4b:8c:51:1c:ad remote y.y.y.y

root@poda-leaf1:~# net show bgp evpn route vni 1001 mac 54:4b:8c:51:1c:ad
BGP routing table entry for [2]:[0]:[0]:[48]:[54:4b:8c:51:1c:ad]
Paths: (1 available, best #1)
  Not advertised to any peer
  Route [2]:[0]:[0]:[48]:[54:4b:8c:51:1c:ad] VNI 993
  Imported from x.x.x.x:2:[2]:[0]:[0]:[48]:[54:4b:8c:51:1c:ad]
  11111 65202
    y.y.y.y from y.y.y.y (z.z.z.z)
      Origin IGP, metric 200, localpref 100, valid, external, bestpath-from-AS 11111, best
      Extended Community: RT:65202:1001 ET:8
      AddPath ID: RX 0, TX 56
      Last update: Wed Aug  1 11:54:59 2018

I also do not understand why the route above shows VNI 993; it does,
however, seem to import correctly into VNI 1001.
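
One observation that may help narrow this down (it is not a diagnosis):
993 and 1001 differ in a single bit. The Python sketch below only does that
arithmetic. The idea that something on the path altered the 24-bit label
field in which the EVPN route carries the VNI is an assumption, and it
would need to be checked against a capture of the BGP update itself.

```python
configured_vni = 1001  # vxlan-id configured on both leaves
displayed_vni = 993    # value shown in the route output on poda-leaf1

# XOR leaves a 1 only in the bit positions where the two values differ.
diff = configured_vni ^ displayed_vni
differing_bits = bin(diff).count("1")

print(f"{configured_vni:024b}")
print(f"{displayed_vni:024b}")
print(hex(diff), differing_bits)
```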

root@poda-leaf1:~# net show bgp evpn route vni 1001 vtep y.y.y.y
BGP table version is 33, local router ID is b.b.b.b
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[ESI]:[EthTag]:[IPlen]:[IP]

   Network          Next Hop            Metric LocPrf Weight Path
*> [2]:[0]:[0]:[48]:[54:4b:8c:51:1c:ad]
                    y.y.y.y           200             0 11111 65202 i
*> [2]:[0]:[0]:[48]:[54:4b:8c:51:1c:ad]:[32]:[10.2.0.200]
                    y.y.y.y           200             0 11111 65202 i
*> [3]:[0]:[32]:[x.x.x.x]
                    y.y.y.y           200             0 11111 65202 i
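
The type-3 entry above already shows the mismatch: the originating IP in
the prefix is x.x.x.x, but the next hop is y.y.y.y. As a rough aid, the
Python sketch below scans output in this layout and reports type-3 routes
where the two differ. The function name type3_mismatches is hypothetical,
the parsing is tied to the two-line layout shown above, and it assumes
that the originating IP of a type-3 route is the VTEP address of the
advertising leaf.

```python
import re

# Two lines copied from the table above.
SAMPLE = """\
*> [3]:[0]:[32]:[x.x.x.x]
                    y.y.y.y           200             0 11111 65202 i
"""

# [3]:[EthTag]:[IPlen]:[OrigIP]
TYPE3 = re.compile(r"\[3\]:\[\d+\]:\[\d+\]:\[(?P<orig>[^\]]+)\]")


def type3_mismatches(text):
    """Return (originator, next_hop) pairs where the two differ."""
    mismatches = []
    originator = None
    for line in text.splitlines():
        match = TYPE3.search(line)
        if match:
            # Remember the originator; the next hop is on the following line.
            originator = match.group("orig")
            continue
        fields = line.split()
        if originator and fields:
            next_hop = fields[0]
            if next_hop != originator:
                mismatches.append((originator, next_hop))
            originator = None
    return mismatches


result = type3_mismatches(SAMPLE)
print(result)
```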


Checking whether anything is present for the loopback of podb-leaf1:
root@poda-leaf1:~# net show bgp evpn route vni 1001 vtep x.x.x.x
BGP table version is 33, local router ID is b.b.b.b
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[ESI]:[EthTag]:[IPlen]:[IP]

   Network          Next Hop            Metric LocPrf Weight Path

Displayed 9 prefixes (0 paths)


Below is the same command on the originating leaf.

root@podB-leaf-01:~# net show bgp evpn route vni 1001 mac 54:4b:8c:51:1c:ad
BGP routing table entry for [2]:[0]:[0]:[48]:[54:4b:8c:51:1c:ad]
Paths: (1 available, best #1)
  Not advertised to any peer
  Route [2]:[0]:[0]:[48]:[54:4b:8c:51:1c:ad] VNI 1001
  Local
    x.x.x.x from 0.0.0.0 (x.x.x.x)
      Origin IGP, localpref 100, weight 32768, valid, sourced, local, bestpath-from-AS Local, best
      Extended Community: ET:8 RT:65202:1001
      AddPath ID: RX 0, TX 63
      Last update: Wed Aug  1 11:51:44 2018


Spanning the uplink port to another port on the same switch allowed me to
look at the data plane, and it confirms that the traffic is being sent to
the wrong destination.

tcpdump on the uplink interface:
12:21:53.724226 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 134)
    b.b.b.b.20496 > y.y.y.y.4789: [no cksum] VXLAN, flags [I] (0x08), vni 1001
IP (tos 0x0, ttl 64, id 45715, offset 0, flags [none], proto ICMP (1), length 84)
    10.2.0.1 > 10.2.0.200: ICMP echo request, id 20365, seq 231, length 64
0x0000:  4500 0086 0000 4000 4011 a2ec 29c1 77ea  E.....@.@...).w.
0x0010:  d1cb 2404 5010 12b5 0072 0000 0800 0000  ..$.P....r......
0x0020:  0003 e900 544b 8c51 1cad 544b 8c51 1ca9  ....TK.Q..TK.Q..
0x0030:  0800 4500 0054 b293 0000 4001 b349 0a02  ..E..T....@..I..
0x0040:  0001 0a02 00c8 0800 ae01 4f8d 00e7 5b61  ..........O...[a
0x0050:  f7a0 0009 bb7b 0809 0a0b 0c0d 0e0f 1011  .....{..........
0x0060:  1213 1415 1617 1819 1a1b 1c1d 1e1f 2021  ...............!
0x0070:  2223 2425 2627 2829 2a2b 2c2d 2e2f 3031  "#$%&'()*+,-./01
0x0080:  3233 3435 3637                           234567

The capture shows a plain VXLAN packet rather than an MPLS-labelled one,
because there is no static label entry for that destination; I am relying
on labeled-unicast distribution.
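
For anyone who wants to check the capture by hand, the Python sketch below
decodes the outer UDP header and the VXLAN header from the hex dump above.
Only the 16 bytes starting at offset 0x14 are used (the outer IP header is
skipped), and the variable names are mine. The field layout assumed is the
standard one: four 16-bit UDP fields, then a VXLAN header with a flags
byte, 24 reserved bits, a 24-bit VNI and a final reserved byte.

```python
import struct

# Bytes 0x14-0x1b of the dump: outer UDP header.
udp_header = bytes.fromhex("501012b500720000")
# Bytes 0x1c-0x23 of the dump: VXLAN header.
vxlan_header = bytes.fromhex("08000000" "0003e900")

src_port, dst_port, udp_length, udp_checksum = struct.unpack("!HHHH", udp_header)

flags_word, vni_word = struct.unpack("!II", vxlan_header)
flags = flags_word >> 24          # top byte carries the flags
vni_valid = bool(flags & 0x08)    # the I flag marks the VNI as valid
vni = vni_word >> 8               # top 24 bits carry the VNI

print(src_port, dst_port, udp_length, udp_checksum)
print(hex(flags), vni_valid, vni)
```

The decoded ports and VNI match what tcpdump printed, so the inner frame
is in the right VNI; only the outer destination is wrong.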

I would appreciate any assistance, as I have spent many hours on this
deployment and keep failing to get the two pods to talk to each other.
Could you also confirm whether this type of architecture is supposed to
work? Below are the config files.

poda-leaf1 interface file:

auto lo
iface lo inet loopback
    address b.b.b.b/32
auto swp3
iface swp3
    address y.y.y.z/31
    mpls-enable yes
    mtu 9178

auto bridge
iface bridge
    bridge-ports swp13 vni1001
    bridge-pvid 1
    bridge-vids 1001
    bridge-vlan-aware yes

auto vlan1001
iface vlan1001
    #hwaddress 44:39:39:FF:40:94
    address 10.2.0.150/24
    vlan-id 1001
    vlan-raw-device bridge

auto vni1001
iface vni1001
    bridge-access 1001
    bridge-arp-nd-suppress on
    bridge-learning off
    mstpctl-bpduguard yes
    mstpctl-portbpdufilter yes
    vxlan-id 1001
    vxlan-local-tunnelip b.b.b.b

poda-leaf1 frr.conf

router bgp 65200
 bgp router-id b.b.b.b
 coalesce-time 1000
 bgp bestpath as-path multipath-relax
 bgp bestpath compare-routerid
 neighbor fabric peer-group
 neighbor fabric remote-as external
 neighbor fabric description Internal Fabric Network
 neighbor fabric capability extended-nexthop
 neighbor swp47 interface peer-group fabric
 neighbor swp48 interface peer-group fabric
 neighbor y.y.y.y remote-as 11111
 neighbor y.y.y.y ebgp-multihop 3
 !
 address-family ipv4 unicast
  network b.b.b.b/32
  redistribute connected
  no neighbor y.y.y.y activate
  export vpn
 exit-address-family
 !
 address-family ipv4 labeled-unicast
  neighbor y.y.y.y activate
  neighbor y.y.y.y route-map HigherMetric in
 exit-address-family
 !
 address-family l2vpn evpn
  neighbor fabric activate
  neighbor y.y.y.y activate
  neighbor y.y.y.y route-map HigherMetric in
  advertise-all-vni
 exit-address-family
!
route-map HigherMetric permit 10
 set metric 200
!
ip route z.z.z.z/32 y.y.y.y
!
mpls label global-block 16 1000
mpls label bind b.b.b.b/32 implicit-null
mpls label bind c.c.c.c/32 102
mpls label bind x.x.x.x/32 103


podb-leaf01 interface file
auto lo
iface lo inet loopback
    address x.x.x.x/32
auto swp3
iface swp3
    address a.a.a.b/31
    mpls-enable yes
    mtu 9178
auto bridge
iface bridge
    bridge-ports swp5 swp13 vni1001
    bridge-pvid 1
    bridge-vids 1001
    bridge-vlan-aware yes

auto vlan1001
iface vlan1001
    address 10.2.0.152/24
    vlan-id 1001
    vlan-raw-device bridge

auto vni1001
iface vni1001
    bridge-access 1001
    bridge-arp-nd-suppress on
    bridge-learning off
    mstpctl-bpduguard yes
    mstpctl-portbpdufilter yes
    vxlan-id 1001
    vxlan-local-tunnelip x.x.x.x

podb-leaf1 frr.conf

router bgp 65202
 bgp router-id x.x.x.x
 coalesce-time 1000
 bgp bestpath as-path multipath-relax
 bgp bestpath compare-routerid
 neighbor fabric peer-group
 neighbor fabric remote-as external
 neighbor fabric description Internal Fabric Network
 neighbor fabric capability extended-nexthop
 neighbor swp47 interface peer-group fabric
 neighbor swp48 interface peer-group fabric
 neighbor a.a.a.a remote-as 11111
 neighbor a.a.a.a ebgp-multihop 3
 !
 address-family ipv4 unicast
  network x.x.x.x/32
  redistribute connected
  no neighbor a.a.a.a activate
  export vpn
 exit-address-family
 !
 address-family ipv4 labeled-unicast
  neighbor a.a.a.a activate
 exit-address-family
 !
 address-family l2vpn evpn
  neighbor fabric activate
  neighbor a.a.a.a activate
  advertise-all-vni
 exit-address-family
!
ip route g.g.g.g/32 a.a.a.a
!
mpls label global-block 16 1000
mpls label bind b.b.b.b/32 301
mpls label bind c.c.c.c/32 302
mpls label bind x.x.x.x/32 implicit-null