<div dir="ltr">Hi Vivek, Lou, all,<br><br>Thanks, Vivek, for taking the time to respond to the message.<br>I tried to summarise the discussion here and translate it into vty configuration terms.<br>We can look at this document when you are available, if possible by the end of July (next week, for instance).<br><br>I hope this will help.<br><a target="_blank" href="https://docs.google.com/spreadsheets/d/1t608z3bIMZpHb4Juspp7FN2Ks-4nU_qMvIv4JIzje_Y/edit#gid=0">https://docs.google.com/<wbr>spreadsheets/d/<wbr>1t608z3bIMZpHb4Juspp7FN2Ks-<wbr>4nU_qMvIv4JIzje_Y/edit#gid=0</a><br><br>Please check that you have the correct access rights to modify it.<br>Also, I plan to refine the document a bit more over the next few days.<br>Feel free to do the same.<br><br>More comments on vty below ([Philippe2])<br>Thanks,<br><br>Philippe<br><br>>Hi Vivek,<br>><br>>Thanks for taking the time to respond while on vacation.<br>><br>>In

 yesterday's meeting there was a request to summarize the various 

positions on this discussion.  (To ensure all understand the issue being

 discussed.)  As such, when you can, can you summarize your proposed 

config syntax for l3vpn, l2vpn and l2+l3vpn cases?<br>><br>>Thanks<br>>Lou<br>><br>>On July 12, 2017 6:05:04 AM Vivek Venkatraman <<a target="_blank" href="mailto:vivek@cumulusnetworks.com">vivek@cumulusnetworks.com</a>> wrote:<br>>> Hi Philippe,<br>>><br>>>

 I am currently on vacation in India, hence the delay in responding to 

your mail. Thank you for an in-depth review, please see inline. (For 

another couple of weeks, additional responses from me are likely to be 

delayed).<br>>><br>>><br>>> On Mon, Jul 3, 2017 at 8:58 PM, Philippe Guibert <<a target="_blank" href="mailto:philippe.guibert@6wind.com">philippe.guibert@6wind.com</a>> wrote:<br>>><br>>>     Hi Vivek,<br>>><br>>>     The note you made is very interesting; it gathers a lot of very relevant information.<br>>>     You described how the symmetric and asymmetric cases of the draft RFC work for FRRouting. <br>>>     In addition to presenting this, you illustrated the new vty commands.<br>>><br>>>     So, I made some comments on both points.<br>>>     - on the first point, <br>>>     I would like to know why you restrict RT2 to carrying only the L2 label?<br>>><br>>><br>>>

 A single label (or VNI) is all that is needed for EVPN-for-L2 (where 

the gateway is a different device) or even when supporting 

routing/gateway functionality (L2+L3) when operating in asymmetric mode.

 The second label (or VNI) is needed only for routing/gateway 

functionality when operating in symmetric mode.<br>>><br>>> 

There is no restriction envisioned in the implementation on a second 

label (VNI). Rather, the plan is to support both modes of routing. For 

the symmetric mode, in the case of VxLAN, the second label (VNI) is the 

"L3 VNI" and will be available through the proposed configuration (in 

this mail).<br>>><br><br>[Philippe2] OK<br><br><br>>>     

Also, I would like FRR not to be restricted to VNI, since the draft 

theoretically supports network overlays other than VXLAN (NVGRE, MPLS). <br>>><br>>><br>>>

 I agree, which is why I proposed some configuration/syntax for MPLS 

here. Note that the Linux kernel doesn't yet have a good model for 

L2oMPLS.<br>>><br><br>[Philippe2] L3 VRF and  L2 VRF model, but also handling MPLS or VXLAN, independently of the layer.<br>  <br>>><br>>><br>>>     - on the second point, <br>>>     I agree on advertise-gateway issue. <br>>>     I am not totally convinced with enhancing l2vpn evpn <l3vni><br>>>    

 I think more of having a generic MAC-VRF or IP-VRF context where we 

configure RD, RT, VNI, etc...In the vrf-policy case, MAC-VRF and IP-VRF 

would have the same vty node. Only layer command would distinguish ( 

layer 3 versus layer 2). I need to bring a more elaborate example for 

RT2/RT5 case.<br>>><br>>><br>>> The "EVI" syntax I 

provided below was for the MAC-VRF. Given that an IP VRF and a MAC VRF 

will be rather different (the former deals with routes and next hops and

 will/can have OSPF/BGP neighbors etc., the latter deals with MACs and 

possibly, ARP suppression), I feel the two should be kept separate. <br><br>[Philippe2]<br>I think you are specifically talking about CE configuration, whereas I was discussing PE configuration (with vrf-policy).<br>Maybe CE configuration can be kept as is (this will probably be discussed through the spreadsheet).<br>In

 a previous mail, you were comparing vrf-policy to a kind of (route-map)

 policy. If this is your feeling, then I think we agree on that topic. I

 mean, that policy will apply to either MAC-VRF or IP-VRF.<br>Those VRFs will be separate. But the vrf-policy node ( currently vrf-policy) will be used for both cases.<br><br><br>>>Also,

 since EVPN-for-VxLAN provides a simple way of "auto creating" the MAC 

VRFs (the VNI is fundamentally the VRF delimiter), we should ensure the 

operator is not forced into a lot of unnecessary configuration when 1 or

 2 commands would do.<br>>><br><br>[Philippe2]<br>I agree that auto-creation is enabled from the zebra side.<br>But I think this option could also be enabled in BGP in two places: the CE side (the one you proposed) and vrf-policy.<br>For PE configuration, I would allow the ability to configure a vrf-policy with a kind of "auto create" mode.<br> <br><br><br><br>>> The CLI/UI I proposed in my mail was based on the above two principles.<br>>>  <br>>><br>>><br>>>     More comments below [Philippe]<br>>><br>>><br>>><br>>>     On Mon, Jun 26, 2017 at 5:59 AM, Vivek Venkatraman <<a target="_blank" href="mailto:vivek@cumulusnetworks.com">vivek@cumulusnetworks.com</a>> wrote:<br>>><br>>>         Hi Lou, Philippe, All,<br>>><br>>>        

 The PR that I submitted already addresses for the most part 

inter-subnet routing (i.e., bridge+router scenario) if employing 

asymmetric routing (<a target="_blank" href="https://tools.ietf.org/html/draft-ietf-bess-evpn-inter-subnet-forwarding">https://tools.ietf.org/html/<wbr>draft-ietf-bess-evpn-inter-<wbr>subnet-forwarding</a>

 section 4). [I say "for the most part" because some additional changes 

are needed for advertisement of gateway MACIP in the case of centralized

 gateway and a few other things.] Changes are of course needed for 

symmetric routing (section 5 of aforementioned draft). I'll describe 

both of these below. <br>>><br>>>         At the end, I'll 

propose some thoughts on extending this for other EVPN encapsulation - 

specifically MPLS - to support traditional VPLS.<br>>><br>>>         Asymmetric routing:<br>>><br>>>        

 Here, we're dealing only with host routes and the ingress VTEP/NVE will

 route to the virtual subnet where the destination is, so that the 

egress VTEP/NVE only does bridging. In VxLAN terms, we're only dealing 

with L2 VNIs which need to be provisioned on all VTEPs. MACs are learnt 

against a VLAN through kernel notifications, mapped to a VxLAN/VNI and 

advertised. Likewise, neighbor entries (ARP/ND) are learnt on an SVI by 

listening to kernel notifications; the mapping to the VxLAN/VNI is 

straightforward and MACIP routes are originated using this L2 VNI. The 

logic on the receive side is straightforward too. The received RTs map 

to the VNI which maps to the VLAN. MAC routes would be installed into 

the FDB while MACIP routes would result in neighbor entries being 

created on the SVI (corresponding to the VLAN).<br>>><br>>>        

 The above functionality is all present in the PR submitted. While the 

target of the PR was just EVPN for L2 with ARP suppression, it can 

accomplish routing too. Note that ARP suppression requires some 

additional functionality in the Linux kernel which Cumulus Networks is 

working to get into the upstream kernel.<br>>><br>>>        

 The only additional provisioning we had planned to introduce was 

whether to advertise our SVI MAC or not - needed only on gateway 

devices. This was to be under "address-family l2vpn evpn":<br>>><br>>>         router bgp <as><br>>>           address-family l2vpn evpn<br>>>             advertise-default-gateway<br>>><br>>>     [Philippe] <br>>>     It picks up default gateway MAC address of the local VNI endpoint ?<br>>>     This seems ok for local VNI endpoints.<br>>><br>>><br>>>

 Correct. Plus, this is needed only in the centralized gateway scenario,

 not in a distributed gateway scenario (where every VTEP/NVE does 

L2+L3).<br>>>  <br>>><br>>>      <br>>><br>>>         However, I'll propose some changes to the provisioning at the end of this note.<br>>><br>>>         Symmetric routing:<br>>><br>>>        

 Clearly, this is more scalable and brings in the "inter-connect subnet"

 (L3 VNI). It also introduces the ability to do prefix routing with EVPN

 type-5 routes.<br>>><br>>><br>>>      <br>>><br>>>        

 The L3 VNI is a parameter per tenant - i.e., per L3 VRF. This is 

planned to be the only required/mandatory configuration on top of what 

my PR introduces. The tenant (L3 VRF) configuration already exists<br>>><br>>><br>>>      <br>>><br>>>        

 and the L3 VNI was going to be added to it. The RD and RTs (for the 

tenant) could be auto-derived from this L3 VNI, but could optionally be 

configured.<br>>><br>>>         The planned configuration is/was:<br>>><br>>>         router bgp <as> vrf <tenant VRF><br>>>           <any existing configuration such as "redistribute connected" or "network"><br>>>           l2vpn evpn l3vni <vni><br>>>           rd <RD><br>>>           route-target <import | export | both> <RT><br>>><br>>><br>>>     [Philippe]<br>>>     I am not sure about the vty you propose.<br>>>     If I understand well, you propose to use l2vpn keyword directly under router bgp node ?<br>>><br>>>     (config)# router bgp <> vrf <><br>>>     (bgpd)# l2vpn evpn l3vni <vni>                     <--- added command<br>>>     (bgpd)# rd   <>                            <wbr>                   <--- added command<br>>>     (bgpd)# route-target <>                            <wbr>    <--- added command<br>>>     (bgpd)# address-family l2vpn evpn<br>>>     (config-router-evpn)# vni <l2vni><br>>>     (config-router-evpn)# ...<br>>>     (config-router-evpn)# exit-address-family<br>>>     (config-router-evpn)# l2vpn evpn l3vni <l3vni><br>>><br>>>     If this is it<br>>>    

 -  The relationship between MAC-VRF ( l2vni) and IP-VRF ( the l2vpn 

evpn l3vni configured by RD) is done by the configuration. Right ?<br>>>      <br>>><br>>><br>>>

 Yes, because the L3 VNI is an operator configured entity. It is 

theoretically possible to auto-generate it, though I don't think that is

 well supported by the Linux kernel.<br>>><br>>> Note that 

that line - "l2vpn evpn l3vni <vni>" - is the only "new" command 

here. The RD and RT configuration is given to complete the layer-3 

configuration but would apply for L3VPN also (subject to conclusion on 

"vrf-policy" as noted).<br>>>  <br>>><br>>><br>>>        

 If the community decision is to configure the RD and RT configs as 

"vrf-policy" against the default VRF in BGP, the above will of course 

change.<br>>><br>>><br>>><br>>><br>>>        

 The way symmetric routing operates is as follows. There is no change to

 advertisement or reception of MAC-only type-2 routes, these will only 

contain the L2 VNI.<br>>><br>>><br>>>     [Philippe] Why restrict to sending RT2 L2 VNI only ?<br>>>    

 I should elaborate an example on how vrf-policy configuration would be 

done so as to permit sending RT2 with both labels.<br>>><br>>><br>>>

 I meant for MAC-only routes which don't have an IP address, only the L2

 VNI is relevant. If you have a use case where MAC-only routes also need

 2 labels (VNIs), can you explain that?<br><br>[Philippe2]<br>I would like to elaborate a configuration involving a second label.<br>So this is only for routing/gateway functionality when operating in symmetric mode.<br><br>>>  <br>>><br>>>      <br>>><br>>>        

 For MACIP type-2 routes, when the neighbor (ARP/ND) is learnt by 

listening to a kernel notification, the SVI that the entry is learnt on 

will be part of the tenant's VRF and that will provide the L3 VNI and L3

 RTs. The RouterMAC extended community has to be added and the MAC will 

be derived from the interface corresponding to the L3 VNI (the 

"inter-connect subnet" interface). On the receive side, if the route has

 2 VNIs, the MAC and Neighbor entry will be installed against the L2 VNI

 (if present locally) as before while the IP host route will be 

processed and imported into any L3 VRFs (BGP's RIB) that match its RTs.<br>>><br>>>      <br>>>     [Philippe] <br>>>     by taking an extract of draft-ietf-bess-evpn-inter-<wbr>subnet-forwarding<br>>>     "<br>>>    

 While sending RT2 with L3VNI and L2VNI, you must ensure that RTs refer 

to MAC-VRF and IP-VRF ( as per 5.1.1 control plane operation).<br>>>     "<br>>>     What if there is no L3 VRF Matching locally ? Do you drop the whole incoming entry ?<br>>>     I think there should be a control on incoming RT2 messages, against RTs.<br>>><br>>><br>>>

 No, the RT2 wouldn't be dropped completely. My understanding is that 

the RTs in the incoming RT2 must be matched against BOTH the MAC VRFs 

(VNIs in the case of VxLAN) and the IP VRFs, and imported into either or

 both as appropriate, IF the RT2 has 2 labels (VNIs).<br>>><br><br>[Philippe2] This can be discussed in a separate thread. We want to get an agreement on vty.<br><br>>>  <br>>><br>>><br>>>    

 For that, to differentiate L3VNI from L2VNI, I would add an attribute 

per "vrf-policy" mentioning that this is an IP-VRF or a MAC-VRF.<br>>><br>>>     (vrf-policy)# layer layer_3 | layer_2<br>>><br>>>     How would you do that filtering based on a CE configuration ?<br>>><br>>><br>>> Did my response above answer this? If not, I need to understand the question some more.<br>>><br>>>  <br><br>[Philippe2] The proposal is to add an additional configuration that specifies if a VRF is MAC-VRF or IP-VRF.<br>As per your remark, on the proposed configuration, you don't need it. But I think on vrf-policy, this command could clarify.<br><br>>><br>>><br>>>      <br>>><br>>>        

 There is some special handling required because the next hop is the 

remote VTEP/NVE whose MAC should be set up as the received Router MAC.<br>>><br>>>        

 What the above shows is that there isn't an explicit hierarchy of L2 

VNIs (subnets) of a tenant to the tenant's L3 VRF...but it is present 

implicitly (the SVIs corresponding to those subnets will be assigned to 

the tenant's VRF).<br>>><br>>>         For external routing,

 the plan is that by default, any routes in the L3 VRF (in BGP's RIB) 

will be advertised to EVPN peers as type-5 routes. The current thought 

is that this can be controlled using existing route-map constructs 

(TBD). Internal (i.e., EVPN) routes are already present in the L3 VRF 

(BGP's RIB) as mentioned above. Existing route-maps can be used to 

control how these are advertised externally - currently using VRF-lite 

BGP peerings, in future using L3VPN.<br>>><br>>>         For

 inter-DC connectivity, EVPN single-hop or multi-hop peerings can be 

setup between the border EVPN routers in each DC. If some/all tenants do

 not need their L2 domain stretched across the DCs but only need L3 

connectivity (i.e., subnets contained to one DC), only EVPN type-5 

routes need to be exchanged on the inter-DC peering. The current plan is

 to implement an addition to route-map matching for that - "match evpn 

route-type <type>".<br>>><br>>><br>>>     [Philippe] <br>>>     Indeed Route Type 5 can be used with or without Route Type 2. <br>>>     I understand you want to filter out Route Type 2 entries.<br>>><br>>>     It is as if you want to filter only L3 VPN information.<br>>>     I would propose a route-map that filters on L3 messages only (no RT1/RT2/RT3 indeed).<br>>><br>>><br>>>

 The above is an OPTIONAL configuration. If there is EVPN peering 

between the DCs (and no other peering), by default, all routes would be 

exchanged. In the scenario mentioned (and possibly others), there may be

 a need to only exchange a particular type of EVPN route, in addition to

 other filters (IP, AS-path etc. already exist, we are adding support 

for MAC ACLs).<br>>><br><br>[Philippe2] Agreed. So the configuration command would filter RT2, for example.<br>  <br>>><br>>><br>>>     I have a subsidiary question. <br>>>    

 Suppose you have an MPLS-based framework, and you want to use MPLSVPN to

 populate the L3VPN of BGP's RIB. Do you have a method to carry that L3 

information in BGP MPLSVPN instead of using BGP EVPN RT5 ?<br>>><br>>><br>>>

 Yes, the way I envision is that there would be L3VPN peering (instead 

of EVPN peering) outside of the DC. EVPN routes within the DC would get 

installed in the VRF routing table and L3VPN can pick these up and 

advertise (with any needed policy control). L3VPN routes from the 

external side would again get installed in the VRF routing table and 

EVPN can pick these up and advertise as RT5 within the DC. I haven't 

worked out any details yet though.<br>>>  <br>[Philippe2] This can be discussed in a separate thread; I don't think it impacts vty.<br><br>>><br>>><br>>>      <br>>><br>>>         Extending/generalizing the provisioning for the non-VxLAN use case:<br>>><br>>>      <br>>>     [Philippe] <br>>>     As per draft-ietf-bess-evpn-inter-<wbr>subnet-forwarding-03<br>>>     "The first BGP Extended Community identifies the tunnel<br>>>        type per section 4.5 of [TUNNEL-ENCAP]"<br>>><br>>>    

 You may need an extra extended community (see RFC 5512) to define the 

desired encapsulation type: VXLAN or another encapsulation type.<br>>><br>>><br>>>

 The PR submitted already carries/exchanges the ENCAP extended community

 though it is filled as VxLAN. The proposed config in this mail can be 

used to extend this to carry the desired encap.<br>>>  <br><br>[Philippe2]<br>A proposal is made on the spreadsheet.<br>Done for PE, to be done for CE.<br><br>>><br>>><br>>>         <br>>><br>>><br>>>        

 In the case of EVPN for VxLAN, a VLAN is mapped to a VxLAN (VNI) by the

 operator and whether it is a single broadcast domain per EVI or 

multiple broadcast domains per VNI, the VNI is sufficient to identify 

the bridge table as per section 5.1.2 of <a target="_blank" href="https://tools.ietf.org/html/draft-ietf-bess-evpn-overlay">https://tools.ietf.org/html/<wbr>draft-ietf-bess-evpn-overlay</a>. This does lend itself to a rather simplified configuration for VxLAN that would be a big advantage to retain.<br>>><br>>>        

 Whether EVPN should be used for VNIs or not (i.e., "advertise-all-vni" 

under BGP L2VPN/EVPN address-family configuration in my PR) should move 

to the entity (i.e., zebra) which creates/handles EVIs.<br>>><br>>><br>>>     [Philippe] <br>>>     I understand you want to have similar command to zebra.<br>>>     Nonetheless, I think bgp should keep it too ( for RT auto derivation, but also to control zebra events)<br>>>      <br>>><br>>><br>>>        

 The term "vni" is specific to VxLAN and cannot be used for other EVPN. 

Our preference is for "evi" but it is up to the community to decide 

whether "evi", "vsi" or something else is the most appropriate.<br>>><br>>>      <br>>><br>>><br>>>        

 For VxLAN, it is convenient to refer to the EVI (Ethernet Virtual 

Instance) by its VNI for the common case; for other cases, there is no 

such well-known identifier and the EVI is likely to be identified by 

name (just like a L3 VRF).<br>>><br>>><br>>>     [Philippe] <br>>>     As per draft-ietf-bess-evpn-inter-<wbr>subnet-forwarding-03, 5.1.1<br>>>        - Label-1 = MPLS Label or VNID corresponding to MAC-VRF<br>>>        - Label-2 = MPLS Label or VNID corresponding to IP-VRF<br>>><br>>>     It seems VNI can apply to IP-VRF too.<br>>>     I would propose to pick up the definition of the draft : <br>>><br>>>     "Label " = "MPLS Label or VNID"<br>>><br>>><br>>> Hmm...are you saying to use "label" instead of "vni" in the configuration commands?<br>>>  <br><br>[Philippe2] I am less affirmative than previously.<br>In vrf-policy mode, there is already a label keyword.<br>However, in global configuration mode, an additional label should be configurable add-vrf<br><br><br>>><br>>>      <br>>><br>>><br>>>        

 The proposed commands are as follows. These are initial thoughts 

subject to more refinement - partly because the Linux kernel does not 

currently have a forwarding model for L2oMPLS.<br>>><br>>>         l2vpn evpn advertise-vni <all | list of VNIs><br>>>        

 -- The handler of this command will be "zebra" and it is in lieu of the

 "advertise-all-vni" command as stated above.<br>>>         -- This only applies if using EVPN for VxLAN<br>>><br>>>     l2vpn evpn evi <name><br>>><br>>>           encapsulation <vxlan | mpls><br>>>           bridge-table <table | bridge-name><br>>>           <any MPLS/label allocation parameters - if encap is mpls><br>>>           <any VxLAN parameters - if encap is vxlan><br>>>         -- The above syntax/commands will be used to create EVIs for MPLS, and if needed, for VxLAN.<br>>>         -- The handler of these commands will be "zebra"<br>>><br>>><br>>>     [Philippe] <br>>>     BGPd is the only daemon interested in getting the VNI information ?<br>>><br><br>[Philippe2]<br>Yes, zebra is the daemon that gathers that information.<br>I omitted it.<br><br>>><br>>>

 No, zebra continues to be the entity interacting with the kernel, both 

for learning all the L2 info (bridges, bridge ports, VLAN-VNI mappings, 

MACs etc.) and neighbors as well as installing into the kernel.<br>>><br>>>

 We have some nascent thoughts on splitting/reorganizing zebra further, 

but nothing planned in the near term and will certainly be discussed in 

detail before anything is attempted.<br>>><br>>>  <br>>><br>>>     Also, the current level of FRR deliberately gives EVPN access to VNI only.<br>>>     That implies that Ethernet NVO tunnel is neither MPLS nor NVGRE.<br>>><br>>>     If yes, then no need to keep advertise-vni on bgpd. <br>>>     If no, then I would want to control the information on both sides.<br>>>      <br>>><br>>>         router bgp <as><br>>>           l2vpn evpn { vni <vni> | evi <name> }<br>>><br>>>     [Philippe] <br>>>     I have a configuration issue, if you want to do RT2 emission with both L2 and L3 Label.<br>>>     Could you please elaborate ?<br>>><br>>><br>>>

 It is the presence of this configuration that will determine that RT2 

should have a second label. In the case of VxLAN, the L3 VNI value would

 be provided here, in the case of MPLS (or something else), the EVI 

would have some appropriate configuration to generate this.<br>>>  <br>>><br>>>      <br>>><br>>>             rd <rd><br>>>             route-target <import | export | both> <rt><br>>>        

 -- The above syntax/commands will be used to define the RD/RT 

parameters for a VNI/EVI if the auto-derivation is not desired.<br>>>         -- The handler of the above will clearly be "bgpd"<br>>><br>>>         The L3 VNI configuration - which is against a L3 VRF - is as proposed earlier.<br>>><br>>>        

 The "advertise-default-gateway" configuration for asymmetric routing 

can be modified based on the final consensus on the above.<br>>><br>>><br>>>     [Philippe] In the CE purpose, this command is ok for me.<br>>><br>>>     Thanks,<br>>><br>>>     Philippe</div>