Fwd: Re: [PATCH net-next 0/4] net: mpls: Allow users to configure more labels per route
Eric's question below is basically adding labels at tunnel ingress vs while traversing the LSP. I was generically increasing both to more than 2 labels. Opinions?

-------- Forwarded Message --------
Subject: Re: [PATCH net-next 0/4] net: mpls: Allow users to configure more labels per route
Date: Sat, 25 Mar 2017 14:15:54 -0500
From: Eric W. Biederman <ebiederm@xmission.com>
To: David Ahern <dsa@cumulusnetworks.com>
CC: netdev@vger.kernel.org, roopa@cumulusnetworks.com, rshearma@brocade.com

David Ahern <dsa@cumulusnetworks.com> writes:
Bump the maximum number of labels for MPLS routes from 2 to 12. To keep memory consumption in check the labels array is moved to the end of mpls_nh and mpls_iptunnel_encap structs as a 0-sized array. Allocations use the maximum number of labels across all nexthops in a route for LSR and the number of labels configured for LWT.
The mpls_route layout is changed to:
    +----------------------+
    | mpls_route           |
    +----------------------+
    | mpls_nh 0            |
    +----------------------+
    | alignment padding    |   4 bytes for odd number of labels; 0 for even
    +----------------------+
    | via[rt_max_alen] 0   |
    +----------------------+
    | alignment padding    |   via's aligned on sizeof(unsigned long)
    +----------------------+
    | ...                  |
Meaning the via follows its mpls_nh, providing better locality as the number of labels increases. UDP_RR tests with namespaces show results ranging from no impact to a modest performance increase with this layout for 1 or 2 labels and 1 or 2 nexthops.
The new limit is set to 12 to cover all currently known segment routing use cases.
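The offset arithmetic implied by the layout above can be sketched as follows. This is a minimal model, not the kernel code: the 16-byte size for the fixed part of mpls_nh, the 64-bit value of sizeof(unsigned long), and the names nh_layout and align_up are all assumptions made for illustration.

```python
LABEL_SIZE = 4       # each label is stored as a 32-bit value
VIA_ALIGN = 8        # sizeof(unsigned long) on a 64-bit build (assumption)
NH_HEADER_SIZE = 16  # hypothetical size of the fixed part of mpls_nh


def align_up(n, alignment):
    """Round n up to the next multiple of alignment."""
    return (n + alignment - 1) // alignment * alignment


def nh_layout(max_labels, max_alen):
    """Offsets for one nexthop slot, sized for the largest label stack
    and the longest via address used by any nexthop in the route."""
    labels_end = NH_HEADER_SIZE + max_labels * LABEL_SIZE
    via_offset = align_up(labels_end, VIA_ALIGN)
    stride = align_up(via_offset + max_alen, VIA_ALIGN)
    return {
        "label_padding": via_offset - labels_end,  # 4 if odd, 0 if even
        "via_offset": via_offset,
        "stride": stride,                          # distance to next mpls_nh
    }


if __name__ == "__main__":
    for labels in (1, 2, 12):
        print(labels, nh_layout(labels, max_alen=6))
```

With these assumed sizes, an odd label count leaves 4 bytes of padding before the via and an even count leaves none, which matches the note in the diagram.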
How does this compare with running the packet a couple of times through the mpls table to get all of the desired labels applied?

I can certainly see the case in an mpls tunnel ingress where this might be desirable, which is something you implement in your last patch. However, is it at all common to push lots of labels at once during routing?

I am probably a bit naive, but it seems absurd to push more than a handful of labels onto a packet as you are routing it.

Eric
Hi David,

[adding my colleague Bruno to the list, he may correct things I might have oversimplified on segment routing, or have an idea about 12]

2017-03-25, David Ahern:
Eric's question below is basically adding labels at tunnel ingress vs while traversing the LSP. I was generically increasing both to more than 2 labels. Opinions?
An MPLS packet may receive additional labels in transit. In most cases (all?), this is most properly seen as an LSP hierarchy (tunneling one LSP into another LSP), so closer to a notion of ingress than to something related to the initial LSP. But I don't know if the distinction is of importance.

The cases that come to mind would be:
- tunneling into a fast-reroute bypass LSP (possibly a segment routing LSP, see segment routing TI LFA)
- seamless MPLS
- carrier's carrier type of deployment

In these cases a router could receive an MPLS packet and, possibly after popping the topmost label, push a stack of labels onto the packet.

About the email below:
- how did 12 end up being considered "covering all currently known segment routing use cases"? It seems that SR could use an arbitrary number of labels (not saying 12 is a bad number, but...)
- I'm not sure what Eric's idea of "running the packet a couple of times through the mpls table to get all of the desired labels applied" would mean: after the first lookup, what data would be used as the key for the following lookup?
- back to your question, which seems to imply one could possibly increase the number of labels for ingress without increasing the number of labels for transit: isn't the same data structure used in both to represent an mpls next hop? (In RFC 3031, both the ILM and FTN point to NHLFE entries, but I haven't dug enough to identify how these map to the kernel implementation.)
- would a linked list of mpls_nh make sense, each with one label to impose, so that no hard limit is put on the label stack depth?

-Thomas
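The pop-then-push operation described above can be sketched as follows. This is only an illustration: the 32-bit entry layout (20-bit label, 3-bit TC, bottom-of-stack bit, 8-bit TTL) follows RFC 3032, while the function names and the sample label values are hypothetical and have nothing to do with the kernel's data path.

```python
import struct


def encode_entry(label, ttl, bottom, tc=0):
    """Pack one label stack entry: label(20) | TC(3) | S(1) | TTL(8)."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8 or not 0 <= ttl < 256:
        raise ValueError("tc must fit in 3 bits and ttl in 8 bits")
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


def encode_stack(labels, ttl=64):
    """Encode a stack given top-first; only the last entry has S=1."""
    last = len(labels) - 1
    return b"".join(
        encode_entry(label, ttl, bottom=(i == last))
        for i, label in enumerate(labels)
    )


def pop_and_push(stack, pushed):
    """Pop the topmost label, then push `pushed` (top-first) on top."""
    return list(pushed) + list(stack[1:])


if __name__ == "__main__":
    # A transit router receives [16, 17], pops 16 and tunnels the packet
    # into a bypass LSP that needs three labels of its own.
    new_stack = pop_and_push([16, 17], [100, 200, 300])
    print(new_stack, encode_stack(new_stack).hex())
```

Each pushed label adds 4 bytes to the packet, so the per-packet cost grows linearly with the depth of the pushed stack.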
Hi Thomas,

Good points.

There's no free lunch in the fast-path universe: packet recalculation has a price associated with it. The most obvious costs are increased latency and reduced throughput, and there are more.

Sorry for repeating myself, but not being a Linux kernel expert, I'd appreciate a pros/cons analysis of the different approaches, covering the impact of the new code and how the system behaves with it.

For those who expect the underlying platform not to be x86 only (Cumulus?): what are your expectations from a HAL/HW SDK perspective?

Cheers,
Jeff
On 3/26/17 11:02 AM, Thomas Morin wrote:
Hi David,
[adding my colleague Bruno to the list, he may correct things I might have oversimplified on segment routing, or have an idea about 12]
2017-03-25, David Ahern:
Eric's question below is basically adding labels at tunnel ingress vs while traversing the LSP. I was generically increasing both to more than 2 labels. Opinions?
An MPLS packet may receive additional labels in transit. In most cases (all?), this is most properly seen as an LSP hierarchy (tunneling one LSP into another LSP), so closer to a notion of ingress than to something related to the initial LSP. But I don't know if the distinction is of importance.

The cases that come to mind would be:
- tunneling into a fast-reroute bypass LSP (possibly a segment routing LSP, see segment routing TI LFA)
- seamless MPLS
- carrier's carrier type of deployment
In these cases a router could receive an MPLS packet, and possibly after popping the topmost, push a stack of labels onto the packet.
And that's my takeaway from past discussions on this topic (number of labels).
About the email below: - how did 12 end up being considered "covering all currently known segment routing use cases" ? it seems that SR could use an arbitrary number of labels (not saying 12 is a bad number, but...)
I believe the consensus was 8 but Olivier had a use case for more. The way I have this coded means the performance impact is to users adding more and more labels - which is expected and appropriate.
- I'm not sure what Eric's idea of "running the packet a couple of times through the mpls table to get all of the desired labels applied" would mean: after the first lookup, what data would be used as key for the following lookup ?
no idea. I need him to clarify.
- back to your question, which seems to imply one could possibly increase the number of labels for ingress without increasing the number of labels for transit: isn't the same data structure used in both to represent an mpls next hop? (In RFC 3031, both the ILM and FTN point to NHLFE entries, but I haven't dug enough to identify how these map to the kernel implementation.)
no. Ingress is handled by a lightweight tunnel infrastructure. In 'ip' terms the route specifies lwt with mpls encap. LSP MPLS is handled as a typical protocol family with its own route database.
- would a linked list of mpls_nh make sense, each with one label to impose, so that no hard limit is put on the label stack depth?
each nexthop has its own label stack. The nexthops are essentially an array at the end of the mpls route.
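To make the ingress/transit split concrete, here is a small sketch that builds the two kinds of `ip` command lines. It is a hedged illustration: the argument order follows iproute2 syntax as I understand it (check ip-route(8) before relying on it), the helper names are hypothetical, and the default limit of 12 assumes the maximum proposed in this series. Nothing is executed; the functions only return argument lists.

```python
MAX_LABEL = (1 << 20) - 1  # MPLS labels are 20-bit values


def label_stack(labels, max_labels=12):
    """Format labels as 'a/b/c', rejecting stacks deeper than max_labels.

    max_labels=12 assumes the limit proposed in this patch series.
    """
    if not labels:
        raise ValueError("at least one label is required")
    if len(labels) > max_labels:
        raise ValueError("too many labels: %d > %d" % (len(labels), max_labels))
    for label in labels:
        if not 0 <= label <= MAX_LABEL:
            raise ValueError("label out of range: %r" % (label,))
    return "/".join(str(label) for label in labels)


def ingress_route(prefix, labels, gateway, dev, max_labels=12):
    """IP route whose lightweight tunnel state pushes labels at ingress."""
    return ["ip", "route", "add", prefix,
            "encap", "mpls", label_stack(labels, max_labels),
            "via", gateway, "dev", dev]


def transit_route(in_label, out_labels, gateway, dev, max_labels=12):
    """MPLS-family route: match in_label, replace it with out_labels."""
    return ["ip", "-f", "mpls", "route", "add", str(in_label),
            "as", label_stack(out_labels, max_labels),
            "via", "inet", gateway, "dev", dev]


if __name__ == "__main__":
    print(" ".join(ingress_route("10.1.1.0/24", [100, 200], "10.0.0.2", "eth1")))
    print(" ".join(transit_route(100, [200, 300], "10.0.0.2", "eth1")))
```

The first form goes through the lightweight tunnel encap attached to an IP route; the second lives in the MPLS route database, which is why the two limits can in principle be raised independently.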
My 2 cents on SR:
From: David Ahern [mailto:dsa@cumulusnetworks.com] Sent: Sunday, March 26, 2017 5:23 PM
About the email below: - how did 12 end up being considered "covering all currently known segment routing use cases" ? it seems that SR could use an arbitrary number of labels (not saying 12 is a bad number, but...)
I believe the consensus was 8 but Olivier had a use case for more.
TL;DR: the more labels, the better.

SR has no theoretical maximum number of labels. The worst case is the use of a strict ERO/path. In this case "crossing 15 nodes, means stacking 15 labels to encode the SR tunnel."
https://tools.ietf.org/html/draft-litkowski-spring-non-protected-paths-01#se...

In addition, one may push entropy labels (2 labels: ELI, EL). In the worst case, multiple pairs of entropy labels may need to be pushed, e.g. when the first pair would be too deep in the stack to be visible to a transit node (aka LSR).

12 labels seems relatively reasonable, but may not cover all cases. IIRC, this is the order of magnitude that a good hardware implementation can handle. I would expect that a software implementation could do better and that we would want to accommodate the next generations of hardware. IOW, if the max is "for free" on a software platform, I would call for a higher max (e.g. 16 or 20).
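The arithmetic behind that worst case can be written out as a quick check. This is a back-of-the-envelope sketch only: it assumes one label per node on a strict path and two labels (ELI, EL) per entropy pair, as described above, and the function names are made up for illustration.

```python
LABEL_ENTRY_BYTES = 4  # each label stack entry is 32 bits on the wire


def required_depth(strict_hops, entropy_pairs=0):
    """Labels needed: one per strict hop plus two per (ELI, EL) pair."""
    return strict_hops + 2 * entropy_pairs


def fits(depth, max_labels):
    """True if a stack of the given depth fits within the configured max."""
    return depth <= max_labels


if __name__ == "__main__":
    depth = required_depth(strict_hops=15, entropy_pairs=1)
    print("depth:", depth, "header bytes:", depth * LABEL_ENTRY_BYTES)
    for limit in (12, 16, 20):
        print("max", limit, "->", "fits" if fits(depth, limit) else "too deep")
```

Under these assumptions, the 15-node strict path with a single entropy pair needs 17 labels, which exceeds both 12 and 16 and fits only under the suggested max of 20.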
The way I have this coded means the performance impact is to users adding more and more labels - which is expected and appropriate.
Looks good. --Bruno