David/Roopa -

Olivier asked me about these two issues yesterday in the FRR Technical Meeting. I just wanted to make sure I didn't lose track of these questions that he had:

1) More than 2 labels in the kernel at a time: when will this be allowed in the kernel?
-> David is currently working on this issue. When he is done it will be upstreamed. So soonish(tm).

2) Penultimate Hop Popping:
I know this issue is not trivial to solve. In fact, once the POP instruction is performed, the packet must re-enter IP packet processing to determine what action must be applied. A possible solution would be to process the packet as a new incoming IP packet when the output interface is the loopback, disregarding the IP address value. But this issue is less urgent than the first one. Our OSPF Segment Routing implementation can announce whether the router works in Penultimate Hop Popping mode or not, so for the moment the option is forced to yes.

thanks!
donald
On 3/15/17 9:05 AM, Donald Sharp wrote:
> David/Roopa -
> Olivier asked me about these two issues yesterday in the FRR Technical Meeting. I just wanted to make sure I didn't lose track of these questions that he had:
> 1) More than 2 labels in the kernel at a time: when will this be allowed in the kernel?
> -> David is currently working on this issue. When he is done it will be upstreamed. So soonish(tm).
yes.
> 2) Penultimate Hop Popping:
> I know this issue is not trivial to solve. In fact, once the POP instruction is performed, the packet must re-enter IP packet processing to determine what action must be applied. A possible solution would be to process the packet as a new incoming IP packet when the output interface is the loopback, disregarding the IP address value. But this issue is less urgent than the first one. Our OSPF Segment Routing implementation can announce whether the router works in Penultimate Hop Popping mode or not, so for the moment the option is forced to yes.
Have not thought about this at all. I can add it to the to-do list and take care of it in time.
On 3/15/17 9:57 AM, Amine Kherbouche wrote:
> 1) More than 2 labels in the kernel at a time, when will this be allowed in the kernel?
I don't understand why iproute2 and the kernel are not in sync when it comes to the max stacked labels; iproute2 should export kernel headers.
Hi all,

The problem is not between the kernel and iproute2. The problem comes when stacking more than 2 labels: the LSB byte of the third label is replaced by the 0x03 value, i.e. the Implicit NULL label value. See the attached thread.

Following this, David proposed a new patch that we integrated and tested successfully (see the second and third attached mails).

But after that I got no news and never saw this patch merged into a Linux kernel release. So we are just asking when this patch could be merged into the Linux kernel.

The second problem, about PHP, is also described in the first attached thread. But, agreed, that issue is trickier to solve.

Regards,

Olivier

On 15/03/2017 at 16:59, David Ahern wrote:
> On 3/15/17 9:57 AM, Amine Kherbouche wrote:
> > I don't understand why iproute2 and the kernel are not in sync when it comes to the max stacked labels; iproute2 should export kernel headers.
>
> https://marc.info/?l=linux-netdev&m=146913202123729&w=2
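For illustration, here is the kind of iproute2 command that hits the problem described above (prefix, nexthop and label values are hypothetical, not taken from the attached thread):

    # push a 3-label stack onto IPv4 traffic via the MPLS lightweight-tunnel encap
    ip route add 10.1.1.0/24 encap mpls 100/200/300 via 192.168.1.2 dev eth0
    # on an affected kernel the third label (300 here) ends up with its low byte
    # replaced by 0x03, the Implicit NULL value, as described above
    ip route show 10.1.1.0/24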
I made the kernel patch to help you move along with your MPLS work.

In the kernel thread discussing the increase in the number of labels, Eric Biederman mentions performance concerns about just increasing the size of the array; he wanted a much more complicated change and I have not gotten around to it.
Hi David,

Well, frankly speaking, I don't see where the problem is regarding performance.

IMHO, the patch only grows the array from 8 bytes (2 labels) to 64 bytes (16 labels), which is completely negligible compared to the size of an IP packet and on the same order of magnitude as a VXLAN encapsulation or an IPv6 header.

In addition, only the edge router has to push the label stack. Subsequent routers just look at the top label, so the work is no more and no less than for a packet with only 2 labels in the stack.

I think that dealing with a dynamic MPLS stack depth, i.e. dynamically allocating space according to the number of labels pushed in front of the IP packet, would certainly add more overhead and more CPU cycles than just managing a fixed number of bytes in an array.

Finally, the default value for MPLS_LABEL_STACK could stay equal to 2, and people who want to deal with Segment Routing could recompile the kernel with a larger value.

From my side, your patch not only increases the label stack, it also rearranges the MPLS structure to solve the problem of the corrupted third label, and that is the more important point.

In any case, let me know what I can do to help you.

Regards

Olivier
I have the MPLS code changes to bump the number of labels. What did we want to use for the max? 12? 16?

This limit is really only capping what we take from userspace. Kernel-side memory allocations are done based on the actual number of labels in the route.
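As a sketch of that behaviour, assuming a hypothetical 16-label cap (prefixes, labels and the exact failure mode are illustrative, not taken from the patch):

    # within the cap: accepted; the kernel sizes the route for exactly 3 labels
    ip route add 10.2.0.0/24 encap mpls 100/200/300 via 192.168.1.2
    # beyond the hypothetical cap: the netlink request is rejected by the kernel
    ip route add 10.3.0.0/24 encap mpls 1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17 via 192.168.1.2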
Hi David, I think that 8 stacked labels are more than enough, I cannot imagine a core network with 8 stacked MPLS-VPNs. Let's see the others if they share my opinions. Regards, Amine On 23 March 2017 at 17:47, David Ahern <dsa@cumulusnetworks.com> wrote:
+Jeff

On 23 March 2017 at 5:59:21 PM, Amine Kherbouche <amine.kherbouche@6wind.com> wrote:
> Hi David,
> I think that 8 stacked labels are more than enough; I cannot imagine a core network with 8 stacked MPLS VPNs. Let's see if the others share my opinion.
I, too, think that 8 is a good upper limit.
While we are on the topic, can I please get an API to query the data plane for the supported MSD (Maximum SID Depth)?

Use cases: draft-ietf-isis-segment-routing-msd, draft-ietf-ospf-segment-routing-msd, draft-tantsura-idr-bgp-ls-segment-routing-msd.

Many thanks!

Cheers,
Jeff
    grep '#define MAX_NEW_LABELS' internal.h | awk '{print $3}'

But I'm not sure it answers your question ;-)

Olivier
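For reference, on a tree from before the patch discussed here that one-liner would print the old 2-label limit (the value shown and the net/mpls/ path are inferred from this thread, so treat them as assumptions):

    $ grep '#define MAX_NEW_LABELS' net/mpls/internal.h
    #define MAX_NEW_LABELS 2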
Haha :)

On a serious note - what are the implications of increasing the label stack (latency/throughput/etc.)? Where is the sweet spot?

Cheers,
Jeff
Jeff,

We have not yet measured the performance impact because we consider it negligible. Indeed, a label is 4 bytes, so 16 labels = 64 bytes, on the order of an IP + TCP header or an IPv6 header, and comparable to what the kernel adds when it encapsulates IPv4 into an IPv6 GRE tunnel, VXLAN or an equivalent encapsulation. And 16 labels is the maximum needed to express 100% of routes regardless of the network topologies we tested.

In addition, this encapsulation is done at the edge. After that, only a pop of the top of the stack is performed by each subsequent router, independently of the size of the label stack.

IMHO, the impact is on the total size of the packet: the overhead introduced by the label stack will reduce the goodput of the connection, but not much more than other kinds of encapsulation.

Regards

Olivier
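For reference, the approximate header sizes behind that comparison (standard fixed header sizes, not measurements from this thread):

    16 MPLS labels           : 16 x 4  = 64 bytes
    IPv4 + TCP fixed headers : 20 + 20 = 40 bytes
    IPv6 fixed header        : 40 bytes
    VXLAN outer encapsulation: ~50 bytes (Ethernet 14 + IPv4 20 + UDP 8 + VXLAN 8)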
RSVP-TE FRR + inter-AS could result in a 5-label stack (quite a corner case); don't forget about entropy labels, which come in pairs (ELI, EL). Outside of SR-MPLS, 8 labels would be more than enough to cover 99.9% of use cases.

Where are we on SRH (the SRv6 data plane)?

Cheers,
Jeff
Hi Amine,

8 is fine for VPN, but not sufficient for Segment Routing; see: http://ieeexplore.ieee.org/document/7778603/

The best option for me is to have the possibility to recompile the MPLS kernel module with a new value for MAX_LABEL_STACK and then let our Segment Routing implementation read this value to determine what is feasible.

Regards

Olivier
Olivier,
> The best option for me is to have the possibility to recompile the MPLS kernel module with a new value for MAX_LABEL_STACK and then let our Segment Routing implementation read this value to determine what is feasible.
Yes, but we still need a default value. You also have to consider the performance impact: today the MPLS entry size in the Linux kernel fits within a cache line, and even with 12 stacked labels we stay within a cache line, but beyond that value we are going to see a performance issue.
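As a rough sanity check of that cache-line argument, assuming the common 64-byte cache line and a small fixed per-entry overhead (the 16-byte figure is an assumption for illustration, not taken from the kernel source):

    12 labels x 4 bytes                 = 48 bytes of labels
    + ~16 bytes of fixed per-entry data = ~64 bytes, i.e. right at a typical cache line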
If there's no performance impact, 8 is a safe choice.

Cheers,
Jeff
On 3/23/17 11:34 AM, Amine Kherbouche wrote:
> Yes, but we still need a default value. You also have to consider the performance impact: today the MPLS entry size in the Linux kernel fits within a cache line, and even with 12 stacked labels we stay within a cache line, but beyond that value we are going to see a performance issue.
The implementation changes I have do not have a static allocation for labels. The max limit discussed here is an upper bound on what the kernel will take from userspace. Individual routes and nexthops allocate memory only for the max number of labels across all nexthops within a route:

    struct mpls_nh { /* next hop label forwarding entry */
            struct net_device __rcu *nh_dev;
            unsigned int            nh_flags;
            u8                      nh_labels;
            u8                      nh_via_alen;
            u8                      nh_via_table;
            /* u8 hole */
            u32                     nh_label[0];
    };
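Reading that structure on a 64-bit build, the fixed part per nexthop is small and the label array is sized on demand (rough layout arithmetic; padding and where the via address is stored are glossed over):

    nh_dev pointer                         : 8 bytes
    nh_flags                               : 4 bytes
    nh_labels + nh_via_alen + nh_via_table : 3 bytes (+1 byte hole)
    fixed part                             : ~16 bytes
    nh_label[]                             : 4 bytes per configured label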
Amine,

Sorry, I forgot to mention that a default value of 8 is fine.

Regards

Olivier
The net-next tree has the patch set increasing the number of labels. It allows up to 30 labels for ingress (ip->mpls) and routing (mpls->mpls).

An mpls_route is limited to 4096 bytes (number of nexthops + labels); an lwt (ingress) route is under 128 bytes. With the latest set, users of N labels take the performance hit of each additional label.
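For illustration, the kind of deep label stack this allows (labels, nexthop and device are hypothetical; the "as to" syntax mirrors the example David gives later in this thread):

    # mpls->mpls: swap incoming label 100 for an 8-label stack
    ip -f mpls route add 100 as to 200/300/400/500/600/700/800/900 via inet 10.1.1.2 dev eth0
    # ip->mpls (lwt ingress): push the same stack onto an IPv4 prefix
    ip route add 10.9.0.0/24 encap mpls 200/300/400/500/600/700/800/900 via 192.168.1.2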
On 6 April 2017 at 15:36, David Ahern <dsa@cumulusnetworks.com> wrote:
> The net-next tree has the patch set increasing the number of labels. It allows up to 30 labels for ingress (ip->mpls) and routing (mpls->mpls).
> An mpls_route is limited to 4096 bytes (number of nexthops + labels); an lwt (ingress) route is under 128 bytes. With the latest set, users of N labels take the performance hit of each additional label.
I think that having 30 stacked labels is useless; regardless of the performance issue, as Jeff said, the worst case is around 5 labels. I don't understand this waste.
I'm not sure I would call it 'waste'. The solution, as I understand it, allocates the appropriate amount of memory to handle the number of labels handed to it, with a *limit* of 4k bytes on the size that is alloc'ed internally to the kernel. This way they are keeping the data structure limit to the maximum size of a page on some platforms in the worst case.

donald
On 4/6/17 10:12 AM, Donald Sharp wrote:
> I'm not sure I would call it 'waste'. The solution, as I understand
There is no waste.
> it, allocates the appropriate amount of memory to handle the number of labels handed to it, with a *limit* of 4k bytes on the size that is alloc'ed internally to the kernel. This way they are keeping the data structure limit to the maximum size of a page on some platforms in the worst case.
exactly.
Hi David,

So, if I understand correctly, this means that we could manage, for example, 136 segment paths of 30 labels each, 256 segment paths of 16 labels each, 1024 segment paths of 4 labels, ... In other words, the sum of all label stacks of all segment paths must fit under the 4k-byte limit.

I understand that we must put a limit, but 4k bytes is short. Very short. At maximum it only allows 4k connections with only one label, i.e. only 4K LDP, RSVP-TE or MP-BGP paths.

So, is there a possibility to request another chunk of 4k bytes to increase the number of managed paths? Or is this limit hard-coded at compilation time?

Regards

Olivier
On 4/6/17 11:24 AM, Olivier Dugeon wrote:
Before I respond, let me make sure I understand your question. The 4k limit is for a single route. A route is nexthops + labels per nexthop. For example:
ip -f mpls route add 100 \
    nexthop as to 101/102/.../130 via inet 172.16.1.1 \
    nexthop as to 201/202/.../230 via inet 172.16.2.1 \
    ...
    nexthop as to 3001/3002/.../3030 via inet 172.16.30.1
Are you saying you want more than 30ish nexthops in a single multipath route?
Hi David, Good news. Thanks for your effort. On 06/04/2017 at 15:36, David Ahern wrote:
net-next tree has the patch set increasing number of labels. Allows up to 30 labels for ingress (ip->mpls) and routing (mpls->mpls).
An mpls_route is limited to 4096 bytes (number of nexthops + labels); an lwt (ingress) route is under 128 bytes. With the latest set, users of N labels take the performance hit of each additional label.
Well, I'm not sure I understand correctly.
Does this mean that we could manage, for example, 136 segment paths of 30 labels each, 256 segment paths of 16 labels each, 1024 segment paths of 4 labels, and so on? In other words, the sum of all label stacks of all segment paths must fit under the 4k-byte limit. I understand that we must put a limit, but 4k bytes is short. Very short. At the maximum it only allows 4k connections with only one label, i.e. only 4K LDP, RSVP-TE or MP-BGP paths.
Or does this mean that the total size of the packet + labels must not exceed 4k bytes? That's enough for standard 1500-byte packets, but not sufficient for jumbo frames.
Or does this mean that the total information per mpls_route must not exceed 4k bytes? In that case it is largely enough, since in segment routing label stacks replace nexthops: when you describe an mpls_route in segment routing, the stack of labels is the path and the nexthops are the labels, so there is no need to have both nexthops and labels.
Can you clarify which of my interpretations is correct, or whether I'm completely wrong? Regards Olivier
On 4/6/17 11:33 AM, Olivier Dugeon wrote:
The 4k limit is on a single mpls_route spec in the kernel. For example, a route like this:
ip -f mpls route add 100 \
    nexthop as to 101/102/.../130 via inet 172.16.1.1 \
    nexthop as to 201/202/.../230 via inet 172.16.2.1 \
    ...
    nexthop as to 3001/3002/.../3030 via inet 172.16.30.1
can only consume 4096 bytes in the kernel. 30'ish nexthops each with 30 labels all in a single MPLS route should be rather excessive - with both the number of labels and the number of nexthops. It has no correlation to packet size, number of routes, number of segment paths, etc.
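A smaller, more typical multipath entry in the same syntax David uses above stays far below that 4096-byte per-route budget; the label values and addresses here are again illustrative only:

  ip -f mpls route add 100 \
      nexthop as to 201/202/203 via inet 172.16.1.1 \
      nexthop as to 301/302/303 via inet 172.16.2.1

Only routes that combine an unusually large number of nexthops with unusually deep label stacks come anywhere near the limit.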
Donald, Wrt PHP, this is incorrect: a PHP node MUST NOT perform an IP lookup, or in fact any lookup, after the POP. In most cases (labeled services, L2/L3 VPN) there are other label(s) in the stack, and looking them up would be fatal. Regards, Jeff
This is correct. By definition, if a router is the penultimate hop, it means the actual egress is downstream and has signaled (advertised) an implicit-null label to this router. The router doing the PHP knows the next hop to forward to (the egress) without doing any additional lookup.
This behavior should already be supported.
What is not supported (if I remember right) is the ability on the egress to terminate a label and perform a (route) lookup. That is needed to really be able to support any L2/L3 VPN service properly.
On Thu, Mar 16, 2017 at 10:45 AM, Jeff Tantsura <jefftant@gmail.com> wrote:
Donald,
Wrt PHP, this is incorrect: a PHP node MUST NOT perform an IP lookup, or in fact any lookup, after the POP. In most cases (labeled services, L2/L3 VPN) there are other label(s) in the stack, and looking them up would be fatal.
Regards, Jeff
On 3/17/17, 10:19 PM, Vivek Venkatraman wrote:
This is correct. By definition, if a router is the penultimate hop, it means the actual egress is downstream and has signaled (advertised) an implicit-null label to this router. The router doing the PHP knows the next hop to forward to (the egress) without doing any additional lookup.
This behavior should already be supported.
What is not supported (if I remember right) is the ability on the egress to terminate a label and perform a (route) lookup. That is needed to really be able to support any L2/L3 VPN service properly.
yes, correct. terminating a label and performing a route lookup is not supported today.
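In iproute2 terms, the behavior described here as already supported is simply an MPLS route with no outgoing label: the kernel pops the incoming label and hands the packet straight to the known nexthop without any further lookup. A minimal sketch, with the sysctl names taken from the kernel's MPLS support and the label, address and device purely illustrative:

  # enable MPLS input on the receiving interface and size the label table
  sysctl -w net.mpls.conf.eth0.input=1
  sysctl -w net.mpls.platform_labels=100000

  # PHP-style entry: pop incoming label 100 and forward to the downstream
  # neighbor; no IP (or further label) lookup happens afterwards
  ip -f mpls route add 100 via inet 172.16.1.2 dev eth0

What such an entry does not cover is the pop-plus-route-lookup case discussed below.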
That's only needed with label-per-VRF mode and, if I recall correctly, with the currently supported/discussed label-per-prefix mode, where the POP is followed by an IP lookup in the VRF context. IMHO that would be really bad (why repeat bad early IOS choices?). The most optimal case for FRR would be label-per-NH allocation, where the label lookup yields a fully resolved adj, with no need for an additional lookup. It would also support the case with >1 NHs in a VRF (multihoming). There are some corner cases, like EIBGP LB + BGP FRR, however we are rather far away from that point in life... Regards, Jeff
On Sat, Mar 18, 2017 at 7:27 AM, Jeff Tantsura <jefftant@gmail.com> wrote:
That's only needed with label per VRF mode, and if I recall correctly in currently supported/discussed label per prefix mode where POP is followed by IP lookup in VRF context. IMHO - would be really bad (why repeat bad early IOS choices?) The most optimal case for FRR would be label per NH allocation, where label lookup should yield fully resolved adj, with no need for additional lookup. It would also support the case with >1 NH's in a VRF (multihoming).
There are some corner cases, like EIBGP LB + BGP FRR, however we are rather far away from that point in life...
Yes, pop+route lookup should not be mandatory for supporting l2/l3 vpns. But apparently Olivier needs this for his Segment Routing implementation, so I can't see why not support this in the Linux kernel. OpenBSD for instance allows pop+route lookup (OpenBGPD's l3vpn implementation relies on that because it allocates one label per VRF/rdomain). -- Renato Westphal
Yes, the pop and forward to next hop satisfies many of the common cases. However, the pop+lookup in VRF would be needed to reach the PE itself, when the PE is aggregating customer routes which are across next hops etc., plus the cases you mention. I'd like the kernel to incorporate support for it, but agree that the lack of it doesn't significantly limit usability. On Sat, Mar 18, 2017 at 3:27 AM, Jeff Tantsura <jefftant@gmail.com> wrote:
That's only needed with label per VRF mode, and if I recall correctly in currently supported/discussed label per prefix mode where POP is followed by IP lookup in VRF context. IMHO - would be really bad (why repeat bad early IOS choices?) The most optimal case for FRR would be label per NH allocation, where label lookup should yield fully resolved adj, with no need for additional lookup. It would also support the case with >1 NH's in a VRF (multihoming).
There are some corner cases, like EIBGP LB + BGP FRR, however we are rather far away from that point in life...
Regards, Jeff
On Sat, Mar 18, 2017 at 2:27 AM, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
On 3/17/17, 10:19 PM, Vivek Venkatraman wrote:
This is correct. By definition, if a router is the penultimate hop, it means the actual egress is downstream and has signaled (advertised) an implicit-null label to this router. The router doing the PHP knows the next hop to forward to (the egress) without doing any additional lookup.
This behavior should already be supported.
What is not supported (if I remember right) is the ability on the egress to terminate a label and perform a (route) lookup. That is needed to really be able to support any L2/L3 VPN service properly.
yes, correct. terminating a label and performing a route lookup is not supported today.
The exception being the ipv4/ipv6 explicit null labels. Please take a look: https://github.com/torvalds/linux/blob/master/net/mpls/af_mpls.c#L1872 -- Renato Westphal
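To exercise that code path, the upstream neighbor has to put the explicit-null label on top, for example by swapping its incoming label to label 0 (IPv4 explicit-null) instead of popping it. A rough sketch, assuming iproute2 accepts the reserved value as an outgoing label (the label 100 and the address are illustrative only):

  # penultimate hop: swap incoming label 100 for IPv4 explicit-null (0); the
  # egress should then pop it and fall through to an IPv4 route lookup
  ip -f mpls route add 100 as 0 via inet 172.16.1.2 dev eth0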
On Sat, Mar 18, 2017 at 9:45 AM, Renato Westphal <renato@opensourcerouting.org> wrote:
The exception being the ipv4/ipv6 explicit null labels. Please take a look: https://github.com/torvalds/linux/blob/master/net/mpls/af_mpls.c#L1872
Correction: the explicit-null labels don't seem to work on Linux. Using ldpd, if I configure the "label local advertise explicit-null" command on all nodes of a network, all LDP sessions flap every 3 minutes (the default holdtime). I can't even ping a directly attached Linux LSR if I push an explicit-null label. It's like the kernel just drops all incoming packets that have an explicit-null label. I'm using a relatively old kernel (v4.4.0), so maybe this was fixed already; will check later. -- Renato Westphal
Hi all, Finally found some time to check this out, and my conclusion is that there's nothing wrong in the Linux kernel w.r.t. the handling of explicit-null labels. In other words, upon receipt of an explicit-null label the kernel is capable of popping it and performing a route lookup. I did some testing using several different kernel versions, ranging from v4.4 to v4.11-rc4, and found no problems at all. The only exception is when I'm using Ubuntu to perform the tests. On Ubuntu, irrespective of the kernel version I'm using, incoming packets labeled with the IPv4 null label are just dropped (the IPv6 null label works fine, though). I couldn't find out why this happens. The output of "ip link afstats" indicates that the packets are not dropped in the MPLS stack, but somehow fail in the route lookup. I believe this might be related to some obscure sysctl that is enabled on Ubuntu by default. Using Debian I don't have the same problem.
Regarding the pop+route lookups that Olivier is interested in, I discovered something interesting. I commented out the following two lines and rebuilt the kernel: https://github.com/torvalds/linux/blob/v4.11-rc4/net/mpls/af_mpls.c#L1778-L1...
Then I looked at the MPLS routing table:
# ip -M ro
0 via link 00:00:00:00:00:00 dev lo proto kernel
2 via link 00:00:00:00:00:00 dev lo proto kernel
Pretty interesting huh?
Then I discovered that adding LSPs like this is enough to perform pop + route lookup:
# ip -M route add 16 dev lo
I would like to hear from our kernel developers if we can consider this a solution to Olivier's 2nd problem or if it's just a workaround. Cheers. -- Renato Westphal
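For completeness, a neighboring box could exercise such an entry by pushing label 16 towards this router, e.g. with an ingress route along these lines (prefix, label and addresses are illustrative only):

  # on a neighbor: send traffic for 192.0.2.0/24 with label 16 pushed, so the
  # box above pops it on 'lo' and resolves the inner IPv4 packet by route lookup
  ip route add 192.0.2.0/24 encap mpls 16 via 10.0.0.2 dev eth0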
On 4/3/17 8:00 PM, Renato Westphal wrote:
Regarding the pop+route lookups that Olivier is interested in, I discovered something interesting. I commented out the following two lines and rebuilt the kernel: https://github.com/torvalds/linux/blob/v4.11-rc4/net/mpls/af_mpls.c#L1778-L1...
Then I looked at the MPLS routing table:
# ip -M ro
0 via link 00:00:00:00:00:00 dev lo proto kernel
2 via link 00:00:00:00:00:00 dev lo proto kernel
Pretty interesting huh?
Then I discovered that adding LSPs like this is enough to perform pop + route lookup: # ip -M route add 16 dev lo
I would like to hear from our kernel developers if we can consider this as a solution to Olivier's 2nd problem or if it's just a workaround.
I'll take a look next week; at netconf/netdev this week.
On 4/3/17 6:00 PM, Renato Westphal wrote:
Then I discovered that adding LSPs like this is enough to perform pop + route lookup: # ip -M route add 16 dev lo
I would like to hear from our kernel developers if we can consider this as a solution to Olivier's 2nd problem or if it's just a workaround.
Yes, that works just fine for both 'lo' and VRF devices. I see fib lookups happening in the right table for each, yes, I think this is an acceptable 'pop and lookup' solution.
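Given that confirmation, the VRF flavor would presumably look like the 'lo' case with the VRF device as the output interface; the VRF name, table id and labels below are illustrative only, not taken from the thread:

  # pop label 16 and re-run the lookup in the main table
  ip -M route add 16 dev lo

  # pop label 17 and re-run the lookup in the table bound to VRF "red"
  ip link add red type vrf table 10
  ip link set red up
  ip -M route add 17 dev red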
+Thomas to be on track. On 18 March 2017 at 06:19:47, Vivek Venkatraman <vivek@cumulusnetworks.com> wrote:
This is correct. By definition, if a router is the penultimate hop, it means the actual egress is downstream and has signaled (advertised) an implicit-null label to this router. The router doing the PHP knows the next hop to forward to (the egress) without doing any additional lookup.
This behavior should already be supported.
What is not supported (if I remember right) is the ability on the egress to terminate a label and perform a (route) lookup. That is needed to really be able to support any L2/L3 VPN service properly.
Hi everyone, 2017-03-18, Vincent Jardin:
+Thomas to be on track.
Vincent, you'll tell if what is below helped or not :)
On 18 March 2017 at 06:19:47, Vivek Venkatraman <vivek@cumulusnetworks.com> wrote:
This is correct. By definition, if a router is the penultimate hop, it means the actual egress is downstream and has signaled (advertised) an implicit-null label to this router. The router doing the PHP knows the next hop to forward to (the egress) without doing any additional lookup.
(Note that with BGP/MPLS VPNs this is the typical behavior, but it is not a mandatory behavior: the egress router may have advertised a real label (i.e. not implicit null), in which case the penultimate router will swap the topmost label of the stack, never seeing/touching the VPN label. This is also a behavior that relates to the use of MPLS for transit; with MPLS-over-GRE or MPLS-over-UDP, MPLS can be used over IP transit, in which case this behavior is not used.)
This behavior should already be supported.
Yes, I can confirm that forwarding via a neighbor on an interface based on the incoming MPLS label is supported. This is what we use in bagpipe IP VPN 'linux' driver [1].
What is not supported (if I remember right) is the ability on the egress to terminate a label and perform a (route) lookup. That is needed to really be able to support any L2/L3 VPN service properly.
I think it is a requirement to have something efficient to trigger a lookup in any {routing table, vrf interface, netns}. I hadn't tried it (because I had no need), but I thought we might achieve something like that by forwarding the packet on 'lo', or on a vrf interface, or on a veth device: wouldn't this kind of next-hop specification trigger re-entry of the packet into the IP stack after the pop operation? -Thomas [1] http://git.openstack.org/cgit/openstack/networking-bagpipe/tree/networking_b...
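To tie Thomas's first point to a concrete entry: when the egress advertised a real (non-null) label, the penultimate hop just swaps the topmost label and never looks at the inner VPN label. A minimal sketch, with the label values, address and device illustrative only:

  # plain swap at the penultimate hop; the inner VPN label is untouched
  ip -f mpls route add 100 as 200 via inet 172.16.1.2 dev eth0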
Hey Thomas, please see inline Cheers, Jeff
On Mar 20, 2017, at 09:38, Thomas Morin <thomas.morin@orange.com> wrote:
Hi everyone,
2017-03-18, Vincent Jardin:
+Thomas to be on track.
Vincent, you'll tell if what is below helped or not :)
On 18 March 2017 at 06:19:47, Vivek Venkatraman <vivek@cumulusnetworks.com> wrote:
This is correct. By definition, if a router is the penultimate hop, it means the actual egress is downstream and has signaled (advertised) an implicit-null label to this router. The router doing the PHP knows the next hop to forward to (the egress) without doing any additional lookup.
(Note that with BGP/MPLS VPNs this is the typical behavior, but it is not a mandatory behavior: the egress router may have advertised a real label (i.e. not implicit null), in which case the penultimate router will swap the topmost label of the stack, never seeing/touching the VPN label. This is also a behavior that relates to the use of MPLS for transit; with MPLS-over-GRE or MPLS-over-UDP, MPLS can be used over IP transit, in which case this behavior is not used.)
[jeff] to better phrase it - no router besides the one allocating the service labels may look up/touch them; this is true for PHP or any other case. In the PHP case, after the outermost label has been looked up and a fully resolved adj provided, the rest of the label stack (the payload) MUST NOT be looked up. In the SR case, after an Adj-SID has been popped, the packet must be sent out of the interface associated with the Adj, without any additional processing.
This behavior should already be supported.
Yes, I can confirm that forwarding via a neighbor on an interface based on the incoming MPLS label is supported. This is what we use in bagpipe IP VPN 'linux' driver [1].
What is not supported (if I remember right) is the ability on the egress to terminate a label and perform a (route) lookup. That is needed to really be able to support any L2/L3 VPN service properly.
I think it is a requirement to have something efficient to trigger a lookup in any {routing table, vrf interface, netns}.
I hadn't tried (because no need). I thought we might achieve something like that by forwarding the packet on 'lo', or on a vrf interface, or on a veth device: wouldn't this kind of next hop specification trigger a re-enter of the packet in the IP stack after the pop operation?
[jeff] wouldn't this be a tad inefficient? :)
-Thomas
[1] http://git.openstack.org/cgit/openstack/networking-bagpipe/tree/networking_bagpipe/bagpipe_bgp/vpn/ipvpn/mpls_linux_dataplane.py#n194
Hi Jeff, 2017-03-20, Jeff Tantsura:
I think it is a requirement to have something efficient to trigger a lookup in any {routing table, vrf interface, netns}.
I hadn't tried (because no need). I thought we might achieve something like that by forwarding the packet on 'lo', or on a vrf interface, or on a veth device: wouldn't this kind of next hop specification trigger a re-enter of the packet in the IP stack after the pop operation ?
[jeff] wouldn't this be a tad inefficient? :)
Well, I was not implying that the above would be efficient, and I would actually have the same concern as you. But to be honest I also lack hard facts to back this concern: e.g. I don't know whether going through a vrf interface has a small or a high cost. Best, -Thomas
Thanks Thomas. Before we write code, let's agree on the architecture; there are always tradeoffs, and a pros/cons analysis on such an important topic would be in order. Could someone please update me wrt SRv6 (SRH) - what's supported / what's planned? Thanks! Cheers, Jeff
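On the SRv6 question: if I remember right, kernels from 4.10 onwards ship an SRH implementation exposed through the 'seg6' lightweight tunnel in iproute2, along the lines of the sketch below (this is from memory rather than from this thread, and the addresses and device are illustrative only):

  # encapsulate traffic to fc00:a::/64 in an outer IPv6 header carrying an SRH
  # with two segments
  ip -6 route add fc00:a::/64 encap seg6 mode encap segs fc00:1::1,fc00:2::1 dev eth0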
Hi all, Kind of an aside question about MPLS implementation in FRR; On Mon, Mar 20, 2017 at 5:38 PM, Thomas Morin <thomas.morin@orange.com> wrote:
Hi everyone,
2017-03-18, Vincent Jardin:
+Thomas to be on track.
Vincent, you'll tell if what is below helped or not :)
On 18 March 2017 at 06:19:47, Vivek Venkatraman <vivek@cumulusnetworks.com> wrote:
This is correct. By definition, if a router is the penultimate hop, it means the actual egress is downstream and has signaled (advertised) an implicit-null label to this router. The router doing the PHP knows the next hop to forward to (the egress) without doing any additional lookup.
(Note that with BGP/MPLS VPNs this is the typical behavior, but it is not a mandatory behavior: the egress router may have advertised a real label (i.e. not implicit null), in which case the penultimate router will swap the topmost label of the stack, never seeing/touching the VPN label. This is also a behavior that relates to the use of MPLS for transit; with MPLS-over-GRE or MPLS-over-UDP, MPLS can be used over IP transit, in which case this behavior is not used.)
In regular MPLS processing (no SR), does FRR currently support advertising any label other than implicit NULL for MPLS traffic terminating on the router itself? And for L2VPNs? thanks Marc
It'll pass whatever label is passed on registration. I can send an example... Lou On March 27, 2017 4:32:04 PM Marc Sune <marc@voltanet.io> wrote:
In regular MPLS processing (no SR), does FRR currently support advertising any label other than implicit NULL for MPLS traffic termination(itself)? And for L2VPNs?
thanks Marc
Hi Marc, On 3/27/2017 4:30 PM, Marc Sune wrote:
In regular MPLS processing (no SR), does FRR currently support advertising any label other than implicit NULL for MPLS traffic termination(itself)? And for L2VPNs?
If you're using the BGP RFAPI, labels can be provided as a VN option during registers (from rfapi.h: struct rfapi_l2address_option / RFAPI_VN_OPTION_TYPE_L2ADDR). Here's a code sample:

rfapi_register_action action;
struct rfapi_vn_option vo;
struct rfapi_l2address_option *l2o = NULL;

action = (add ? RFAPI_REGISTER_ADD : RFAPI_REGISTER_WITHDRAW);

memset (&vo, 0, sizeof (vo));
vo.type = RFAPI_VN_OPTION_TYPE_L2ADDR;
l2o = &vo.v.l2addr;
l2o->macaddr = <mac|0=IP+label>;
l2o->label = <label>;
l2o->logical_net_id = <EVPN Ethernet Segment Id>;  // must be > 0 for L2 mac registration
l2o->local_nve_id = (uint8_t) <0|redundant_port_id>;
l2o->tag_id = <0=default|2 bytes copied to additional RT>;

return rfapi_register (rfd, prefix, lifetime, NULL, &vo, action);

Also there is some related (optional) configuration; see the VNC L2 configuration manual section (http://labn.net/frr.html/VNC-L2-Group-Configuration.html#VNC-L2-Group-Config...)

Lou
participants (13)
- Amine Kherbouche
- David Ahern
- Donald Sharp
- Jeff Tantsura
- Lou Berger
- Marc Sune
- Olivier Dugeon
- olivier.dugeon@orange.com
- Renato Westphal
- Roopa Prabhu
- Thomas Morin
- Vincent Jardin
- Vivek Venkatraman