Anyone interested in meeting up this week? How about continuing the traditional Thursday lunchtime in the hotel lobby?
Lou
PS I'd be happy to discuss the 1.x v4 BGP issue that some have reported...
I'll be there.
donald
On Mon, Mar 27, 2017 at 6:40 AM, Lou Berger <lberger@labn.net> wrote:
Anyone interested in meeting up this week? How about continuing the traditional Thursday lunchtime in the hotel lobby?
Lou
PS I'd be happy to discuss the 1.x v4 BGP issue that some have reported...
_______________________________________________ frr mailing list frr@lists.nox.tf https://lists.nox.tf/listinfo/frr
I’ll join.
PS: Which “1.x v4 BGP issue” ? (Pointer?) Quagga 1.x BGP issue?
- Martin
On 27 Mar 2017, at 4:40, Lou Berger wrote:
Anyone interested in meeting up this week? How about continuing the traditional Thursday lunchtime in the hotel lobby?
Lou
PS I'd be happy to discuss the 1.x v4 BGP issue that some have reported...
I thought I was referring to https://bugzilla.quagga.net/show_bug.cgi?id=870 but, on rereading the description, the problem I've been looking into only impacts v4, so it is probably a different one...
On 3/27/2017 2:13 PM, Martin Winter wrote:
I’ll join.
PS: Which “1.x v4 BGP issue” ? (Pointer?) Quagga 1.x BGP issue?
- Martin
On 3/27/2017 3:15 PM, Lou Berger wrote:
I thought I was referring to https://bugzilla.quagga.net/show_bug.cgi?id=870 but, on rereading the description, the problem I've been looking into only impacts v4, so it is probably a different one...
FYI, I see this as well on ipv4 prefixes with Quagga 1.2.1. Not sure if it's the same bug as in bugID 870 or not, but the behaviour is similar, and the problem router also shows a positive number in the OutQ column of the bgp summary.
One other thing I noticed-- if, on my problematic router, I bring up all my ibgp peers first and then bring up my external peers, my ibgp peers see close to 120K prefixes (which they should). If for some reason an ibgp peer gets reset, after that it will only see a subset of the prefixes (20K). In the meantime, the peers that have not lost their initial session will slowly, over a period of days, start to lose prefixes they should be seeing.
---Mike
-- ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/
Mike,
Do you happen to know if this happens with FRR as well?
- Martin
On 27 Mar 2017, at 12:24, Mike Tancsa wrote:
On 3/27/2017 3:15 PM, Lou Berger wrote:
I thought I was referring to https://bugzilla.quagga.net/show_bug.cgi?id=870 but, on rereading the description, the problem I've been looking into only impacts v4, so it is probably a different one...
FYI, I see this as well on ipv4 prefixes with Quagga 1.2.1. Not sure if it's the same bug as in bugID 870 or not, but the behaviour is similar, and the problem router also shows a positive number in the OutQ column of the bgp summary.
One other thing I noticed-- if, on my problematic router, I bring up all my ibgp peers first and then bring up my external peers, my ibgp peers see close to 120K prefixes (which they should). If for some reason an ibgp peer gets reset, after that it will only see a subset of the prefixes (20K). In the meantime, the peers that have not lost their initial session will slowly, over a period of days, start to lose prefixes they should be seeing.
---Mike
On 3/27/2017 3:37 PM, Martin Winter wrote:
Mike,
Do you happen to know if this happens with FRR as well?
I was hoping to test. I don't have an easy way to reproduce it in the lab, but I could probably run it on the problem box during an outage window late at night. It only takes a few min to see the problem come up.
Can I test against frr-2.0-rc2 ? Any debugging you want me to turn on when I test it ?
---Mike
On 27 Mar 2017, at 13:07, Mike Tancsa wrote:
On 3/27/2017 3:37 PM, Martin Winter wrote:
Mike,
Do you happen to know if this happens with FRR as well?
I was hoping to test. I don't have an easy way to reproduce it in the lab, but I could probably run it on the problem box during an outage window late at night. It only takes a few min to see the problem come up.
Can I test against frr-2.0-rc2 ? Any debugging you want me to turn on when I test it ?
I think the first step would be to just see if it still exists. I did some troubleshooting in the past, trying to reproduce this (or a very similar) issue on Quagga, but failed to reproduce it.
So if it happens on FRR-2.0-rc2 (latest stable/2.0 branch), then I really want to dig into it and try to reproduce it (and get it fixed).
- Martin
On 3/27/2017 4:39 PM, Martin Winter wrote:
So if it happens on FRR-2.0-rc2 (latest stable/2.0 branch), then I really want to dig into it and try to reproduce it (and get it fixed).
I am having some difficulty building it for FreeBSD. I am not an automake expert. Not sure how to bootstrap the build ?
I do
autoreconf -i
then
./configure --disable-rusage --disable-rtadv --prefix="/usr/local/"
it errors with
checking for json-c/json.h... no
checking for json_object_get in -ljson-c... no
checking for json_object_get in -ljson... no
configure: error: lib json is needed to compile
despite having them installed
root@backuprouter:/tmp/frr/tags/2 # ls -l /usr/local/lib/ | grep json
-rw-r--r-- 1 root wheel 63432 Feb 25 06:29 libfastjson.a
lrwxr-xr-x 1 root wheel 20 Feb 25 06:29 libfastjson.so -> libfastjson.so.4.0.0
lrwxr-xr-x 1 root wheel 20 Feb 25 06:29 libfastjson.so.4 -> libfastjson.so.4.0.0
-rwxr-xr-x 1 root wheel 38688 Feb 25 06:29 libfastjson.so.4.0.0
-rw-r--r-- 1 root wheel 269364 Feb 25 17:30 libjson++.a
lrwxr-xr-x 1 root wheel 16 Feb 25 17:30 libjson++.so -> libjson++.so.0.5
lrwxr-xr-x 1 root wheel 18 Feb 25 17:30 libjson++.so.0.5 -> libjson++.so.0.5.0
-rwxr-xr-x 1 root wheel 92184 Feb 25 17:30 libjson++.so.0.5.0
-rw-r--r-- 1 root wheel 75172 Mar 28 11:48 libjson-c.a
lrwxr-xr-x 1 root wheel 18 Mar 28 11:48 libjson-c.so -> libjson-c.so.2.0.2
lrwxr-xr-x 1 root wheel 18 Mar 28 11:48 libjson-c.so.2 -> libjson-c.so.2.0.2
-rwxr-xr-x 1 root wheel 44640 Mar 28 11:48 libjson-c.so.2.0.2
lrwxr-xr-x 1 root wheel 27 Feb 25 01:13 libjson-glib-1.0.so -> libjson-glib-1.0.so.0.102.0
lrwxr-xr-x 1 root wheel 27 Feb 25 01:13 libjson-glib-1.0.so.0 -> libjson-glib-1.0.so.0.102.0
-rwxr-xr-x 1 root wheel 160176 Feb 25 01:13 libjson-glib-1.0.so.0.102.0
-rw-r--r-- 1 root wheel 434058 Feb 24 20:24 libjsoncpp.a
lrwxr-xr-x 1 root wheel 19 Feb 24 20:24 libjsoncpp.so -> libjsoncpp.so.1.8.0
lrwxr-xr-x 1 root wheel 19 Feb 24 20:24 libjsoncpp.so.1 -> libjsoncpp.so.1.8.0
-r--r--r-- 1 root wheel 238616 Feb 24 20:24 libjsoncpp.so.1.8.0
lrwxr-xr-x 1 root wheel 13 Feb 24 20:46 libqjson.so -> libqjson.so.0
lrwxr-xr-x 1 root wheel 17 Feb 24 20:46 libqjson.so.0 -> libqjson.so.0.9.0
-rwxr-xr-x 1 root wheel 196856 Feb 24 20:46 libqjson.so.0.9.0
root@backuprouter:/tmp/frr/tags/2 # ls -l /usr/local/in
include/ info/
root@backuprouter:/tmp/frr/tags/2 # ls -l /usr/local/include/ | grep json
drwxr-xr-x 2 root wheel 12 Mar 27 16:13 json
drwxr-xr-x 2 root wheel 17 Mar 28 11:48 json-c
drwxr-xr-x 3 root wheel 3 Mar 27 16:15 json-glib-1.0
drwxr-xr-x 3 root wheel 3 Mar 27 16:14 jsoncpp
drwxr-xr-x 2 root wheel 9 Mar 29 16:29 libfastjson
drwxr-xr-x 2 root wheel 8 Mar 29 16:30 qjson
root@backuprouter:/tmp/frr/tags/2 #
-- ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/
On 3/27/2017 4:39 PM, Martin Winter wrote:
I think the first step would be to just see if it still exists. I did some troubleshooting in the past, trying to reproduce this (or a very similar) issue on Quagga, but failed to reproduce it.
So if it happens on FRR-2.0-rc2 (latest stable/2.0 branch), then I really want to dig into it and try to reproduce it (and get it fixed).
OK, sort of good news, sort of indeterminate news. I compiled up a version on FreeBSD #11 stable. The problem might not be there. However, I was not able to bring up all my peers as bgp passwords do not seem to work. I have about 9 peers that I use passwords with and it's possible those peers might be triggering the problem. But with the other 20+ peers, I was not able to see the issue, at least for the 15 min or so that I have had the peers up so far.
Other minor details I noticed-- from the configs I cp'd over from Quagga, for whatever reason
#1 bgp log-neighbor-changes disappeared
I re-added it and it seems to work
#2 peers seem to take a long time to come up. Suspiciously, about 180 secs after a hard clear or start up
#3 show ip bgp sum displays all the peers out of order ?
Anyways, I will need to get the bgp passwords working before I am more confident to say whether the bug is still there or not.
The configure log seems to say TCP MD5 support is detected:
configure:18185: checking whether TCP_MD5SIG is declared
configure:18185: cc -c -g -Os -fno-omit-frame-pointer -Wall -Wextra -Wmissing-prototypes -Wmissing-declarations -Wpointer-arith -Wbad-function-cast -Wwrite-strings -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -I/usr/local/include -static conftest.c >&5
configure:18185: $? = 0
configure:18185: result: yes
...
#define HAVE_DECL_TCP_MD5SIG 1
| /* end confdefs.h. */
| #include <sys/utsname.h>
---Mike
On 3/31/2017 9:47 AM, Mike Tancsa wrote:
OK, sort of good news, sort of indeterminate news. I compiled up a version on FreeBSD #11 stable. The problem might not be there. However, I was not able to bring up all my peers as bgp passwords do not seem to work. I have about 9 peers that I use passwords with and it's possible those peers might be triggering the problem. But with the other 20+ peers, I was not able to see the issue, at least for the 15 min or so that I have had the peers up so far.
OK, I managed to get tcp md5 working. I think the issue is with the new interface for ipsec in FreeBSD. I updated the kernel at the same time, and something broke md5 for both quagga and frr. I reverted to the old kernel and bgp passwords are working.
However, the issue with the peers being out of order also seems to have affected the bgp config that I brought over, as it too is all mangled. Parts of the config are saved out of order and I think it's broken my prefix lists for some peers, so I will have to shut this test down for today.
But on the plus side, I did NOT see any evidence of the Quagga bug. I was able to hard clear an ibgp peer and all the routes came back as expected. It took just over 2 min for the session to come up, but it did and the outQ stayed at zero.
---Mike
Mike,
Can you open an issue ( https://github.com/FRRouting/frr/issues ) for the out-of-order problem and include the output?
Also, some defaults were changed earlier (which would explain not seeing some of the defaults in your config), but got changed back in a commit yesterday to match the “traditional” quagga. (Unless you do a “--enable-cumulus”, which would set some defaults differently.)
- Martin
On 31 Mar 2017, at 7:29, Mike Tancsa wrote:
On 3/31/2017 9:47 AM, Mike Tancsa wrote:
OK, sort of good news, sort of indeterminate news. I compiled up a version on FreeBSD #11 stable. The problem might not be there. However, I was not able to bring up all my peers as bgp passwords do not seem to work. I have about 9 peers that I use passwords with and it's possible those peers might be triggering the problem. But with the other 20+ peers, I was not able to see the issue, at least for the 15 min or so that I have had the peers up so far.
OK, I managed to get tcp md5 working. I think the issue is with the new interface for ipsec in FreeBSD. I updated the kernel at the same time, and something broke md5 for both quagga and frr. I reverted to the old kernel and bgp passwords are working.
However, the issue with the peers being out of order also seems to have affected the bgp config that I brought over, as it too is all mangled. Parts of the config are saved out of order and I think it's broken my prefix lists for some peers, so I will have to shut this test down for today.
But on the plus side, I did NOT see any evidence of the Quagga bug. I was able to hard clear an ibgp peer and all the routes came back as expected. It took just over 2 min for the session to come up, but it did and the outQ stayed at zero.
---Mike
On 3/31/2017 11:52 AM, Martin Winter wrote:
Mike,
Can you open an issue ( https://github.com/FRRouting/frr/issues ) for the out-of-order problem and include the output?
Actually it seems to be just a different sorting order. I am used to seeing things sorted by IP address in show ip bgp sum and now they seem to be sorted by peergroup, then by peer IP ?
---Mike
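For illustration only, the two orderings described in the message above can be sketched in a few lines of Python. The peer groups and addresses are hypothetical documentation values, not taken from anyone's config, and nothing in the thread confirms how Quagga or FRR actually implements the ordering; this is a sketch of the observed difference, not of either daemon's code.

```python
import ipaddress

# Hypothetical peers as (peer_group, address) pairs; illustrative values only.
peers = [
    ("transit", "192.0.2.10"),
    ("ibgp", "198.51.100.2"),
    ("transit", "192.0.2.9"),
    ("ibgp", "10.0.0.1"),
]


def by_address(peer_list):
    """Order peers numerically by IP address only."""
    return sorted(peer_list, key=lambda p: ipaddress.ip_address(p[1]))


def by_group_then_address(peer_list):
    """Order peers by peer-group name first, then numerically by IP address."""
    return sorted(peer_list, key=lambda p: (p[0], ipaddress.ip_address(p[1])))


if __name__ == "__main__":
    print("sorted by address:")
    for group, addr in by_address(peers):
        print(f"  {addr:<15} {group}")
    print("sorted by peer group, then address:")
    for group, addr in by_group_then_address(peers):
        print(f"  {addr:<15} {group}")
```

With the same set of peers, the second ordering keeps each group together, so a list that used to read top to bottom by address can look shuffled even though no peer is missing.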
Interesting. The problem I was seeing had to do with a default route learned via BGP and redistributed via OSPF not being redistributed properly *sometimes*, and changes in system timing resulted in changes in behavior (from very rare to all the time). I think I have a reliable way to reproduce it, so I plan on trying to run it to ground post IETF...
Lou
On 3/27/2017 3:24 PM, Mike Tancsa wrote:
On 3/27/2017 3:15 PM, Lou Berger wrote:
I thought I was referring to https://bugzilla.quagga.net/show_bug.cgi?id=870 but, on rereading the description, the problem I've been looking into only impacts v4, so it is probably a different one...
FYI, I see this as well on ipv4 prefixes with Quagga 1.2.1. Not sure if it's the same bug as in bugID 870 or not, but the behaviour is similar, and the problem router also shows a positive number in the OutQ column of the bgp summary.
One other thing I noticed-- if, on my problematic router, I bring up all my ibgp peers first and then bring up my external peers, my ibgp peers see close to 120K prefixes (which they should). If for some reason an ibgp peer gets reset, after that it will only see a subset of the prefixes (20K). In the meantime, the peers that have not lost their initial session will slowly, over a period of days, start to lose prefixes they should be seeing.
---Mike
Hi - given the likely logistics issue of finding a place for folks during lunch, I reserved a room -- Vevey 4.
It is bring your own -- see you there anytime between 11:30-1:00 (I'll be there a few minutes late as I'm going to get my lunch right after the 1st session)...
Lou
On 3/27/2017 7:40 AM, Lou Berger wrote:
Anyone interested in meeting up this week? How about continuing the traditional Thursday lunchtime in the hotel lobby?
Lou
PS I'd be happy to discuss the 1.x v4 BGP issue that some have reported...
participants (4): Donald Sharp, Lou Berger, Martin Winter, Mike Tancsa