Changing BGP route map mid-session causes zebra to spin out of control
TL;DR: Having a '0.0.0.0/0 le 29' in a BGP prefix-list on a BGP session that has a full table can freak out zebra. If I go in and change a route-map, the box load goes to 100+ and starts dropping packets.

The route-map had this prefix-list line

    ip prefix-list DEFAULT-IN-ip4 seq 99 permit 0.0.0.0/0 le 29

that seemed to break things, but changing it to:

    ip prefix-list DEFAULT-IN-ip4 seq 99 permit 0.0.0.0/0 le 24

fixes it.

Repeatable: launch FRR with

    route-map COGENT-IN deny 10
     match ip address prefix-list AS32329-ip4
    route-map COGENT-IN permit 20
     match ip address prefix-list COGENT-TEST-ONE-BLOCK
     set local-preference 100
     set weight 101

Wait a few minutes, then change to route-map like this:

    orange(config)# route-map COGENT-IN permit 20
    orange(config-route-map)# match ip address prefix-list DEFAULT-IN-ip4
    orange(config-route-map)# set local-preference 100
    orange(config-route-map)# set weight 101

And then load goes over 100.

Here are the lists....

    ip prefix-list COGENT-TEST-ONE-BLOCK seq 10 permit 38.0.0.0/8 le 24
    ip prefix-list COGENT-TEST-ONE-BLOCK seq 20 deny any

    ip prefix-list DEFAULT-IN-ip4 seq 11 deny 10.0.0.0/8 le 32
    ip prefix-list DEFAULT-IN-ip4 seq 12 deny 172.16.0.0/12 le 32
    ip prefix-list DEFAULT-IN-ip4 seq 13 deny 192.168.0.0/16 le 32
    ip prefix-list DEFAULT-IN-ip4 seq 14 deny 169.254.0.0/16 le 32
    ip prefix-list DEFAULT-IN-ip4 seq 15 deny 127.0.0.0/8 le 32
    ip prefix-list DEFAULT-IN-ip4 seq 99 permit 0.0.0.0/0 le 29

Output of truss on 'zebra' (I don't claim to really understand this output, other than reading 'Resource temporarily unavailable'):
# truss -p 919
seteuid(0xa8) = 0 (0x0)
read(6,"\M-H\0\^E\^B\^F\0\0\0B\M^@\0\0\a"...,1192) = 200 (0xc8)
seteuid(0x0) = 0 (0x0)
write(4,"\^A",1) = 1 (0x1)
getrusage(RUSAGE_THREAD,{ u=27.163188,s=34.122484,in=5,out=7 }) = 0 (0x0)
write(7,"\M-H\0\^E\^B\^F\0\0\0\^A\M^@\0\0"...,200) = 200 (0xc8)
poll({ 6/POLLIN 12/POLLIN 15/POLLIN 16/POLLIN 3/POLLIN },5,0) = 2 (0x2)
read(3,"\^A",64) = 1 (0x1)
read(3,0x7fffffffe9b0,64) ERR#35 'Resource temporarily unavailable'
write(7,"\M-H\0\^E\^A\^D\0\0\0\^C\M^@\0\0"...,200) = 200 (0xc8)
seteuid(0xa8) = 0 (0x0)
seteuid(0x0) = 0 (0x0)
write(7,"\M-H\0\^E\^B\^F\0\0\0\^A\M^@\0\0"...,200) = 200 (0xc8)
getrusage(RUSAGE_THREAD,{ u=27.163263,s=34.122484,in=5,out=7 }) = 0 (0x0)
write(4,"\^A",1) = 1 (0x1)
getrusage(RUSAGE_THREAD,{ u=27.163279,s=34.122484,in=5,out=7 }) = 0 (0x0)
write(7,"\M-H\0\^E\^A\^D\0\0\0\^C\M^@\0\0"...,200) = 200 (0xc8)
getrusage(RUSAGE_THREAD,{ u=27.163294,s=34.122484,in=5,out=7 }) = 0 (0x0)
seteuid(0xa8) = 0 (0x0)
read(6,"\M-H\0\^E\^A\^D\0\0\0C\M^@\0\0\a"...,1192) = 200 (0xc8)
write(4,"\^A",1) = 1 (0x1)
getrusage(RUSAGE_THREAD,{ u=27.163335,s=34.122484,in=5,out=7 }) = 0 (0x0)
seteuid(0x0) = 0 (0x0)
poll({ 6/POLLIN 12/POLLIN 15/POLLIN 16/POLLIN 3/POLLIN },5,0) = 2 (0x2)
read(3,"\^A\^A",64) = 2 (0x2)
read(3,0x7fffffffe9b0,64) ERR#35 'Resource temporarily unavailable'
write(7,"\M-H\0\^E\^B\^F\0\0\0\^A\M^@\0\0"...,200) = 200 (0xc8)
getrusage(RUSAGE_THREAD,{ u=27.163391,s=34.122484,in=5,out=7 }) = 0 (0x0)
write(4,"\^A",1) = 1 (0x1)
getrusage(RUSAGE_THREAD,{ u=27.163421,s=34.122484,in=5,out=7 }) = 0 (0x0)
write(7,"\M-H\0\^E\^A\^D\0\0\0\^C\M^@\0\0"...,200) = 200 (0xc8)
getrusage(RUSAGE_THREAD,{ u=27.163444,s=34.122484,in=5,out=7 }) = 0 (0x0)
read(6,"\M-H\0\^E\^B\^F\0\0\0B\M^@\0\0\a"...,1192) = 200 (0xc8)
seteuid(0xa8) = 0 (0x0)
write(4,"\^A",1) = 1 (0x1)
seteuid(0x0) = 0 (0x0)
getrusage(RUSAGE_THREAD,{ u=27.163492,s=34.122484,in=5,out=7 }) = 0 (0x0)
poll({ 6/POLLIN 12/POLLIN 15/POLLIN 16/POLLIN 3/POLLIN },5,0) = 2 (0x2)
read(3,"\^A\^A",64) = 2 (0x2)
write(7,"\M-H\0\^E\^B\^F\0\0\0\^A\M^@\0\0"...,200) = 200 (0xc8)
read(3,0x7fffffffe9b0,64) ERR#35 'Resource temporarily unavailable'
write(7,"\M-H\0\^E\^A\^D\0\0\0\^C\M^@\0\0"...,200) = 200 (0xc8)

I'm not positive I figured this out... maybe it is late at night and traffic levels are down.

Thanks,
Rudy
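
For reference, here is a rough Python sketch of how the DEFAULT-IN-ip4 list selects routes under the usual prefix-list semantics (first match wins, 'le N' accepts any prefix length from the entry's own length up to N, anything unmatched is denied). It is only an illustration of the 'le 29' versus 'le 24' difference, not FRR's actual matching code; the names prefix_list_permits and with_le are made up for this example, and the sample routes are arbitrary.

```python
import ipaddress

# (action, prefix, le) entries copied from the DEFAULT-IN-ip4 list in this post.
DEFAULT_IN_IP4 = [
    ("deny", "10.0.0.0/8", 32),
    ("deny", "172.16.0.0/12", 32),
    ("deny", "192.168.0.0/16", 32),
    ("deny", "169.254.0.0/16", 32),
    ("deny", "127.0.0.0/8", 32),
    ("permit", "0.0.0.0/0", 29),
]


def prefix_list_permits(entries, route):
    """Hypothetical helper: first matching entry decides, no match means deny."""
    net = ipaddress.ip_network(route)
    for action, prefix, le in entries:
        base = ipaddress.ip_network(prefix)
        # The route must sit inside the entry's prefix and have a length
        # between the entry's own length and its 'le' bound.
        if net.subnet_of(base) and base.prefixlen <= net.prefixlen <= le:
            return action == "permit"
    return False


def with_le(entries, new_le):
    """Hypothetical helper: copy the list with the final entry's 'le' changed."""
    action, prefix, _ = entries[-1]
    return entries[:-1] + [(action, prefix, new_le)]


if __name__ == "__main__":
    le24 = with_le(DEFAULT_IN_IP4, 24)
    for route in ("38.0.0.0/8", "38.1.2.0/24", "38.1.2.0/25", "10.1.0.0/16"):
        print(route,
              "le 29:", prefix_list_permits(DEFAULT_IN_IP4, route),
              "le 24:", prefix_list_permits(le24, route))
```

Under these assumptions the only difference between the two versions is that 'le 24' stops accepting /25 through /29 routes; whether that set of routes is what actually triggers the zebra load is not something this sketch can show.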
participants (1)
-
Rudy Rucker