regarding - ospf-loop issue with the given topology

Palpandi Perumal palp at pluribusnetworks.com
Tue Nov 12 07:26:32 EST 2019


Hi All,
Please find the topology attached.

Whenever the node "uspine02" is rebooted, we intermittently end up in an
OSPF loop.
We see one anomalous sequence of events, but we have not been able to
identify what triggers it.
Once we were in the problematic state, the debug log showed the following.
The rebooted us-spine2 decides that the received network-LSA was originated
by itself, so it does not flood that LSA in the broadcast domain and instead
sets MaxAge on the copy it has already installed. In this case us-spine2
does not send an LS Ack, so gh-core1 adds the LSA to its retransmit list
again and processes it at the next 10s interval. In the meantime, us-spine2
broadcasts the MaxAge LSA to gh-core1 to remove that route. Once the route
has been removed by the MaxAge network-LSA from us-spine2, us-spine1
immediately sends the proper network-LSA to gh-core1 to re-install the
route. After that, gh-core1 processes its retransmit list again and sends
that network-LSA to us-spine2, which again sees it as a self-originated LSA
and repeats the steps above. This sequence is what put our switches into
that state.

It is a timing problem, which is why we do not hit it consistently.
*Root cause:*
Before us-spine2 rebooted, it would have been the DR and would have
originated a network-LSA for this broadcast domain in area 204.
gh-core1 would have received that LSA, considered it a proper LSA, and
installed it. gh-core1 then floods that LSA into area 204 again; by that
time us-spine2 would have come back up as BDR, so it receives the same LSA
and treats it as a self-originated LSA.

*Potential fix based on RFC 2328 section 13:*
Before flooding an LSA to the broadcast domain, check whether the received
LSA is self-originated.
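
For reference, the self-origination check mentioned above can be sketched
roughly as follows. This is a minimal, standalone illustration of the rule in
RFC 2328 section 13.4, not FRR code: the types `struct router` and
`struct lsa_hdr`, the single `is_dr` flag, and the function names are all
hypothetical simplifications, and it assumes the received LSA is newer than
the database copy and that us-spine2 is no longer DR after the reboot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified types for illustration; not FRR's structures. */
enum lsa_type { LSA_ROUTER = 1, LSA_NETWORK = 2 };

struct lsa_hdr {
	enum lsa_type type;
	uint32_t ls_id;      /* Link State ID */
	uint32_t adv_router; /* Advertising Router */
};

struct router {
	uint32_t router_id;
	const uint32_t *if_addrs; /* the router's own IP interface addresses */
	size_t n_if_addrs;
	bool is_dr; /* assumption: one segment, so one DR flag is enough */
};

enum self_action { NOT_SELF, REORIGINATE, FLUSH_MAXAGE };

/*
 * RFC 2328 section 13.4: an LSA is self-originated if its Advertising
 * Router equals our Router ID, or if it is a network-LSA whose Link
 * State ID equals one of our own IP interface addresses.
 */
static bool is_self_originated(const struct router *r,
			       const struct lsa_hdr *l)
{
	assert(r != NULL && l != NULL);

	if (l->adv_router == r->router_id)
		return true;

	if (l->type == LSA_NETWORK) {
		for (size_t i = 0; i < r->n_if_addrs; i++)
			if (l->ls_id == r->if_addrs[i])
				return true;
	}
	return false;
}

/*
 * Decide how to react to a received LSA that is newer than our copy.
 * A self-originated network-LSA is flushed (aged to MaxAge and
 * reflooded) when we are no longer DR; otherwise it is re-originated
 * with a higher sequence number. Other flush cases are omitted here.
 */
static enum self_action on_received_lsa(const struct router *r,
					const struct lsa_hdr *l)
{
	if (!is_self_originated(r, l))
		return NOT_SELF;
	if (l->type == LSA_NETWORK && !r->is_dr)
		return FLUSH_MAXAGE;
	return REORIGINATE;
}
```

With these assumptions, the stale network-LSA that us-spine2 originated as DR
before the reboot falls into the FLUSH_MAXAGE case once it is no longer DR,
which matches the MaxAge behaviour seen in the debug log.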

diff -r dc50bb05b29e usr/src/cmd/FRRouting/frr-master/ospfd/ospf_flood.c
--- a/usr/src/cmd/FRRouting/frr-master/ospfd/ospf_flood.c Tue Nov 05 05:44:37 2019 -0800
+++ b/usr/src/cmd/FRRouting/frr-master/ospfd/ospf_flood.c Tue Nov 05 08:51:51 2019 -0800
@@ -925,6 +925,26 @@
 			old->retransmit_counter--;
 			ospf_lsdb_delete(&nbr->ls_rxmt, old);
 		}
+		/*
+		 * Please refer to section 13.1 of RFC 2328.
+		 * The flooding procedure is not applicable to a
+		 * self-originated LSA. Unfortunately, a
+		 * self-originated LSA can end up being added to the
+		 * retransmit list through the flood caller.
+		 * While adding this LSA to the retransmit list, we
+		 * need to check whether it is self-originated.
+		 * If it is, it should be removed from the LSDB and
+		 * should not be added to the retransmit list.
+		 */
+		if (ospf_lsa_is_self_originated(nbr->oi->ospf, lsa)) {
+			if (IS_DEBUG_OSPF(lsa, LSA_FLOODING))
+				zlog_debug("self originated RXmtL(%lu)++,"
+					   " NBR(%s), LSA[%s]",
+					   ospf_ls_retransmit_count(nbr),
+					   inet_ntoa(nbr->router_id),
+					   dump_lsa_key(lsa));
+			return;
+		}


Thanks
Palpandi P
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ospf_topo.png
Type: image/png
Size: 227519 bytes
Desc: not available
URL: <http://lists.frrouting.org/pipermail/dev/attachments/20191112/2c1b8510/attachment-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ospf_flood.c
Type: application/octet-stream
Size: 31433 bytes
Desc: not available
URL: <http://lists.frrouting.org/pipermail/dev/attachments/20191112/2c1b8510/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ospf_lsa_loop.diff
Type: application/octet-stream
Size: 1596 bytes
Desc: not available
URL: <http://lists.frrouting.org/pipermail/dev/attachments/20191112/2c1b8510/attachment-0003.obj>

