bug fix: ospf6d self orig MaxAge LSA
Hi,

Today I fixed a bug in ospf6d. The bug seems to have been introduced in the following commit.

commit 76249532faadfb429f46dd94cf6bbc61d78b3f26
Date:   Fri Jan 26 14:53:43 2018 -0800

    ospf6d: Handle Premature Aging of LSAs

When ospf6d is restarting, self-originated MaxAged LSAs may be floating around in the network, and the rebooting ospf6d is likely to receive them from a neighbor, probably with a higher LS-Seq-Num (which makes them more recent).

The code introduced by the above commit in ospf6_flood.c:

@@ -806,6 +818,17 @@

was removing the more-recent self-originated MaxAged LSAs immediately while receiving them, preventing the LSAs from being installed in the local LSDB. The just-rebooted ospf6d has the lower LS-Seq-Num, so it cannot refresh the higher LS-Seq-Num version of the old LSA. So ospf6d essentially couldn't advertise any information when this situation occurred.

The bug was significant: the LSAs that couldn't be updated in my network included the Router-LSA and the Intra-Area-Prefix-LSA, so the address on the loopback I/F was not advertised and the BGP session could not be established at all.

At some point the MaxAged LSAs floating in the network may vanish (I don't know why yet), and sometimes the problem doesn't occur.

Commenting out the code with #if 0 immediately solved the problem: the self-originated MaxAged LSAs are installed in the local LSDB once, and then ospf6d refreshes them with the updated contents and floods them.

I will make a pull request later.
~/frr/ospf6d# git diff
diff --git a/ospf6d/ospf6_flood.c b/ospf6d/ospf6_flood.c
index 0828c2beb..67172dd4a 100644
--- a/ospf6d/ospf6_flood.c
+++ b/ospf6d/ospf6_flood.c
@@ -842,6 +842,7 @@ void ospf6_receive_lsa(struct ospf6_neighbor *from,
                                        zlog_debug("Received is duplicated LSA");
                                SET_FLAG(new->flag, OSPF6_LSA_DUPLICATE);
                        }
+#if 0
                        if (old->header->adv_router
                                    == from->ospf6_if->area->ospf6->router_id
                            && OSPF6_LSA_IS_MAXAGE(new)) {
@@ -854,6 +855,7 @@ void ospf6_receive_lsa(struct ospf6_neighbor *from,
                                                   ismore_recent);
                                return;
                        }
+#endif /*0*/
                }
 
                /* if no database copy or received is more recent */

Best regards,
Yasu
Hello Yasu,

You are right that your network will be black-holed until the old MAXAGE LSA is flushed out from the neighbors. A self-originated MAXAGE LSA should be accepted so that the old instance is flushed from the network faster and the new version is then generated. By accepting a MaxAged self-originated LSA we trigger the event that refreshes the LSA, and as part of that we originate the new LSA. The early return short-circuits this, which is what causes the issue.

In the same commit I also saw the code change below.

    /* Neighbor router sent recent age for LSA,
     * Router could be restarted while current copy is
     * MAXAGEd and not removed.*/
    if (OSPF6_LSA_IS_MAXAGE(old) && !OSPF6_LSA_IS_MAXAGE(new)) {
        if (is_debug)
            zlog_debug("%s: Current copy of LSA %s is MAXAGE, but new has recent Age.",
                       old->name, __PRETTY_FUNCTION__);
        ospf6_lsa_purge(old);
        if (new->header->adv_router
            != from->ospf6_if->area->ospf6->router_id)
            ospf6_flood(from, new);
        ospf6_install_lsa(new);
        return;
    }

I am not sure this is right: if there is a MAXAGE copy in the local database, then we should not accept a new instance of the same LSA, since the router wishes to FLUSH it from the network. I am not sure why this is done. Chirag, who made these changes, can help with some background.

Thanks
Santosh P K

On 04/12/19, 5:05 PM, "dev on behalf of Yasuhiro Ohara" <dev-bounces@lists.frrouting.org on behalf of yasu@nttv6.jp> wrote:

    Hi,

    Today I fixed a bug in ospf6d. The bug seems to be introduced in the following commit.
    [...]
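The two behaviours discussed in this thread (dropping a received self-originated MaxAge LSA with an early return, versus installing it and then refreshing) can be contrasted with a toy model. This is only a sketch under simplifying assumptions: a one-entry LSDB, no flooding, and no sequence number wrap handling. None of the names below are FRR functions; all `sketch_` identifiers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAXAGE          3600            /* seconds, per RFC 2328 */
#define SKETCH_INITIAL_SEQNUM  (INT32_MIN + 1) /* 0x80000001 as signed */

/* Hypothetical, simplified LSA; NOT the FRR struct. */
struct sketch_lsa {
	uint32_t adv_router; /* advertising router ID */
	int32_t seqnum;      /* LS sequence number (signed) */
	uint16_t age;        /* LS age in seconds */
};

/* Toy router holding a single self-originated LSA in its "LSDB". */
struct sketch_router {
	uint32_t router_id;
	struct sketch_lsa db;
};

/*
 * Models receiving a self-originated MaxAge LSA that is more recent than
 * the database copy. With drop_early set, the received LSA is discarded
 * (the early return under discussion); otherwise it is installed first.
 * Returns the sequence number the router would use when it next
 * re-originates the LSA (wrap at INT32_MAX is not handled here).
 */
int32_t sketch_receive_self_maxage(struct sketch_router *r,
				   const struct sketch_lsa *rx,
				   bool drop_early)
{
	assert(rx->adv_router == r->router_id);
	assert(rx->age >= SKETCH_MAXAGE);
	assert(rx->seqnum > r->db.seqnum);

	if (!drop_early)
		r->db = *rx; /* install: the LSDB now knows the higher seqnum */

	/* Re-origination always continues from the LSDB copy. */
	return r->db.seqnum + 1;
}
```

With the early return, the next originated instance continues from the router's own low sequence number and stays older than the copy still held by the neighbors. With the install-first behaviour, the next instance is one past the received sequence number and therefore supersedes it, which is in line with the handling of received self-originated LSAs in RFC 2328, section 13.4.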