[dev] bug fix: ospf6d self orig MaxAge LSA

Yasuhiro Ohara yasu at nttv6.jp
Thu Dec 5 00:58:34 EST 2019


Hi Chirag,

Attached is the debug log. You can see ismore_recent is -1
because the SeqNum is larger in the floating-in-the-net version.

- ospf6d is restarting as router-id: A.B.C.2
- it receives from A.B.C.16 the self-originated Router-LSA
  with MaxAge and SeqNum: 0x800008b6. The local version
  (``the database copy'') is 0x80000004 (or something like that).
- MaxAge LSA is not installed: self orig MAXAGE: discarded.
- the next origination of Router-LSA is only SeqNum: 0x80000005,
  not 0x800008b7.
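
For reference, a minimal sketch of the behavior I would expect here, as I
read RFC 2328 Section 13.4: on receiving a more recent self-originated LSA,
the router advances the sequence number one past the received value before
re-originating. The helper name `next_self_orig_seqnum` is hypothetical and
is not an ospf6d function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of ospf6d: returns the sequence number
 * to use when re-originating after a more recent self-originated LSA
 * has been received from a neighbor.
 * Wrap-around at MaxSequenceNumber (0x7fffffff) needs a flush first and
 * is deliberately not handled in this sketch. */
static uint32_t next_self_orig_seqnum(uint32_t received_seqnum)
{
	assert(received_seqnum != 0x7fffffffu);
	return received_seqnum + 1;
}
```

With the values from the log, `next_self_orig_seqnum(0x800008b6)` gives
0x800008b7, rather than the 0x80000005 that ospf6d actually originated.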

In my opinion, unless the code snippet is removed,
ospf6d remains broken, unfortunately.

If you want to keep the code snippet, adding another comparison
on SeqNum might do, but I'd say it would duplicate work,
since ismore_recent already compares essentially the same thing.
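
A rough sketch of that point, under stated assumptions: the comparison
follows the ordering in RFC 2328 Section 13.1, and a negative result means
the received LSA is more recent (matching the "ismore_recent -1" in the
log). The struct `lsa_hdr` and both function names are hypothetical
simplifications, not the ospf6d code. Because SeqNum is the first thing
compared, a guard on ismore_recent already covers it.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAXAGE      3600 /* seconds, RFC 2328 MaxAge */
#define MAXAGE_DIFF  900 /* seconds, RFC 2328 MaxAgeDiff */

/* Hypothetical, simplified LSA header (host byte order). */
struct lsa_hdr {
	uint32_t adv_router;
	uint32_t seqnum;
	uint16_t checksum;
	uint16_t age;
};

/* Returns -1 if a is more recent, 1 if b is more recent, 0 if same. */
static int lsa_compare(const struct lsa_hdr *a, const struct lsa_hdr *b)
{
	/* Sequence numbers are compared as signed 32-bit integers. */
	int32_t sa = (int32_t)a->seqnum, sb = (int32_t)b->seqnum;

	if (sa != sb)
		return sa > sb ? -1 : 1;
	if (a->checksum != b->checksum)
		return a->checksum > b->checksum ? -1 : 1;
	/* If only one instance is at MaxAge, that one is more recent. */
	if (a->age == MAXAGE && b->age != MAXAGE)
		return -1;
	if (a->age != MAXAGE && b->age == MAXAGE)
		return 1;
	/* Otherwise the smaller age wins if they differ by > MaxAgeDiff. */
	if (a->age > b->age && a->age - b->age > MAXAGE_DIFF)
		return 1;
	if (b->age > a->age && b->age - a->age > MAXAGE_DIFF)
		return -1;
	return 0;
}

/* Hypothetical guard: discard a received self-originated MaxAge LSA
 * only when it is NOT more recent than the database copy. */
static bool should_discard_self_maxage(const struct lsa_hdr *recv,
				       const struct lsa_hdr *db,
				       uint32_t router_id)
{
	int ismore_recent = lsa_compare(recv, db);

	return db->adv_router == router_id && recv->age == MAXAGE
	       && ismore_recent >= 0;
}
```

With the values from the log (received SeqNum 0x800008b6 at age 3600,
database copy 0x80000004), the comparison returns -1 and the guard does not
discard, so the LSA would be installed and then refreshed.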

Best regards,
Yasu

From: Chirag Shah <chirag at cumulusnetworks.com>
Subject: Re: [dev] bug fix: ospf6d self orig MaxAge LSA
Date: Wed, 4 Dec 2019 14:02:21 -0800
Message-ID: <CAGiBS0+X+u9GtXtBQkC28hPtpkXQhCdiQ+ve8WSQtSeyLBmdjg at mail.gmail.com>

> Hi Yasuhiro/Santosh,
> 
> The commit has introduced the RFC compliance feature to handle premature
> aging of the LSAs via sending MAXAGEd LSAs. One of the prominent
> occurrences is router reboot (OSPF6d restart).
> 
> 
> Narrative:
> The feature was extensively tested back then, and we found that there could be
> a case where a MAX_AGEd LSA lingered in the routing domain and was received
> after the most recent, lower-aged LSA (with the same originating
> router-id) had been processed.
> 
> So there is a case where the database copy is recent and the received MAXAGEd
> LSA should not overwrite it.  How do you want to handle this?  Santosh
> pointed to a snippet of the code which purges an old MAX_AGEd LSA, but it can
> *only* be executed if *ismore_recent* is greater than 0.
> 
> @yasu
> In your case the OSPF_MAXAGEd LSA is genuine and should not be dropped. *What
> is the output of the debug where the ismore_recent value is printed*?
>
> If the value of *ismore_recent* is greater than 0 then it is safe to put
> that as part of the if statement rather than removing the code snippet as
> @yasu suggests.
> 
> @sapk the action should be taken according to the "narrative" case above,
> where a MAXAGEd LSA could overwrite our recent copy of the LSA and
> flush out the ospf6 routes.
> 
> My takeaway:
> I do not think a MAXAGE LSA will be flushed out of the routing domain
> immediately, so corrective action needs to be taken.
> 
> 
> Regards,
> Chirag
> 
> 
> P.S. It has been more than two years since the ospf6 improvements
> (especially this feature) were made, so my memory of the full
> context of the scenario testing has faded. I am beginning to restore the
> context.
> 
> On Wed, Dec 4, 2019 at 8:55 AM Santosh P K via dev <dev at lists.frrouting.org>
> wrote:
> 
>>
>>
>>
>> ---------- Forwarded message ----------
>> From: Santosh P K <sapk at vmware.com>
>> To: Yasuhiro Ohara <yasu at nttv6.jp>, "dev at lists.frrouting.org" <
>> dev at lists.frrouting.org>
>> Cc:
>> Bcc:
>> Date: Wed, 4 Dec 2019 16:54:46 +0000
>> Subject: Re: [dev] bug fix: ospf6d self orig MaxAge LSA
>> Hello Yasu,
>>     You are right that your network will be black-holed until the old
>> MAXAGE LSA is flushed out from the neighbors. A self-originated MAXAGE LSA
>> should be accepted so that the old instance is flushed from the network
>> faster and the new version is then generated. By accepting the MaxAged
>> self-originated LSA we trigger an event to refresh the LSA, and as part of
>> that we originate a new LSA. With the early return that is short-circuited,
>> which is causing the issue.
>>
>>
>> With the same commit I saw the below code changes too.
>>
>>                         /* Neighbor router sent recent age for LSA,
>>                          * Router could be restarted while current copy is
>>                          * MAXAGEd and not removed.*/
>>                         if (OSPF6_LSA_IS_MAXAGE(old) &&
>>                             !OSPF6_LSA_IS_MAXAGE(new)) {
>>
>>                                 if (is_debug)
>>                                         zlog_debug("%s: Current copy of LSA %s is MAXAGE, but new has recent Age.",
>>                                                    old->name,
>>                                            __PRETTY_FUNCTION__);
>>
>>                                 ospf6_lsa_purge(old);
>>                                 if (new->header->adv_router
>>                                                 != from->ospf6_if->area->
>>                                                         ospf6->router_id)
>>                                         ospf6_flood(from, new);
>>
>>                                 ospf6_install_lsa(new);
>>                                 return;
>>                         }
>>
>> I am not sure this is right: if there is a MAXAGE copy in the local
>> database, then we should not accept a new instance of the same LSA, since
>> the router wishes to FLUSH it from the network. I am not sure why this is
>> done. Chirag, who made these changes, can help with some background.
>>
>> Thanks
>> Santosh P K
>>
>> On 04/12/19, 5:05 PM, "dev on behalf of Yasuhiro Ohara" <
>> dev-bounces at lists.frrouting.org on behalf of yasu at nttv6.jp> wrote:
>>
>>
>>     Hi,
>>
>>     Today I fixed a bug in ospf6d.
>>     The bug seems to be introduced in the following commit.
>>
>>     commit 76249532faadfb429f46dd94cf6bbc61d78b3f26
>>     Date:   Fri Jan 26 14:53:43 2018 -0800
>>
>>         ospf6d: Handle Premature Aging of LSAs
>>
>>     When ospf6d is restarting, self originated MaxAged LSAs
>>     may be floating around in the network, and the rebooting ospf6d
>>     is likely to receive them from the neighbor, probably with the
>>     higher LS-Seq-Num (which is more recent).
>>     The code introduced by the above commit in ospf6_flood.c:
>>     @@ -806,6 +818,17 @@
>>     was removing the more-recent self-originated MaxAged LSAs
>>     immediately in the receiving process of the LSA, preventing the
>>     LSAs from being installed in the local LSDB.
>>     The just-rebooted ospf6d will have the lower LS-Seq-Num, so
>>     it cannot refresh the higher LS-Seq-Num version of the old LSA.
>>     So the ospf6d essentially couldn't advertise any information
>>     when the situation occurred.
>>
>>     The bug was significant: the LSAs that couldn't be updated
>>     in my network included the Router-LSA and the Intra-Area-Prefix-LSA,
>>     so the address on the loopback I/F was not advertised
>>     and the BGP session could not be established at all.
>>     At some point, the MaxAged LSAs floating in the
>>     network may vanish (I don't know the reason why yet),
>>     and sometimes the problem doesn't occur.
>>
>>     Commenting out the code by #if 0 immediately solved the problem:
>>     the self-originated MaxAged LSAs are installed in the local LSDB
>>     once, and then ospf6d refreshes it with the updated contents,
>>     and then floods.
>>
>>     I will make a pull-request later.
>>
>>     ~/frr/ospf6d# git diff
>>     diff --git a/ospf6d/ospf6_flood.c b/ospf6d/ospf6_flood.c
>>     index 0828c2beb..67172dd4a 100644
>>     --- a/ospf6d/ospf6_flood.c
>>     +++ b/ospf6d/ospf6_flood.c
>>     @@ -842,6 +842,7 @@ void ospf6_receive_lsa(struct ospf6_neighbor *from,
>>                                     zlog_debug("Received is duplicated LSA");
>>                             SET_FLAG(new->flag, OSPF6_LSA_DUPLICATE);
>>                     }
>>     +#if 0
>>                     if (old->header->adv_router
>>                                 == from->ospf6_if->area->ospf6->router_id
>>                         && OSPF6_LSA_IS_MAXAGE(new)) {
>>     @@ -854,6 +855,7 @@ void ospf6_receive_lsa(struct ospf6_neighbor *from,
>>                                             ismore_recent);
>>                             return;
>>                     }
>>     +#endif /*0*/
>>             }
>>
>>             /* if no database copy or received is more recent */
>>
>>     Best regards,
>>     Yasu
>>
>>
>>
>>     _______________________________________________
>>     dev mailing list
>>     dev at lists.frrouting.org
>>
>>     https://lists.frrouting.org/listinfo/dev
-------------- next part --------------

2019/12/04 19:20:52 OSPF6: LSA Receive from A.B.C.16%port-86-0-0
2019/12/04 19:20:52 OSPF6:     [Router Id:0.0.0.0 Adv:A.B.C.2]
2019/12/04 19:20:52 OSPF6:     Age: 3600 SeqNum: 0x800008b6 Cksum: c055 Len: 56
2019/12/04 19:20:52 OSPF6: Delayed acknowledgement (BDR & MoreRecent & from DR)
2019/12/04 19:20:52 OSPF6: ospf6_receive_lsa: Received is self orig MAXAGE LSA [Router Id:0.0.0.0 Adv:A.B.C.2], discard (ismore_recent -1)
2019/12/04 19:20:53 OSPF6: LSA Receive from A.B.C.16%port-86-0-0
2019/12/04 19:20:53 OSPF6:     [Intra-Prefix Id:0.0.0.0 Adv:A.B.C.2]
2019/12/04 19:20:53 OSPF6:     Age: 3600 SeqNum: 0x800008b9 Cksum: f76f Len: 52
2019/12/04 19:20:53 OSPF6: Delayed acknowledgement (BDR & MoreRecent & from DR)
2019/12/04 19:20:53 OSPF6: ospf6_receive_lsa: Received is self orig MAXAGE LSA [Intra-Prefix Id:0.0.0.0 Adv:A.B.C.2], discard (ismore_recent -1)
2019/12/04 19:20:57 OSPF6: Originate Router-LSA for Area 0.0.0.0
2019/12/04 19:20:57 OSPF6: Suppress updating LSA: [Router Id:0.0.0.0 Adv:A.B.C.2]
2019/12/04 19:20:57 OSPF6: Originate Router-LSA for Area 0.0.0.0
2019/12/04 19:20:57 OSPF6: LSA Originate:
2019/12/04 19:20:57 OSPF6:     [Router Id:0.0.0.0 Adv:A.B.C.2]
2019/12/04 19:20:57 OSPF6:     Age:    0 SeqNum: 0x80000005 Cksum: 3c93 Len: 56
2019/12/04 19:20:57 OSPF6: SPF: Scheduled in 0 msec
2019/12/04 19:20:57 OSPF6: Flooding on port-86-0-0: [Router Id:0.0.0.0 Adv:A.B.C.2]


