NetworkTalk & BGP

A.3.b) “Next-hop-self” usefulness within an iBGP session

  1. Why do we need to use the next-hop-self option? It is needed when an edge router advertises routes received from an eBGP speaker to an iBGP speaker within the same AS.
    By default, when a route is advertised to an eBGP peer outside the AS, the router sets the next-hop attribute to its own IP address.
    When that route is then passed on to iBGP speakers inside the AS, the next hop is left unchanged, so all iBGP routers have the IP address of the eBGP neighbor as the next hop.
    But what happens if the eBGP speaker is not reachable? If packets are still sent towards it, a black hole can result.
    To prevent this, we can make sure that a route advertised to iBGP neighbors carries the IP address of the router that sources the route into the AS, and not the IP address of the eBGP speaker that originally advertised it.
    BGP always checks that the next hop is reachable before using and advertising a route; if the next hop is not reachable, the route is still held in the BGP table…
    To avoid potential routing black holes, it is better to use “next-hop-self” so that the edge router sets the next hop of the routes it advertises to its iBGP peers to its own IP address.
  2. Forgetting the next-hop-self instruction can cause many other problems, such as high CPU utilization, and so degrade the quality of service.

    Let’s look at the following example (Figure A.6 below), where the next-hop-self instruction between RT-B and RT-A has been deliberately omitted.

If we observe the CPU utilization of both routers, the graphs and screen captures below show that RT-A runs high on CPU most of the time, much higher than its iBGP neighbor RT-B, even though it handles less traffic.



Note that CPU utilization on RT-A is, on average, much higher than on RT-B. Other symptoms of high CPU on RT-A that RT-B does not exhibit include sluggish telnet session establishment, poor response to commands, and ICMP packet drops.
Investigation showed a large number of syslog messages hinting that the BGP configuration on RT-B was to blame for the high CPU utilization. The volume of messages being logged made it necessary to set up a syslog server for better log analysis, since the router’s internal buffer was being overwritten very quickly.

A capture of these messages is shown below.


Moreover, we have the following messages:

Apr 16 08:24:07.339 UTC: BGP(0): Revise route installing 1 of 1 routes for —> (main) to main IP table
Apr 16 08:24:07.339 UTC: BGP(0): Revise route installing 1 of 1 routes for —> (main) to main IP table
Apr 16 08:24:07.339 UTC: BGP(0): Revise route installing 1 of 1 routes for —> (main) to main IP table
Apr 16 08:24:07.339 UTC: BGP(0): Revise route installing 1 of 1 routes for —> (main) to main IP table
Apr 16 08:24:07.339 UTC: BGP(0): Revise route installing 1 of 1 routes for —> (main) to main IP table …

The above output shows that the BGP routes (the entire Internet routing table) learnt from RT-B keep being flushed out of the routing table and then re-installed periodically. These routes are advertised with a next hop that is the IP address of ISP-B’s eBGP speaker. From this output and the next one, we find that RT-B is advertising all Internet routes to RT-A with a next hop of ISP-B’s IP address. This means that RT-A needs to perform a recursive lookup to forward a packet to any of the Internet routes it has received from RT-B.

Gateway of last resort is to network
B [200/5] via, 00:01:22
B [200/1] via, 00:01:22
B 210.51.225.0/24 [200/1] via, 00:01:24
B [200/1] via, 00:01:25
B [200/35] via, 00:01:25
B [200/1] via, 00:01:25
B [200/1] via, 00:01:29 …
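The recursive lookup mentioned above can be illustrated with a small Python sketch. It is only a model under stated assumptions, not router code: the prefixes, the next-hop address and the interface name `Serial0` are hypothetical documentation-range placeholders, not values from the captures.

```python
import ipaddress

# Hypothetical routing table for RT-A (placeholder values, not from the capture).
ROUTES = {
    # Internet route learned via iBGP, pointing at the external peer's address
    "198.51.100.0/24": {"proto": "bgp", "next_hop": "203.0.113.1"},
    # Link towards the external peer, assumed here to be known via the IGP
    "203.0.113.0/30": {"proto": "igp", "interface": "Serial0"},
}

def longest_match(table, address):
    """Return the most specific entry covering address, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, entry in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, entry)
    return best[1] if best else None

def resolve(table, destination, max_depth=4):
    """Follow next hops until an entry with an outgoing interface is found."""
    target = destination
    for _ in range(max_depth):
        entry = longest_match(table, target)
        if entry is None:
            return None              # next hop cannot be resolved
        if "interface" in entry:
            return entry["interface"]
        target = entry["next_hop"]   # recurse on the BGP next hop
    return None
```

With both entries present, `resolve(ROUTES, "198.51.100.7")` needs two lookups: one for the BGP prefix and one for its next hop. If the second entry is missing, the next hop cannot be resolved and the function returns `None`.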

There is a problem: the next-hop address is not part of ISP-A’s AS 1000, and RT-B is not advertising it into AS 1000 via the IGP. RT-A therefore does not know about this host address via its IGP. Although these Internet routes may be in the BGP table, they are not installed in the routing table (RIB) because the next hop is considered invalid for these networks.
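The next-hop check described here can be sketched in Python as follows. This is a simplified model, assuming that a next hop counts as valid only when some IGP-learned prefix covers it. The function names and all addresses are hypothetical.

```python
import ipaddress

def next_hop_reachable(next_hop, igp_prefixes):
    """True if the next hop falls inside at least one IGP-learned prefix."""
    addr = ipaddress.ip_address(next_hop)
    return any(addr in ipaddress.ip_network(p) for p in igp_prefixes)

def installable(bgp_table, igp_prefixes):
    """Split BGP routes into those eligible for the RIB and those held back."""
    usable, held = [], []
    for prefix, next_hop in bgp_table:
        if next_hop_reachable(next_hop, igp_prefixes):
            usable.append(prefix)
        else:
            held.append(prefix)  # stays in the BGP table only
    return usable, held

# Hypothetical data: 10.0.0.0/24 stands for the internal network of AS 1000,
# 203.0.113.1 for the external peer that the IGP does not know about.
igp = ["10.0.0.0/24"]
bgp = [("198.51.100.0/24", "203.0.113.1"), ("192.0.2.0/24", "10.0.0.2")]
usable, held = installable(bgp, igp)
```

In this sketch the route whose next hop is the external peer ends up in `held`, while the route whose next hop lies inside the AS ends up in `usable`.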

One solution is to use a BGP command option that causes the AS border router RT-B in AS 1000 to set its own IP address in the next-hop attribute instead of the IP address of the external peer. The internal peer RT-A would then receive the routes with a next hop of RT-B’s address, which is known via the IGP.

router bgp 1000
neighbor next-hop-self

Once this configuration was implemented, the error messages previously appearing in the log were no longer present. CPU utilization on the router dropped drastically and the sluggish behavior experienced earlier no longer occurred.
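As a rough illustration of what the option changes, the following Python sketch models the next hop seen by an internal peer with and without next-hop-self. It is not the router’s implementation. The `Route` class, the `advertise_to_ibgp` function and the addresses are assumptions made for the example.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str

def advertise_to_ibgp(route, own_address, next_hop_self=False):
    """Return the route as an internal peer would receive it."""
    if next_hop_self:
        # The border router substitutes its own address for the external peer's.
        return replace(route, next_hop=own_address)
    # Default behavior: the next hop learned from the eBGP peer is kept.
    return route

# Hypothetical values: 203.0.113.1 is the external peer, 10.0.0.2 the border router.
learned = Route("198.51.100.0/24", "203.0.113.1")
without_option = advertise_to_ibgp(learned, "10.0.0.2")
with_option = advertise_to_ibgp(learned, "10.0.0.2", next_hop_self=True)
```

Here `without_option.next_hop` is still the external peer’s address, whereas `with_option.next_hop` is the border router’s own address, which the internal peer can reach through the IGP.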

