David Heath
Tuesday, 08 November 2011 22:15
Page 1 of 2
A bug in a large number of Juniper Internet routers led to significant outages all across the world. What actually happened?
At around 12:15am Australian Eastern Daylight Time, backbone routing organisations (particularly in North America and Europe) started seeing significant outages in their connections.
For instance, to view a discussion amongst the various Internet backbone companies, go here and follow the "next message" links (note that the topic name changes a few times).
It very quickly became clear that only Juniper routers were affected, and that they all seemed to be performing a core dump while processing what appeared to be an excessively large BGP routing table update.
The outage was very obvious in the DownRightNow logs for Gmail (for instance), although the incident may have scrolled off the display by the time readers look at the site.
Various commenters are pointing to two very significant postings that relate to this issue.
Firstly, we have Juniper's Bulletin PSN-2011-08, which is available via Pastebin but seems to be available only to subscribers on the Juniper site, as iTWire was unable to locate it there.
Dated 8th August 2011, this Bulletin appears to describe the exact circumstances of this morning's crash: that under a certain set of circumstances, "an MX Series router may crash upon receipt of very specific and unlikely route prefix install/delete actions, such as a BGP routing update. The set of route prefix updates is non-deterministic and exceedingly unlikely to occur. Junos versions affected include 10.0, 10.1, 10.2, 10.3, 10.4 prior to 10.4R6, and 11.1 prior to 11.1R4. The trigger for the MPC crash was determined to be a valid BGP UPDATE received from a registered network service provider, although this one UPDATE was determined to not be solely responsible for the crashes. A complex sequence of preconditions is required to trigger this crash. Both IPv4 and IPv6 routing prefix updates can trigger this MPC crash."
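For operators wanting to check a router against the version list in the bulletin text quoted above, the rule can be sketched in a few lines of Python. This is only an illustration, not a Juniper tool: the function name, the lookup table and the assumed version-string format (such as "10.4R5.5") are all hypothetical, and because the bulletin says the affected versions "include" those listed, a negative result means only that the version is not named in the quoted text.

```python
import re

# Thresholds copied from the bulletin text quoted above.
# Maps a Junos branch to the first release the bulletin treats as no longer
# affected; None means the quoted text names no fixed release for that branch.
AFFECTED_BRANCHES = {
    "10.0": None,
    "10.1": None,
    "10.2": None,
    "10.3": None,
    "10.4": 6,  # affected "prior to 10.4R6"
    "11.1": 4,  # affected "prior to 11.1R4"
}

# ASSUMPTION: versions look like <major>.<minor>R<release>[.<build>],
# for example "10.4R5.5". Other release types are deliberately rejected.
_VERSION_RE = re.compile(r"^(\d+\.\d+)R(\d+)(?:\.\d+)?$")


def is_listed_as_affected(version: str) -> bool:
    """Return True if the version falls in a range named in the quoted bulletin."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognised version format: {version!r}")
    branch, release = match.group(1), int(match.group(2))
    if branch not in AFFECTED_BRANCHES:
        return False  # not named in the quoted list (which may not be exhaustive)
    fixed_release = AFFECTED_BRANCHES[branch]
    return fixed_release is None or release < fixed_release


if __name__ == "__main__":
    for v in ("10.4R5.5", "10.4R6", "11.1R3", "11.1R4"):
        print(v, is_listed_as_affected(v))
```

Rejecting unrecognised version strings with an error, rather than guessing, seems the safer choice here, since a wrong "not affected" answer is the costly mistake.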
Perhaps even more significant is a report from January 2010. Read about it on the next page.