Skype failure explanation leaves unanswered questions

So, according to Skype, the outage was caused by a bug that was exposed when many computers rebooted at much the same time.

I'll take Skype's explanation at face value for now, but it leaves me wondering about several issues.

What was different about this month's Patch Tuesday that it triggered the outage? Sure, there were more patches than usual, and OSes from Windows 2000 to Vista and versions of Office from 2000 to 2007 were included, but that's not a convincing explanation.

According to Microsoft, "there were no issues introduced by the security updates themselves" and "there is nothing unusual in this month's release that could have contributed to this situation." In particular, this includes the need to reboot, the size of the updates and the speed with which they were distributed through Automatic Update.

Staying with the idea that Microsoft's monthly update was the trigger (though not the cause) of the outage, if the updates rolled out on Tuesday August 14, why wasn't Skype affected until Thursday August 16? And since Skype users are spread around the world, why didn't the differences in time zones help protect against a widespread outage? It's not as if everyone in the world suddenly decided to reboot their PCs.

One possibility is that - for some reason we may never really understand - just enough people tried to log in to Skype at a moment when P2P resources had dipped just below the level, relative to the number of login requests, at which the flaw in the Skype software surfaced for the first time. And as Skype explained, this caused a snowball effect: if people couldn't log in, no extra P2P resources could join the network. As more people switched on their computers when they arrived at work or returned home and tried to use Skype, the number of pending login attempts kept increasing.
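
To make that snowball effect concrete, here is a toy sketch in Python. It is not Skype's algorithm, which has not been published. Every name and number in it is an assumption made for illustration: each group of `nodes_per_login` online peers is assumed to be able to absorb one login per time step, and once pending logins exceed that capacity nobody gets in at all, which stands in for the flaw Skype described.

```python
def simulate(online, arrivals, nodes_per_login, steps,
             reboot_at=None, reboot_fraction=0.0):
    """Toy model of the login snowball; all parameters are hypothetical.

    Returns a list of (online_nodes, pending_logins) tuples, one per step.
    """
    pending = 0
    history = []
    for step in range(steps):
        if step == reboot_at:
            # A share of the online peers reboot and must log in again.
            rebooting = int(online * reboot_fraction)
            online -= rebooting
            pending += rebooting
        pending += arrivals  # new users switching on their PCs
        capacity = online // nodes_per_login
        if pending <= capacity:
            # Healthy network: everyone logs in and adds P2P resources.
            online += pending
            pending = 0
        # Otherwise (assumed flaw) no login succeeds, so capacity never
        # recovers and the queue of pending logins keeps growing.
        history.append((online, pending))
    return history


healthy = simulate(online=1000, arrivals=50, nodes_per_login=10, steps=5)
stalled = simulate(online=1000, arrivals=50, nodes_per_login=10, steps=5,
                   reboot_at=1, reboot_fraction=0.5)
print(healthy[-1])  # (1250, 0)
print(stalled[-1])  # (525, 725)
```

In the second run, a single mass reboot pushes demand past capacity, and from then on the backlog only grows. What the sketch cannot explain, of course, is why the real network crossed its threshold on Thursday rather than Tuesday.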

Whatever the exact sequence of events, it would be interesting to learn how this bottleneck was eventually cleared. Is it possible that Skype's August 17 release of version 3.5.0.214 for Windows was installed by enough users sufficiently quickly to get the overall situation back below that critical threshold? It's possible, as things did start to stabilise that day. The fly in the ointment is that none of the changes listed in the release notes for 3.5.0.214 appear relevant.


