On Thursday, 16th August 2007, the Skype peer-to-peer network became unstable and suffered a critical disruption. The disruption was triggered by a massive restart of our users’ computers across the globe within a very short timeframe as they re-booted after receiving a routine set of patches through Windows Update.
The high number of restarts affected Skype’s network resources. This caused a flood of log-in requests, which, combined with the lack of peer-to-peer network resources, prompted a chain reaction that had a critical impact.
Normally Skype’s peer-to-peer network has an inbuilt ability to self-heal, however, this event revealed a previously unseen software bug within the network resource allocation algorithm which prevented the self-healing function from working quickly. Regrettably, as a result of this disruption, Skype was unavailable to the majority of its users for approximately two days.
When the outage first happened, I wrote, "Since Skype is a P2P network that relies on other peers for the network to function properly, it's possible a Microsoft update is causing a conflict." I also stated, "the major Microsoft updates (Patch Tuesday) that were released yesterday could have something to do with the Skype outage." In the comments section I expounded on this theory when I wrote,
"The theory about Microsoft updates causing the Skype problem didn't mean other operating systems weren't affected, such as the Mac. So I guess we have Microsoft to thank for the Skype outage. Or we can blame Skype for not anticipating a worldwide massive reboot of Windows machines. Gee, I could have told them that would happen.
The theory was that a Microsoft update somehow changed the TCP stack or changed how the Microsoft operating system interacted with Skype. Let's assume the vast majority of Windows users set their patch to auto-install. Then let's assume Skype is 80-90% Windows users. That means that a lot of Windows users installed the patch last night and lots of Windows supernodes could be knocked offline.
The side effect would be that without enough Windows supernodes, Mac users would be booted as well. Though you'd think the network would be flexible enough to handle millions of Windows users knocked offline."