By now it seems the whole world is aware that Skype had major problems last week. For most people, Skype was down. Tom Keating speculated that the cause was a Microsoft patch which rebooted massive numbers of machines. He has a point, as my machine coincidentally rebooted just before the Skype problem hit.
Many people thought Tom was incorrect, but it turns out he nailed the problem.
Phil Wolf has an excellent recap of the problem with his thoughts and questions interlaced within.
For example, Phil comments on the following statement from Skype:
“Normally Skype’s peer-to-peer network has an inbuilt ability to self-heal, however, this event revealed a previously unseen software bug within the network resource allocation algorithm which prevented the self-healing function from working quickly.”
Phil asks: If the bug was within the algorithm, in the Skype client, was the bug ever repaired? If it was repaired, how was the fix propagated?
Brough Turner, who has been a TMC columnist for about a decade, also has some great thoughts on the matter.
One final point… Skype did not blame Microsoft directly, but circumstantially it seems to be saying that a Microsoft-forced reboot brought its p2p IP communications network to its knees. The point is that for all the resiliency inherent in p2p networks, they are not perfect either, as they rely on software and systems that are not necessarily under a company’s control.