After several years of incredible growth, Netflix realized last year that their “three tier, database oriented architecture” would no longer suffice. With over 10 billion requests in the month of November and peaks of 10,000 requests per second, they said, “We started to hit that point where every layer of our software stack needs to be able to scale horizontally”. The decision to re-architect led them to rethink whether to host their own data center, and they eventually moved to a cloud model with Amazon Web Services (AWS).
They offered several reasons for this move. First, as stated, they had to make a major change anyway. Second, they saw the age-old benefit of letting their engineers focus on the intellectual property that made Netflix competitive rather than take on the heavy lifting of running an infrastructure. Finally, they acknowledged that they had not been successful in forecasting demand. With their member-based web site and software products across the Xbox, PS3, Wii, AppleTV, iPhone, and iPad, they needed to scale quickly and nimbly, and the cloud architecture lent itself to such requirements.
Netflix acknowledged that the migration is difficult and requires real commitment. They shared some of the lessons they learned in a post on their tech blog:
- They had to unlearn old habits of building and deploying. For instance, they had to design around the higher rates of hardware failure that are common in this architecture, and eliminate chatty APIs that suffered from the inherent latency of the cloud.
- The inherent shared resource model of the cloud can introduce variance in throughput at any level of the stack. Netflix found that you must either be willing to abandon any specific subtask, or manage resources within AWS to avoid co-tenancy.
- Design each distributed system to expect and tolerate failure from other systems on which it depends.
- Research and development in a scaled-down sandbox environment is not as effective in the cloud. Use real scale.
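
To make the lessons about abandoning a subtask and tolerating failed dependencies a little more concrete, here is a minimal Python sketch. It is not taken from Netflix's post or code; the helper `call_with_fallback` and the two stand-in dependencies are hypothetical names chosen for illustration. The idea it shows is simply to put a time limit on a call to another system and fall back to a default answer if that call is slow or fails.

```python
import concurrent.futures
import time


def call_with_fallback(executor, dependency, fallback, timeout_s):
    """Call `dependency` on a worker thread; return `fallback` if it is slow or fails.

    Hypothetical helper for illustration only, not an API from Netflix or AWS.
    """
    future = executor.submit(dependency)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # Abandon the subtask. cancel() is best effort: a call that is
        # already running keeps running, but we stop waiting for it.
        future.cancel()
        return fallback
    except Exception:
        # The dependency failed outright; degrade instead of propagating.
        return fallback


def slow_recommendations():
    """Stand-in for a dependency hit by a throughput dip (assumed example)."""
    time.sleep(0.3)
    return ["late title"]


def broken_recommendations():
    """Stand-in for a dependency that is down (assumed example)."""
    raise ConnectionError("dependency unavailable")


if __name__ == "__main__":
    default = ["popular title"]
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        print(call_with_fallback(pool, slow_recommendations, default, 0.05))
        print(call_with_fallback(pool, broken_recommendations, default, 0.05))
```

In both calls the caller gets the default list rather than an error or a long wait. A real system would likely add retries, metrics, and some form of circuit breaking on top of this, which the sketch leaves out.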
Netflix partnered with O’Reilly last month to present a webinar on their experience. It was recorded and posted here at O’Reilly’s site. If you can suffer through some bad audio, it’s worth a view.