Traditional OPS VS Amazon AWS (part 1)

Let's take a look at your traditional web application setup (during development, or in the early stages with no funding). You might find something like this: a web server, a database server, and a specialty server. The specialty server is probably doing whatever makes the company unique. For simplicity's sake we'll say it's just crunching data from the database.

The “Server Farm” “starting out”

+------------------------------
|-[ Web Server       ]
|-[ Database Server  ]
|-[ Specialty Server ]
+------------------------------

You bring in a little ad revenue, or a partner with a bit more money, and you expand… You double your capacity!

The “Server Farm” “expanded”

+------------------------------
|-[ Web Server       ]
|-[ Web Server       ]
|-[ Database Server  ]
|-[ Database Server  ]
|-[ Specialty Server ]
|-[ Specialty Server ]
+------------------------------

Eventually you end up with something like this:

  • 2 load balancers
  • 30 specialty servers
  • 16 web servers
  • 12 database servers

The above is a classic solution to the “we need more power” problem. And it’s worked for a long time. However:

  • It’s grossly inefficient as far as capacity planning goes

    Because you add on 20 new servers in *anticipation* of the load you’ll need in 4 months. So during month 1 you’re using 10% of those servers, month 2 30%, month 3 60%, and you find out at month 4 that you’re at about 120% utilization (hey, having the problem of being successful isn’t the worst thing that could happen… but it sure is stressful!). There’s a rough sketch of this math just after this list.

  • It’s grossly inefficient as far as personnel resource usage goes

    Because your OPS team (or OPS guy, as the case may be) has a twofold problem: these machines are being purchased months apart, and since there’s very little money in a startup, they’re being purchased on sale. Which means you get whatever you can afford. Which ultimately means that after a year you have 9 different machine builds with 7 different configurations, 2 processor types, 6 different network card types, many different RAM latencies, differences in CPU MHz, disk speed, chassis, power consumption, etc, etc, etc.

    Which means that "balancing the resources properly" becomes a tough juggling act. That "diagnosing problems possibly due to hardware" becomes a long, expensive, drawn-out process. Which means that installing OSes on machines (for either replacing dead drives or repurposing old servers) means having to deal with an entire encyclopedia of quirks/exceptions/nuances/hindrances.

    Trust me, it’s *very likely* that by the 4th batch of hardware bought you’ll start having to deal with the "oh… yea… that’s from the second batch… you have to do [whatever] before it’ll work." By the 8th batch it’s a quagmire of re-learning all those old lessons.

  • It’s grossly inefficient in terms of resource utilization

    If you only use 20% of your web servers’ capacity… except for 2 days a month when you’re at 80% pushing 90%… you’re losing a HUGE amount of valuable CPU cycles, RAM, and hard disk space (the second sketch below puts numbers on this). And what you don’t want your OPS team to be doing is constantly repurposing machines. OPS people are human… eventually they’ll make a mistake, and you won’t like it. Plus they have important, time-consuming jobs aside from having to spend 2 weeks every month re-imaging servers for a 2-day utilization spike, don’t they?
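
To put rough numbers on the “buying in anticipation” bullet, here’s a quick back-of-the-envelope sketch in Python. The figures (20 servers bought up front, and the month-by-month ramp) are just the hypothetical ones from this post, not measurements from a real farm:

    # Hypothetical numbers from the first bullet: 20 servers bought up
    # front, with load ramping up over 4 months.
    SERVERS = 20
    MONTHLY_UTILIZATION = [0.10, 0.30, 0.60, 1.20]  # months 1 through 4

    for month, used in enumerate(MONTHLY_UTILIZATION, start=1):
        idle = max(0.0, 1.0 - used) * SERVERS    # capacity sitting unused
        short = max(0.0, used - 1.0) * SERVERS   # capacity you're missing
        print(f"Month {month}: {used:.0%} utilized, "
              f"~{idle:.0f} servers idle, ~{short:.0f} servers short")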
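
And the same kind of math for the spike problem in the last bullet: 20% utilization most of the month, 80% pushing 90% for 2 days. Again, these are the hypothetical numbers from above:

    # Hypothetical numbers from the last bullet: ~20% utilization on 28
    # days of the month, ~85% on the 2 spike days.
    normal_days, spike_days = 28, 2
    normal_util, spike_util = 0.20, 0.85

    days = normal_days + spike_days
    average = (normal_days * normal_util + spike_days * spike_util) / days
    print(f"Average monthly utilization: {average:.1%}")        # ~24.3%
    print(f"Capacity sized for the spike, wasted: {1 - average:.1%}")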

Cue dramatic music

Enter the Amazon team

… to be continued
