When you should (and should not) think about using Amazon EC2

The Amazon AWS team has done it again, and EC2 is generating quite the talk. Perhaps I’ve not been watching the blogosphere closely enough about anything in particular until now (very likely), but I’ve not really seen this much general excitement before. The fervor I see going around is a lot like a kid at Christmas. You unwrap your present. IT’S A REMOTE CONTROL CAR. WOW! How cool! All of a sudden you have visions of chasing the neighborhood cats and drag racing your friends on the neighborhood sidewalks. After you open it (and the general euphoria of the ideas starts to fade) you realize: this is one of those cars that only turns one direction… And you just *know* that the next time you meet with your best friend Bobby he will have a car that turns left *and* right.

I expect we will see some of this… A lot of the talk around the good old sphere is that AWS will be putting smaller hosting companies out of business. But that’s not going to happen unless they change their pricing model, which I doubt they will.

But before you all go getting your panties in a bunch when EC2 only turns left… Remember that EC2 is a tool. And just like you wouldn’t use a hammer to cut cleanly through a board, EC2 is not meant for all purposes… The trick to making genuinely good use of EC2 will be in playing off of its strengths… And avoiding its weaknesses.

Let’s face it… The Achilles’ heel of all the rampant early bird speculation is that the price of bandwidth for EC2 is rather high. Most hosting companies give you (with a low-end plan) 1000GB of transfer per month. Amazon charges $200 per month for that much transfer, whereas you can find low-end hosting for $60 and mid-range hosting for $150. Clearly this is not where EC2 excels, and I don’t think that the AWS team intended for it to excel here. How big of a headache would it be to run the servers which host every web site on the planet? A very big one.
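To put rough numbers on that, here is a minimal sketch using only the figures quoted above ($200 for 1000GB of transfer on EC2, flat plans at $60 and $150). The function names are made up for illustration, and the per-GB rate is simply derived from those two figures, not read off a price sheet.

```python
# Per-GB transfer price implied by the post: $200 for 1000 GB.
EC2_DOLLARS_PER_GB = 200.0 / 1000


def ec2_transfer_cost(gb_per_month: float) -> float:
    """Monthly EC2 bandwidth bill for a given amount of transfer."""
    return gb_per_month * EC2_DOLLARS_PER_GB


def break_even_gb(flat_plan_price: float) -> float:
    """Transfer volume at which EC2 bandwidth alone costs as much as a flat plan."""
    return flat_plan_price / EC2_DOLLARS_PER_GB


if __name__ == "__main__":
    for plan_price in (60.0, 150.0):
        print(f"${plan_price:.0f} plan breaks even at {break_even_gb(plan_price):.0f} GB/month")
```

With these numbers the $60 plan is matched at about 300GB a month and the $150 plan at about 750GB, so anything that pushes serious traffic is cheaper on the flat plan, and that is before you pay for the EC2 instance itself.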

What you *do* get at a *great* price is horsepower. For a mere $74.40/month (assuming 31 days) you get the equivalent of a 1.7GHz Xeon with 1.75GB of RAM. That’s not bad!
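That monthly figure is just an hourly rate multiplied out, which matters because you only pay for the hours an instance is actually running. A small sketch of the arithmetic, assuming nothing beyond the $74.40 and 31 days quoted above (the function names are my own):

```python
HOURS_PER_DAY = 24
MONTHLY_PRICE = 74.40  # dollars, from the post, one instance running all month
DAYS_IN_MONTH = 31     # the post's assumption


def implied_hourly_rate(monthly_price: float = MONTHLY_PRICE,
                        days: int = DAYS_IN_MONTH) -> float:
    """Hourly price per instance implied by an always-on monthly bill."""
    return monthly_price / (days * HOURS_PER_DAY)


def instance_cost(instances: int, hours: float, hourly_rate: float) -> float:
    """Cost of running some number of instances for some number of hours each."""
    return instances * hours * hourly_rate


if __name__ == "__main__":
    rate = implied_hourly_rate()
    print(f"implied rate: ${rate:.2f} per instance-hour")
    print(f"10 instances for 3 hours: ${instance_cost(10, 3, rate):.2f}")
```

The implied rate works out to ten cents per instance-hour, so a burst of ten machines for three hours costs about three dollars.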

But the real thrill comes with the understanding that additional servers can talk to each other over the network… for free. There is a private network (or QV) which you can make use of. This turns into a utility computing atom bomb. If you can minimize the amount of bandwidth used getting data to and from the machine, while maximizing its CPU and RAM utilization, then you have a winning combination which can take full advantage of the EC2 architecture. And if your setup is already using Amazon’s S3 storage solution… Well… Gravy.

Imagine running a site like, say, YouTube on EC2. The huge bill would kill you. The simple fact of the matter is that YouTube uses too much bandwidth in the receiving and serving of its users’ files. I would have to imagine that the numbers for its bandwidth usage per month are staggering! But let’s break out the things that YouTube has to manage, and where it might be able to best utilize EC2 in its infrastructure.

YouTube gets files from its users, converts those files into FLVs, and then makes those FLVs available via the internet. You therefore have 3 main actions that are performed: A) HTTP PUT, B) video conversion, and C) HTTP GET. If I were there, and in a position of evaluating where EC2 might prove useful to me, I would probably be recommending the following changes to how things work:

First, all incoming files will be uploaded directly to web servers running on EC2 AMIs. There’s no reason a file should be uploaded to a datacenter, then re-uploaded to EC2, and then sent back down to the datacenter. That makes no sense. So users upload to EC2 servers.

Second, the EC2 compute farm is in charge of all video conversion. Video conversion is, typically, a high-memory and high-CPU process (as any video editor will tell you). And when they built their datacenter I can assure you that this weighed heavily on their minds. You don’t want to buy too many servers. You pay for them up front, and you pay for them on the back end as well. Not only do you purchase X number of servers for your compute farm, but you have to be able to run them, and that means rack space and power. Let me tell you that those two commodities are not cheap in datacenters. You do not want servers sitting around doing nothing unless you have to! So how many servers they purchase and provision every quarter has a lot to do with their expected usage. If they don’t purchase enough, then the user has to wait a long time for his requests to complete. Too many and you’re throwing away your investors’ money (which they don’t particularly like). So the ability to turn servers in a compute farm on and off only when they are needed (and better yet: to only pay for them when they’re on) is a godsend. This will save oodles of cash in the long run.
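The provisioning problem can be sketched in a few lines. A fixed farm has to be sized for the busiest hour and then sits there for every other hour, while an on-demand farm only racks up server-hours when there is work. The demand numbers below are invented purely for illustration, as are the function names.

```python
from typing import Sequence


def fixed_farm_server_hours(demand: Sequence[int]) -> int:
    """Server-hours consumed by a farm sized for peak demand and left on throughout."""
    if not demand:
        return 0
    return max(demand) * len(demand)


def on_demand_server_hours(demand: Sequence[int]) -> int:
    """Server-hours consumed when servers run only while they are needed."""
    return sum(demand)


def idle_fraction(demand: Sequence[int]) -> float:
    """Share of the fixed farm's server-hours spent doing nothing."""
    fixed = fixed_farm_server_hours(demand)
    if fixed == 0:
        return 0.0
    return 1 - on_demand_server_hours(demand) / fixed


if __name__ == "__main__":
    # Hypothetical servers needed in each of four consecutive hours.
    demand = [2, 2, 10, 4]
    print(fixed_farm_server_hours(demand), on_demand_server_hours(demand))
    print(f"idle share of fixed farm: {idle_fraction(demand):.0%}")
```

For that made-up trace the fixed farm burns 40 server-hours to do 18 server-hours of work, so more than half of what you paid for sat idle. The spikier the upload pattern, the worse that ratio gets.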

At this point, as a side note, I would also be advising keeping long-term backups of content in the S3 service, as well as removing rarely viewed content from the datacenter and storing it in S3 only. This would reduce the amount of space that is needed at any one time in the real physical datacenter. Disks take up lots of power and lots of space. You don’t want to have to pay for storage you don’t actually need. The tradeoff here is that transferring the content from S3 back to the DC will cost some money, so it comes down to the cost of that versus the cost of running the storage hardware (or servers) yourselves. I would caution that you can move from S3 to a SAN, but moving from a SAN to S3 leaves you with a piece of junk which costs more than your house did ;D.
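That storage tradeoff boils down to one comparison: what you pay to park the content remotely plus what you pay to pull some of it back, against what the same gigabytes cost you on your own disks. A rough sketch, where every rate is a parameter you would have to fill in yourself (none of them are real prices, and the function name is hypothetical):

```python
def cheaper_to_archive(gb_stored: float,
                       gb_fetched_per_month: float,
                       archive_rate_per_gb: float,
                       fetch_rate_per_gb: float,
                       local_rate_per_gb: float) -> bool:
    """True if archiving remotely costs less per month than keeping the data local.

    archive_rate_per_gb: monthly price to keep one GB in the archive
    fetch_rate_per_gb:   price to transfer one GB back to the datacenter
    local_rate_per_gb:   monthly cost (disks, power, space) of one GB kept locally
    """
    archive_cost = (gb_stored * archive_rate_per_gb
                    + gb_fetched_per_month * fetch_rate_per_gb)
    local_cost = gb_stored * local_rate_per_gb
    return archive_cost < local_cost


if __name__ == "__main__":
    # Placeholder rates only: rarely fetched content versus heavily fetched content.
    print(cheaper_to_archive(1000, 10, 0.15, 0.20, 0.50))
    print(cheaper_to_archive(1000, 5000, 0.15, 0.20, 0.50))
```

The point of the two calls is that the same content flips from a good archive candidate to a bad one purely on how often it gets pulled back, which is why only the rarely viewed stuff belongs there.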

Third, the EC2 servers upload the converted video file and thumbnails to the primary (and real) datacenter. And it’s from here that the YouTube viewers would be downloading the actual content.

That setup would be when you *DO* use Amazon’s new EC2 service. You’ve used the strengths of EC2 (unlimited horsepower at a very acceptable price) while avoiding its weaknesses (expensive bandwidth, and paying for long-term storage, unless S3 ends up being economical for what you do).

That said… There are plenty of places where you wouldn’t want to use EC2 in a project. Any time you’ll be generating excessive amounts of traffic… you’re losing money compared to a physical hosting solution.

In the end there is a lot of hype, and there’s a lot of room for FUD / uninformed opinions (this blog post, for example, is an uninformed opinion; I’ve never used the service personally), and what people need to keep in mind is that not every problem needs this solution. I would argue that it’s very likely that any organization could find one or (probably) more very good uses for EC2. But hosting your static content is not one of them. God help the first totally EC2-hosted user who gets majorly slashdotted ;).

I hope you found my uninformed public service announcement very informative. Remember to vote for me in the next election 😉

cheers
Apok

Google Version Control

http://code.google.com/

Not at all a surprise offering from Stein 😀 I’m sure it will be top notch. (Remember, it’s still beta right now; of course it isn’t feature complete!)

What this could give Google is a distinct inroad to the emerging generation of web and application developers. What better way to know who to hire for a programming position than to have their entire development history available at a moment’s notice? Will this be the beginning of Google knocking on the door of candidates that THEY want, rather than candidates seeking Google out?

hmm.

6 tips for surviving in online communities

Tip #1: Be self assured. If people questioning your point of view frightens or upsets you, then you’re probably not ready to participate in an open membership (and especially anonymous) online community. Just keep in mind that you have your points, and whether someone agrees with you or not is simply not important.

Tip #2: Online is not Offline. What someone says online is likely not what they would have said offline. If you find yourself thinking something along the lines of “I dare you to come down here and say that to my face,” you obviously need to remember that online bullies are a lot like the bullies that you faced in third grade. They talk a big game, but in the grand scheme of things they just aren’t that important.

Tip #3: Patience. Not everybody is sitting at their PC hitting refresh, waiting for you to say something to which they can reply. Give it some time. This obviously changes somewhat depending on the medium… IRC might be a one to fifteen minute delay, whereas a message forum could be a day, and a mailing list might take several days.

Tip #4: Don’t beat a dead horse. This is the tip that I’m, personally, most guilty of violating. Whether people agree with you or not… once you have your opinion out it’s really not worth repeating it over and over and over and over and over… especially in the same threads of conversation. Now if you’re actively engaging in a debate this is less important than when answering a question. But even when debating you should always attempt to cast the same facts in different lights if you *have* to reiterate them.

Tip #5: Put on your fire suit. This tip is an extension or melding of tips 1 and 2. You really have to remember that this stuff is not life or death… It simply doesn’t matter. Really. It doesn’t. People are eventually going to disagree with you, and that’s OK. People are eventually going to berate you, chastise you, mock you, put you down, and generally act like asshats. Let them. At the end of the day they just don’t matter!

Tip #6: You don’t have to. You are not required to do anything you don’t want to. If you find yourself answering the same question again and again… and it’s making you angry… stop. It’s really simple. Someone else will fill in (or not), but no matter how much you feel that it is… it *isn’t* specifically your responsibility. And if you just can’t help yourself then write up a FAQ, and simply point to it, then leave it at that. It’s a lot better than blowing a gasket after answering the same “newbie” question 1,500 times in a row.