I’m always excited when we see something new for Amazon Web Services:

http://www.openfount.com/blog/s3infidisk-for-ec2

This certainly looks very interesting! I can’t help but wonder whether the memory caching in the enterprise version is enough to run small MySQL instances on. At the very least, being able to mysqldump regularly to a file directly on S3 would be useful, as opposed to running mysqldump to a file, splitting it into chunks, and copying the chunks off to S3.

Perhaps I’ll contact them next week and see if they’ll let me take it for a test drive?!

(PHP code) Gracefully handling the failure of TCP resources

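  // Returns TRUE if a TCP connection to $host:$port can be opened quickly.
  function check_tcp_active($host, $port) {
    $socket = @socket_create(AF_INET, SOCK_STREAM, SOL_TCP);
    if ( $socket === FALSE ) {
      return(FALSE);
    }
    // Half-second timeouts. On Linux the send timeout also bounds connect().
    $timeout = array("sec" => 0, "usec" => 500000);
    socket_set_option($socket, SOL_SOCKET, SO_RCVTIMEO, $timeout);
    socket_set_option($socket, SOL_SOCKET, SO_SNDTIMEO, $timeout);
    $result = @socket_connect($socket, $host, $port);
    // Close the socket whether or not the connection succeeded.
    socket_close($socket);
    return($result ? TRUE : FALSE);
  }

  // Returns the first reachable host/port pair, or FALSE if none respond.
  function find_active_server($array) {
    // Format: $array['127.0.0.1']=3306
    if ( is_array($array) ) {
      foreach ( $array as $host => $port ) {
        if ( check_tcp_active($host, $port) ) {
          return(array('host' => $host, 'port' => $port));
        }
      }
    }
    return(FALSE);
  }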

  $mysqlServers=array(
    '127.0.0.1'    => 3306,
    '192.168.0.10' => 3306,
    '192.168.0.11' => 3306,
    '192.168.0.12' => 3306,
    '192.168.0.13' => 3306,
    '192.168.0.14' => 3306,
  );

  $goodMysqlHost=find_active_server($mysqlServers);

With only a very small amount of work, a pseudo-random load distribution would be possible. Hope this helps someone 🙂
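One way to get that pseudo-random distribution is to start probing at a random position in the server list instead of always at the top. Here is a minimal sketch of the idea, written in shell rather than PHP; the function name rotate_servers is made up for illustration, and the same trick ports directly to the PHP array above.

```shell
#!/bin/bash
# rotate_servers HOST...: print the same hosts, one per line, starting at a
# random position and wrapping around. Probing them in this order spreads
# clients across servers while still falling through to the next live one.
rotate_servers() {
  local -a servers=("$@")
  local n=${#servers[@]}
  (( n == 0 )) && return 0
  local start=$(( RANDOM % n ))
  local i
  for (( i = 0; i < n; i++ )); do
    printf '%s\n' "${servers[(start + i) % n]}"
  done
}
```

For example, rotate_servers 127.0.0.1:3306 192.168.0.10:3306 192.168.0.11:3306 gives a probing order that differs from one call to the next, and the first entry that answers wins.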

HA EC2 Part #2: Load Balancing the Load Balancer

Let’s first address the problem of the dynamic IP address on the load balancer, because it doesn’t matter how good your EC2-side setup is if your clients can no longer reach your load balancer after a reboot. This is complicated by the fact that you normally want two load balancers acting as a fail-over pair in case one of them pops for some reason. That means we not only need the load balancers to register with something somewhere, we also need a method of de-registering them if, for some reason, they fail. And since downed machines usually don’t do a good job of anything useful, we cannot count on them de-registering themselves unless we’re shutting them down manually. Which we don’t really plan on doing, now, do we?!

So here’s the long and short of the situation. Some piece of it, some starting point, has to be outside the cloud. Now I know what you’re thinking: “but he just said we weren’t going to be talking about outside the cloud.” But no, no, I did not say that; I said that we weren’t going to be talking about running a full proxy outside the cloud. I read that the EC2 team are working on a better solution for all of this, but for right now it’s in a roll-your-own state, so let’s roll our own, shall we?

The basic building block of any web request is DNS. When you type in www.amazonaws.com your machine automagically checks with DNS servers somewhere, somehow, and eventually gets an IP address like this: 72.21.206.80. Now there can be multiple steps in this process, for example when we looked up www.amazonaws.com it *actually* points to rewrite.amazon.com, and finally rewrite.amazon.com points to 72.21.206.80. And this is a process we’re going to take advantage of. But first, some discussion on the possible ramifications of doing this:

DNS (discussed above) is a basic building block of how the internet works, and as such has had a dramatic amount of code written around it over the years. The one type of code which may cause us grief at this stage is the caching DNS server. Normally when you look up a name you’re asking your ISP’s DNS servers to look the name up for you, and since they don’t know it, they ask one of the primary name servers which server on the internet handles naming for that domain. Once your ISP’s server finds that out, it asks, a lot like this: “excuse me pdns1.ultradns.net, what is the address for rewrite.amazon.com?” To which it gets a reply a lot like “The address for rewrite.amazon.com is 72.21.206.80, but that’s only valid for 5 minutes.” So for 5 minutes the DNS server is allowed to remember that information. If you ask again after 4 minutes it doesn’t go to the source, it simply spouts off what it found out before. After 5 minutes it’s supposed to check again… But some DNS servers ignore that amount of time (called a Time To Live (TTL)) and cache the reply for however long they feel like (hours, days, weeks?!). When this happens, a client might not get the right IP address if there has been a change and a naughty caching DNS server refuses to look it up for another week.

Alas, there is nothing we can do to fix that. I only mention it so that people don’t come knocking down my door yelling at me about a critical design flaw when it comes to edge cases. And to caution you: when your instance is a load balancer, it’s *ONLY* a load balancer. Don’t use it to run cron jobs. I don’t care if it’s got extra space and RAM, just leave it be. Because the fewer things happening on your load balancer, the fewer chances of something going wrong, the lower the chance of it coming back with a new IP address, and the lower the chance of running into the above problem, right? Right!

So when you choose a DNS service, choose one that meets the following criteria:

  • API, you need scriptable access to your DNS service
  • Low (1-2 minutes) TTL
    (so that when something changes you only have 60 or 120 seconds to wait)

Ideally you will have two load balancer images: LB1 and LB2 (for the sake of me not having to type long names every time). You can do this dynamically (i.e. X number of load balancers off the same image), and if you’re a good enough scripter to be able to do it, then HOW to do it should be fairly obvious.

When LB1 starts up it will automatically register itself at lb1.example.com via your DNS provider’s API. It will then check for the existence of lb.example.com; if that’s not set, it will create it as pointing to itself. If lb.example.com was previously set, it will perform a check (HTTP GET (or even a ping)) to make sure that LB2 (which is currently “active” at lb.example.com) is functional. If LB2 is not functional, LB1 registers itself as lb.example.com. LB2 performs the same startup sequence, but with lb1 and lb2 switched where necessary.

Now, at regular intervals (let’s say 60 seconds), LB1 checks the health of LB2 and vice versa. If something happens to one of them the other will, if necessary, register itself at lb.example.com.
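The startup sequence and the periodic check boil down to the same few steps, so they can share one function. What follows is only a sketch under loud assumptions: dns_lookup and dns_register are hypothetical stand-ins that keep “records” as files under $DNS_DIR, because every DNS provider’s API is different, and is_healthy assumes a plain HTTP GET with curl is a good enough health check.

```shell
#!/bin/bash
# Stand-in DNS "API": records are files under $DNS_DIR. Replace these two
# functions with calls to your real DNS provider (hypothetical names).
DNS_DIR="${DNS_DIR:-/tmp/fake-dns}"
dns_lookup()   { cat "$DNS_DIR/$1" 2>/dev/null || true; }
dns_register() { mkdir -p "$DNS_DIR" && printf '%s\n' "$2" > "$DNS_DIR/$1"; }

# Health check: an HTTP GET that must succeed within 5 seconds.
is_healthy() { curl -fs --max-time 5 -o /dev/null "http://$1/"; }

# check_and_claim MY_NAME MY_IP SHARED_NAME
# Register our own name, then take over the shared name if nobody holds it
# or if the address currently holding it fails the health check.
check_and_claim() {
  local my_name=$1 my_ip=$2 shared=$3 active
  dns_register "$my_name" "$my_ip"
  active=$(dns_lookup "$shared")
  if [ -z "$active" ]; then
    dns_register "$shared" "$my_ip"    # shared name not set yet: claim it
  elif [ "$active" != "$my_ip" ] && ! is_healthy "$active"; then
    dns_register "$shared" "$my_ip"    # the active peer is down: take over
  fi
}
```

On LB1 this would run once at boot and then from a loop along the lines of: while true; do check_and_claim lb1.example.com "$MY_IP" lb.example.com; sleep 60; done. LB2 does the same with its own name. There is no locking here, so both boxes could claim the shared name within the same minute; the last write wins, which is tolerable for a fail-over pair.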

Well, I think that basically covers the portion of how this would work outside the EC2 cloud, next I’ll deal with what happens inside the EC2 cloud. (piece not written yet… so it’ll take a bit longer than the last two)

SVN + RoR = Passive Version Controlled Goodness!

While working with both rails and subversion (which I like using to keep my projects under version control) I was irritated by having to go through and add or delete a bunch of files when using the code generation tools. Especially when first putting the project together, there always seemed to be 6 new files to manually add before every commit… So I wrote a script to handle the adding of new, and removing of missing files for a commit.

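#!/bin/bash
# Add unversioned files and remove missing ones ahead of a commit.
# Top-level directories that should never be added (logs and temp files).
IGNORE_DIRS='^\?[ ]+(log|logs|tmp)$'
# Split svn output on newlines only, so paths containing spaces survive.
IFS="
"
# "?" marks unversioned paths: add them.
for i in $(svn st | grep -E '^\?' | grep -Ev "$IGNORE_DIRS")
  do
    # Strip the status column and the padding that follows it.
    i=$(echo "$i" | sed -E 's/^\? +//')
    svn add "$i"
done
# "!" marks paths that are versioned but missing on disk: delete them.
for i in $(svn st | grep -E '^!')
  do
    i=$(echo "$i" | sed -E 's/^! +//')
    svn del "$i"
done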

Now I just run ./rail_svn.sh and then svn ci, and everything is always version controlled. Very nice. The only thing you have to watch out for is leaving files lying around (I’ve had a couple of commits which, along with code, also involved removing a vim .swp file or two).

I would be willing to bet that this script would be a decent foundation for a passively version-controlled-directory system if anyone were to want to do something like that with svn (think mail stores, or home directories, or anything in which files or directories are added or removed often). This is mainly needed because svn was designed to be an active version control system.

QMAIL-TOASTER remote redelivery loop problem

I recently switched from my old gentoo server to a new FC5 server. I opted to go with a qmail-toaster setup because, while I’m perfectly capable of manually making my desired qmail+vpopmail setup, I just didn’t want to spend the personal time doing it. So I figured I would give the toaster project a try. And I have to say that I’m fairly impressed.

A lot of the core technological things that it did were done in basically the same way that I would have done them manually (which is bidirectionally gratifying for me) and there are some bells and whistles that are *nice* but I wouldn’t have bothered setting them up on my own (e.g. qmailmrtg graphical log analysis.)

I did (hopefully did and not still do) have one oddball problem with it. After switching over, there were certain servers from which I would get the same message over and over. Everything in my logs showed a successful delivery, and it’s not as though the messages were stuck in my queue either; the remote servers would actually reconnect and deliver the message again.

Well, for a while I had better things to do with my scant time than deal with this one inconvenient (but not critical) issue. Today I finally cracked. It’s probably because I’ve now gotten one particular message something on the order of 30 times. Thinking about the problem and examining my logs, it seemed that the only time this happened was when a message was processed by simscan for viruses (clamd) and spam (spamd) at the SMTP transmission level. But that was not the complete story, because other messages from other servers did not have this problem even though they went through simscan as well.

On a hunch I figured that the sending mail server was probably only designed to wait X number of seconds (or microseconds) after finishing the transmission before expecting to get a status code back from my SMTP daemon. If it takes too long, the remote sending server might just assume the connection was lost and re-queue the message for redelivery. So I disabled spam and virus scanning in simscan:

# echo ":clam=no,spam=no,spam_hits=12,attach=.mp3:.src:.bat:.pif" \
  > /var/qmail/control/simcontrol
# /var/qmail/bin/simscanmk
# /var/qmail/bin/simscanmk -g
# qmailctl restart

And the problem *seems* to have gone away. I’m not worried about viruses at this point because I’m running OS X as my desktop, and Thunderbird is usually pretty good about spam… so… no loss for me there.

I’m mainly writing this down here so that if someone else has this problem, and is floundering while searching for an answer, they might have a better chance of finding a helpful hint. Searching for things like redelivery and mail loops on Google yields nothing of any value at present.

Cheers
DK

Series: CRM on S3 & EC2, Part2

So we’ve touched a bit on what to look for in your database. The comments made were by no means specific, and the requirements will vary from place to place. But the underlying principles are what are really important there. Now let’s move on to something a bit more specific: backup.

There is an important caveat to this information: nobody has done this enough to really have a set of scalable one-size-fits-all tools (or a tool chain) fit for the job… You’ll have to be OK with doing some in-house experimentation, and be OK with the idea of maybe making a couple of missteps along the way. As is the case with any new (OK, new to YOU) technology, there are some things you just have to learn as you go.

Setting up a system that is fault tolerant, and in which you manage your risks, requires a balance of acceptable versus unacceptable trade-offs. Your main types of backups are as follows:

A) Simple local backup. Your old stand-by tar and his friends bzip2, gzip, and even compress. They’ve been doing backups since backups were backups (or almost, anyhow) and they are good friends. In this kind of situation they aren’t the whole solution, but you can bet your butt that they’re a part of it.

B) Hard-copy backup. This isn’t what you want, but it’s worth mentioning. This kind of backup consists of hard disks, tapes, CDs, DVDs, etc., which are copied to and then physically removed from the machine. The advantage of this type of backup is that you can take it offsite in case of a local disaster, but in an EC2+S3 business there is no such thing as a local disaster. So if you just copy down your latest backups from S3 once per week/month/whatever, that should suffice.

C) Copy-elsewhere backup. This is going to be the bread and butter for the bulk of the solution. It’s not the entire solution, but it’s a fairly big piece. In this case S3 is your “elsewhere”.

D) Streaming backups. Examples of streaming backups are MySQL’s replication, or pushing data into an Amazon SQS pipe for playback at a later point. This is also a key player in what will surely be your ending strategy.

Well, that was fun. Learning just enough to be dangerous but not enough to actually do anything… And certainly not enough to answer the question. So let’s get to it.

You will have two distinct areas of backup which will be important to you. You have the UI end, and the DB end. Both these sections should be approached with different goals in mind, because the usage pattern on them ends up being different.

The Front End

You’ve no doubt got a development environment set up somewhere. As you make bug fixes to this environment, or add features, or change layouts to announce your IPO, or whatever, you need to push a snapshot to your servers *AND* any new servers you bring up need to have the new UI code and not the old UI code.

For the sake of argument, here, I’ll assume that you have an SVN or CVS server which holds your version-controlled code (you *ARE* using version control, right?). So your build process should, ideally, update the stable branch on your revision control server and send out a message to your UI servers that an update is available. They should then download the new code from the revision control server to a temporary directory, and once it’s there you pull the fast-move trick:

$ mv public_html public_html.$(date +%s) && mv public_html.new public_html

At this point all of your UI servers have received the message at the same time, and they update at the same time. Any new server should have, in its startup scripts sometime after the network is brought up, a process which performs the above update before even bringing up your HTTP service.
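One thing worth guarding against: if the download failed and the new directory never showed up, the first mv in the fast-move trick still runs and leaves you with no public_html at all. Here is a minimal sketch of the same swap with that check added; swap_release is a made-up name, and fetching the code from revision control is assumed to have happened beforehand.

```shell
#!/bin/bash
# swap_release NEW_DIR LIVE_DIR
# Move the freshly downloaded NEW_DIR into place as LIVE_DIR, keeping the
# previous LIVE_DIR around under a timestamped name.
swap_release() {
  local new_dir=$1 live_dir=$2
  if [ ! -d "$new_dir" ]; then
    echo "swap_release: $new_dir is missing, leaving $live_dir untouched" >&2
    return 1
  fi
  if [ -d "$live_dir" ]; then
    mv "$live_dir" "$live_dir.$(date +%s)" || return 1
  fi
  mv "$new_dir" "$live_dir"
}
```

From a startup script or an update handler this would be called as swap_release public_html.new public_html, with the HTTP service started only if it returns success. The timestamped directories pile up, so something should prune the old ones eventually.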

And that was the easy part… Now for MySQL

As for MySQL, I’ve already outlined my thoughts on that in my article “MySQL on Amazon EC2 (my thoughts)”. Which options you choose here depends on a couple of things: first, the skill level of the people who will be implementing the entire procedure, *AND* the skill level of the people who will be maintaining it (if those people aren’t the same people). But one very serious word of caution: whatever you do, stop thinking of an EC2 instance as 160GB of space for MySQL and start thinking of it as 60GB (70GB MAX), because backing up something that you do not have the space to copy is a difficult task which normally requires bringing things offline. Trust me on this.

My gut feeling is that your best bet for now would be to own or rent one physical server as the write server for your database setup: something roughly equal to the specs of the EC2 virtual machine, except with 320GB of disk space. You could keep your replication logs around for the entire history of your database… for a while.

You should also keep one extra MySQL instance (on EC2 if you like) up and running for the sole purpose of being up to date. You would then periodically turn it off and copy the entire thing up to S3. When you have to bring up a new instance, you simply copy those files down, assign the server-id, and let it suck everything new down via replication.

The gotcha here is that this won’t last forever… at least not on one database. There will come a time, if you get a “lot” of usage, when the process of downing a server, copying it, bringing it up, and waiting for replication will become infeasible. It will eventually just stop adding up. It’s at that point you will have to make a couple of careful choices. If you have properly laid out your schema, you can pull your single monolithic database apart, distribute it amongst several database clusters, and carry on as you have been. If you have properly laid out your schema in a different way, you will be able to assign certain users to certain clusters and simply write a migration tool for moving users and their data around between database clusters. If you have not properly laid out your data, you can choose whether to spend time and money re-working your application to make it right, or to spend time and money on buying big “enterprise class hardware” to give yourself time to make things right.

Unless you can truly count on being able to bleed money later on, you’ll want to VERY CAREFULLY consider your schema now. It will make all the difference. And if you end up with 2+TB of data which is completely unmanageable… well, don’t say I didn’t warn you… Those kinds of optimizations may seem silly now when you’re only expecting 5-25GB of data, but they won’t be silly in 2-4 years.

Series: CRM on S3 & EC2, Part1

Danny de Wit wrote in with a request for collaboration on how to best use EC2 and S3 for his new Ruby On Rails CRM application. And I’m happy to oblige.

At this point I don’t know much about what he’s doing, so I hope to start rough, open a dialogue with him, and work through the exercise over a bit of time.

The story so far

We have a Rails front end, a database backend, EC2, and S3.

Well… that was a quick rundown…

Summary of what we will need to accomplish the task on S3 and EC2

First off, we will need to be able to think outside the traditional boxes. But I think Danny is open to that. Second, we will need to deal with the database itself. Third, we have to deal with the issue of dynamic IP addresses. Fourth, we have to deal with some interesting administrative glue (monitoring, alerting, responding). Fifth, we have to deal with backups. And finally, we have to deal with code distribution.

Now, Where do we start?

First we should start with the database. I won’t lie to you: most of the challenge in using these services will be centered around the database. We need to examine how it’s laid out, how it’s accessed, and what our expectations are when it comes to size. Specifically, what we need to look for are two main things: A) bottlenecks, and B) data partitioning strategies.

Bottlenecks. We have to examine where we may or may not have trouble as far as data replication goes. Because if we are making hourly backups and we have to bring up another server at the half-hour mark, we’re going to need a strategy in place to bring the data store up to date. The layout of the database can make this particularly prohibitive, or it can make it very easy. And besides… having a bunch of servers doesn’t help if they can’t stay in sync.

Data partitioning. It’s easy to say “later on we’ll just distribute the data between multiple servers,” but unless you’ve planned for a layout which supports this, you might have a particularly difficult time doing so without major reworking of your application. Data partitioning can be your friend in the speed department as well. If you’re thoughtful about HOW you store your data, you can use the layout itself to your advantage. For example, a good schema might actively promote load balancing, where a bad schema will cause excessive load on particular segments. A good schema will actually act as an implied index for your data, and a bad schema will require excessive sorting and indexing.
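To make the partitioning idea a little more concrete, here is a deliberately tiny sketch of one possible layout: every row belongs to a user, and the user ID alone decides which database cluster holds that user’s data. The cluster hostnames and the function name are made up for illustration; nothing here comes from Danny’s application.

```shell
#!/bin/bash
# Hypothetical cluster hostnames, purely for illustration.
CLUSTERS=(db0.example.com db1.example.com db2.example.com)

# cluster_for_user USER_ID: print the cluster that owns this user's rows.
# Numeric user IDs are assumed; the ID modulo the cluster count picks one.
cluster_for_user() {
  local user_id=$1
  local idx=$(( user_id % ${#CLUSTERS[@]} ))
  echo "${CLUSTERS[idx]}"
}
```

With three clusters, cluster_for_user 7 prints db1.example.com. The modulo rule spreads users evenly, but changing the number of clusters reshuffles almost everyone, so a lookup table mapping each user to a cluster is the friendlier choice if you expect to move users around later.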

So what now?

So, Danny, the ball is in your court. You have my e-mail address. You have my blog address. Let’s get together and talk database before we move forward into the glue.