Let's first address the problem of the dynamic IP address on the load balancer, because it doesn't matter how good your EC2-side setup is if your clients can no longer reach your load balancer after a reboot. It's also complicated because normally you want two load balancers to act as a fail-over pair in case one of them pops for some reason. Which means that we not only need to have the load balancers register with something somewhere, we also need a method of de-registering them if, for some reason, they fail. And since downed machines usually don't do a good job of anything useful, we cannot count on them de-registering themselves unless we're shutting them down manually. Which we don't really plan on doing, now, do we?!
So here's the long and short of the situation: some piece of it, some starting point, has to be outside the cloud. Now I know what you're thinking: "but he just said we weren't going to be talking about outside the cloud." But no, no, I did not say that; I said that we weren't going to be talking about running a full proxy outside the cloud. I read that the EC2 team is working on a better solution for all of this, but for right now it's in a roll-your-own state, so let's roll our own, shall we?
The basic building block of any web request is DNS. When you type in www.amazonaws.com your machine automagically checks with DNS servers somewhere, somehow, and eventually gets back an IP address like 72.21.206.80. There can be multiple steps in this process: for example, when we looked up www.amazonaws.com it *actually* pointed to rewrite.amazon.com, and rewrite.amazon.com in turn pointed to 72.21.206.80. And this is a process we're going to take advantage of. But first, some discussion on the possible ramifications of doing this:
DNS (discussed above) is a basic building block of how the internet works, and as such has had a dramatic amount of code written around it over the years. The one kind of code which may cause us grief at this stage is the caching DNS server. Normally when you look up a name, you're asking your ISP's DNS servers to look it up for you. Since they don't know the answer, they ask one of the root name servers which server on the internet handles naming for that domain. Once they find that out, they ask something a lot like: "excuse me pdns1.ultradns.net, what is the address for rewrite.amazon.com?", to which your ISP gets a reply a lot like: "The address for rewrite.amazon.com is 72.21.206.80, but that's only valid for 5 minutes." For those 5 minutes the DNS server is allowed to remember that information, so when you ask again after 4 minutes it doesn't go back to the source, it simply repeats what it found out before. After 5 minutes, however, it's supposed to check again. But some DNS servers ignore that amount of time (called a Time To Live, or TTL) and cache the reply for however long they feel like (hours, days, weeks?!). When that happens, a client might not get the right IP address after a change, because a naughty caching DNS server refuses to look it up again for another week.
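The caching behaviour described above can be sketched in a few lines. This is a toy in-memory cache, not real DNS code; it just shows why a well-behaved cache picks up a new IP once the TTL expires, while one that stretches the TTL would keep serving the stale answer:

```python
import time

class TtlCache:
    """Toy DNS cache: remembers an answer only for its TTL."""
    def __init__(self, lookup, clock=time.time):
        self.lookup = lookup   # function: name -> (ip, ttl_seconds)
        self.clock = clock
        self.cache = {}        # name -> (ip, expires_at)

    def resolve(self, name):
        cached = self.cache.get(name)
        if cached and self.clock() < cached[1]:
            return cached[0]   # still fresh: answer from cache
        ip, ttl = self.lookup(name)          # go back to the source
        self.cache[name] = (ip, self.clock() + ttl)
        return ip
```

With a 5-minute TTL, a lookup at the 4-minute mark returns the remembered (possibly stale) address, and a lookup after the 5-minute mark fetches the new one. A server that ignores the TTL behaves as if `expires_at` were days away.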
Alas, there is nothing we can do to fix that. I only mention it so that people don't come knocking down my door yelling at me about a critical design flaw when it comes to edge cases. And to caution you: when your instance is a load balancer, it's *ONLY* a load balancer. Don't use it to run cron jobs; I don't care if it's got extra space and RAM, just leave it be. The fewer things happening on your load balancer, the fewer chances of something going wrong, the lower the chance of a new IP address, and the lower the chance of running into the above problem, right? Right!
So when you choose a DNS service, choose one which meets the following criteria:
- An API: you need scriptable access to your DNS service
- A low TTL (1-2 minutes), so that when something changes clients only have 60 or 120 seconds to wait
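On the API point: updating a record usually boils down to a single authenticated HTTP call. The endpoint, parameter names, and auth scheme below are entirely made up for illustration (every provider's API differs); the point is just that repointing a name must be something a script can do:

```python
from urllib.request import Request

def build_update_request(api_key, host, ip):
    """Build the HTTP request that points `host` at `ip`.
    The URL and parameters are hypothetical; substitute your
    DNS provider's real update API here."""
    url = ("https://dns.example-provider.com/api/update"
           f"?host={host}&ip={ip}&ttl=60")
    return Request(url, headers={"Authorization": f"Bearer {api_key}"})

# Sending it would then be something like:
# urllib.request.urlopen(build_update_request(key, "lb.example.com", my_ip))
```

If your provider can't do something equivalent to this from a script, it won't work for the scheme described below.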
Ideally you will have two load balancer images, LB1 and LB2 (for the sake of me not having to type long names every time). You can also do this dynamically (i.e. X number of load balancers off the same image), and if you're a good enough scripter to pull that off, then HOW to do it should be fairly obvious.
When LB1 starts up, it will automatically register itself as lb1.example.com via your DNS provider's API. It will then check for the existence of lb.example.com; if that's not set, it will create it as pointing to itself. If lb.example.com was previously set, it will perform a check (an HTTP GET, or even a ping) to make sure that LB2 (which is currently "active" at lb.example.com) is functional. If LB2 is not functional, LB1 registers itself as lb.example.com. LB2 performs the same startup sequence, but with lb1 and lb2 switched where necessary.
Now, at regular intervals (let's say 60 seconds), LB1 checks the health of LB2 and vice versa. If something happens to one of them, the other will, if necessary, register itself at lb.example.com.
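The startup-and-watchdog sequence above might be sketched like this, with the DNS calls and the health check abstracted into callables (they depend entirely on your provider and your setup, so everything here is an assumed interface, not a real API):

```python
def startup(my_name, my_ip, dns_get, dns_set, is_healthy):
    """Startup sequence for one load balancer.
    dns_get/dns_set wrap your DNS provider's API; is_healthy
    does an HTTP GET (or ping) against an IP and returns a bool."""
    dns_set(my_name, my_ip)                  # register e.g. lb1.example.com
    active = dns_get("lb.example.com")
    if active is None or (active != my_ip and not is_healthy(active)):
        dns_set("lb.example.com", my_ip)     # claim the active name

def watchdog(my_ip, peer_ip, dns_get, dns_set, is_healthy):
    """Run every ~60 seconds: take over lb.example.com if the
    peer currently holds it but has stopped answering."""
    if dns_get("lb.example.com") == peer_ip and not is_healthy(peer_ip):
        dns_set("lb.example.com", my_ip)
```

LB2 runs the exact same code with its own name and IP, which is what makes the pair symmetric.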
Well, I think that basically covers how this would work outside the EC2 cloud; next I'll deal with what happens inside the EC2 cloud. (That piece isn't written yet… so it'll take a bit longer than the last two.)
Any suggestions on DNS providers that have good APIs?
Only a couple, really. I'd love to have an expanded list, so if anybody and everybody were willing to post their thoughts on good DNS providers for this type of application…
On the upper end of the scale I know of UltraDNS (ultradns.net), and through word of mouth I know that DNS Made Easy (dnsmadeeasy.com) offers a low TTL and an API. I don't have many details on that one, though.
Past that, I've mainly run my own DNS using BIND, so I don't have a good lay of the DNS-provider land. And I don't have any experience wrapping BIND in an API, though there might be an open-source project out there which does just that, or an OSS replacement for BIND which natively offers an API.
DNS Made Easy's domain failover service looks very interesting for this. It does the monitoring for you, and switches the domain over to a backup server if necessary. You just set the TTL really low (e.g. 180 seconds) for the domain, and they monitor every 2-4 minutes, so you could be down for a max of 7 minutes (ignoring the propagation problems, caching, etc.). I reckon this could work pretty well with EC2… especially if you can actually change the IP of the backup server via an API too.
I've only just set the service up, but it does look pretty good so far.
Very interesting! I hope you'll send some of your findings on the DNS service my way?
Certainly will! Am in the process of setting all the DNS stuff up at the mo. I love the idea of being able to keep firing up instances as a backup if the main server fails.
Actually I'm quite optimistic about using the DNS failover in this way. What I'm now less sure about is using S3 as the backup to store home directories, etc. Loading all the volatile data (e.g. html files, php, etc.) from S3 onto the EC2 instance seems to take hours, far too long to make it viable as a switch-on-and-go backup 🙁 Though each time I test it, it seems to get faster… which is at least encouraging!
Except that the default TTL for DNS lookups in a JVM is *forever*, so unless your clients start their JVMs with the cache TTL property (networkaddress.cache.ttl) set to something sane, you're in trouble. As long as your clients aren't JVMs or Java applets (whose hostname caches cannot be changed, for security reasons) this is a non-issue.
oooh interesting caveat. (I'm not a java guy personally…)
It's documented on Sun's site:
http://java.sun.com/j2se/1.5.0/docs/guide/net/pro…
networkaddress.cache.ttl (default: -1)
networkaddress.cache.negative.ttl (default: 10)
The infinite cache was added to stop long-lived applets from getting access to hosts behind a firewall through DNS abuse, but it stays on in regular apps too. Really old JVMs (1.2 and maybe 1.3) saved negative DNS entries forever; this is bad news, as a temporary outage of a DNS server would never be recovered from.
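If you do control the JVMs involved, those two properties can be overridden in the JRE's security properties file (java.security, under the JRE's lib/security directory; path may vary by version), e.g.:

```properties
# re-check successful lookups after 60 seconds instead of caching forever
networkaddress.cache.ttl=60
# keep failed lookups for 10 seconds (the default)
networkaddress.cache.negative.ttl=10
```

The values here are illustrative; pick a positive TTL roughly in line with your DNS records' TTL so a failed-over load balancer is noticed promptly.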
TTLs: they are there for a reason, and it is the duty of client-side apps to respect them.
I wonder if proxies have similar behaviour? Interesting thought…
Yes, there's a reason that DNS has been used for so long: it's robust, and it works.
This is the kind of deviation from standards which makes life difficult in large paradigm shifts like this one. I'm sure that whoever implemented it thought to themselves, "a DNS entry worth hard-coding is not likely to ever change anyway…" However, now that we're moving into the compute-as-a-service era, that statement no longer holds anywhere near true.
I expect that this will eventually be addressed as services like EC2 become more widespread in use, especially now that Java is going OSS. (This is exactly the kind of "bug" fixing that OSS is exceptionally well suited to deal with.)
Instead of the extra addresses lb1.example.com and lb2.example.com, wouldn't it be possible for each load balancer to just regularly check whether lb.example.com works, and when it doesn't, register its own IP via the DNS API?