so-you-wanna-see-an-image

We’ve been asked how we manage serving files from Amazon’s very cool S3 service at WordPress.com… This is how. (This covers a requested image already stored on S3, not the upload -> S3 process.)

A request comes into Pound for a file. Pound hashes the hostname (via a custom patch which we have not, but may, release) to determine which of several backend servers the request should hit, then forwards the request to that server. This, of course, means that a given blog always serves from the same backend server. The only exception to the aforementioned rule is if that server is, for some reason, unavailable, in which case Pound picks another server to serve that hostname from temporarily.
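The hostname hashing above can be sketched in a few lines of Ruby. This is a toy illustration only: the actual Pound patch is unreleased, so the hash function, backend list, and failover rule here are invented stand-ins.

```ruby
require 'zlib'

# Invented backend pool; the real one lives in Pound's config.
BACKENDS = %w[10.0.0.1 10.0.0.2 10.0.0.3 10.0.0.4]

def backend_for(hostname, available = BACKENDS)
  # The same hostname always hashes to the same backend...
  primary = BACKENDS[Zlib.crc32(hostname) % BACKENDS.size]
  return primary if available.include?(primary)
  # ...unless that backend is down, in which case a temporary
  # stand-in is picked from whatever is still up.
  available[Zlib.crc32(hostname) % available.size]
end
```

The point of hashing rather than round-robining is cache locality: each blog's files keep landing on the same varnishd cache.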

The request then comes into varnishd on the backend server. The varnishd daemon checks its 300GB file cache and (for the sake of this example) finds nothing (hey, new images are uploaded all the time!). Varnishd then checks with the web server (running on the same machine, just bound to a different IP/port), where the request is handled by a custom script.

So, an HTTP daemon on the same backend server runs the file request. The custom script checks the DB to gather information on the file (specifically which DCs it is in, size, mod time, and whether it’s deleted or not); all this info is saved in memcached for 5 minutes. The script then increments and checks the “hawtness” (term courtesy of Barry) of the file in memcached. If the file has been accessed more than a certain number of times it is deemed “hawt”, and a special header is sent with the response telling varnishd to put the file into its cache. Once that happens, the request is served directly by varnishd in the previous paragraph and never hits the httpd or this script again (at least not until the cache entry expires). At this point, assuming the file should exist (deleted = 0 in the files DB), we fetch the file from a backend source.
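The hawtness counter could be sketched roughly like this. The threshold, key scheme, and header name are my guesses (the post doesn’t publish the real values), and a plain Hash stands in for the memcached client:

```ruby
# Assumed cutoff; the real "hawt" threshold isn't published.
HAWT_THRESHOLD = 10

def record_hit(path, counters, threshold = HAWT_THRESHOLD)
  # Increment the per-file hit counter (memcached incr in real life).
  hits = counters[path] = counters.fetch(path, 0) + 1
  headers = {}
  # Once a file is "hawt", tell varnishd to cache it so future requests
  # are served from the cache and never reach this script again.
  headers["X-Cache-Me"] = "yes" if hits >= threshold
  headers
end
```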

Which backend source depends on where the file is available. The order of preference is as follows: always fetch from Amazon S3 if the file lives there (no matter what; the following preference only ever applies if, for some reason, s3 = 0 in the files DB), and if that fails, fetch from the one files server we still have (which has larger, slower disks, and is used for archiving and fault tolerance only).
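That preference order might look something like this sketch, where the row fields and fetcher interface are placeholders rather than the real script:

```ruby
# Try S3 first whenever the files DB says the object lives there
# (row[:s3] == 1), falling back to the single archive files server
# if the S3 fetch fails or the file was never pushed to S3.
def fetch_from_backend(row, fetchers)
  order = row[:s3] == 1 ? [:s3, :files_server] : [:files_server]
  order.each do |src|
    data = fetchers[src].call(row[:path]) rescue nil
    return data if data
  end
  nil  # nowhere to get it; 404 time
end
```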

After fetching the file from the back end… the custom script hands the data and programmatically generated headers to the HTTP daemon, which hands the data to varnishd, varnishd hands it to Pound, Pound hands it to the requesting client, and the image appears in the web browser.

And there was much rejoicing (yay.)

For the visual people among us who like visuals and stuff… (I like visuals…) here goes…

Autumn Leaves Leaf #3: Commander

This leaf is capable of running a script on the local server in response to the !deploy channel command. For security you have to authenticate first. To do so you send the bot a private message containing a password; it then HTTP-authenticates against a specific URL with your nickname and the message text as the password. If the fetched file matches predesignated contents, you are added to the internal ACL. Anyone in the ACL can run the !deploy command. If you leave the channel, join the channel, change nicks, or quit IRC you are removed from the ACL and have to re-authenticate. This could be adapted to any system command for any purpose. I ended up not needing this leaf, but I still wanted to put it out there since it’s functional and useful.

require 'net/http'
require 'net/https'

class Commander < AutumnLeaf
  
  before_filter :authenticate, :only => [ :reload, :sync, :quit, :deploy ]
  $authenticated = []

  def authenticate_filter(sender, channel, command, msg, options)
    return true if $authenticated.include?(sender)
    return false
  end

  def did_receive_private_message(sender, msg)
    # assumes there is a file at 
    # http://my.svnserver.com/svn/access 
    # whose contents are "granted" 
    Net::HTTP.start('my.svnserver.com') {|http|
      req = Net::HTTP::Get.new('/svn/access')
      req.basic_auth(sender, msg)
      response = http.request(req)
      $authenticated << sender if response.body == "granted"
    }
  end

  def someone_did_quit(sender, msg)
    $authenticated.delete(sender)
  end

  def someone_did_leave_channel(sender, channel)
    $authenticated.delete(sender)
  end

  def someone_did_join_channel(sender, channel)
    $authenticated.delete(sender)
  end

  def deploy_command(sender, channel, text)
    message "deploying..."
    system("sudo /usr/local/bin/deploy.sh 1>/dev/null 2>/dev/null")
  end

end

Autumn Leaves Leaf #2: Feeder

This handy little bot keeps track of RSS feeds and announces in the channel when one is updated. (Note: be sure to edit the path to the data files.) Each poller runs inside its own Ruby thread and can be run on its own independent schedule.

require 'thread'
require 'rss/1.0'
require 'rss/2.0'
require 'open-uri'
require 'fileutils'
require 'digest/md5'

class Feeder < AutumnLeaf

  def watch_feed(url, title, sleepfor=300)
    message "Watching (#{title}) [#{url}] every #{sleepfor} seconds"
    feedid = Digest::MD5.hexdigest(title)
    Thread.new {
      while true
        begin
          content = ""
          open(url) { |s|
            content = s.read
          }
          rss = RSS::Parser.parse(content, false)
          rss.items.each { |entry|
            digest = Digest::MD5.hexdigest(entry.title)
            if !File.exist?("/tmp/.rss.#{feedid}.#{digest}")
              FileUtils.touch("/tmp/.rss.#{feedid}.#{digest}")
              message "#{entry.title} (#{title}) #{entry.link}"
            end
            sleep(2)
          }
        rescue
          sleep(2)
        end
        sleep(sleepfor)
      end
    }
    sleep(1)
  end

  def did_start_up
    watch_feed("http://planet.wordpress.org/rss20.xml", "planet", 600)
    watch_feed("http://wordpress.com/feed/", "wpcom", 300)
  end

end

Autumn Leaves Leaf #1: Announcer

This bot is perfect for anything where you need to easily build IRC channel notifications into an existing process. It’s simple, clean, and agnostic: you connect to a TCP port and send one line; the port closes, and the line shows up in the channel. e.g.: echo 'hello' | nc -q 1 bothost 22122

require 'socket'
require 'thread'

class Announcer < AutumnLeaf

  def handle_incoming(sock)
    Thread.new {
      line = sock.gets
      message line
      sock.close
    }
  end

  def did_start_up
    Thread.new {
      listener = TCPServer.new('', 22122)
      while (new_socket = listener.accept)
        handle_incoming(new_socket)
      end
    }
  end

end

Anatomy of a mostly-dead network catastrophe

One of the things that you hear over and over again is that “the network is not reliable.”  You hear people say it, blog it, write it down in books, podcast it (I’m sure.) You hear it, you think to yourself “oh… that makes sense…” and you go on your merry way.  You’re developing your web app, and all is well.  You never think about that old saw again… YOUR network is reliable.

Of course it is… it’s all sitting in one cage.  You have your dedicated high-availability pair of managed gigabit switches.  And if the internet connection fails nothing bad happens to your application, it just doesn’t see requests for a while, right? Uh-oh! You’ve blindly wandered into this particularly insidious trap without even knowing it!

Later on your web site is flourishing, traffic is huge, investors are happy.  Memcaching objects took you to the next level (oh no! the trap has teeth!).  The stage is set!  You’ve purchased a second data center.  You have your memcached objects invalidating across the internet, you tested, you deployed, and you’ve just adjusted your DNS. Traffic starts flowing into TWO places at once. All is well, you pat yourself on the back.

Three months later… you’ve been up late… drinking… you’re exhausted and buzzed… it’s 4:00am… you just got to sleep… And your cell phone goes absolutely haywire. Your baby is dying.

Your httpd connections are all maxed out.  Your caches are out of sync.  Your load average just hit about 50.  In short the sky is falling.  After poking around you realize that you’re seeing 90% packet loss between your two sites.  The http connections are piling up because of the latency involved in the remote memcached invalidations.  Load goes up because the httpd servers are working their butts off and getting nowhere.

Finally it clears up… Go back to sleep, right? WRONG.  Now your data centers are showing different data for the same requests!!!  Replication seems to be going fine… AHH, memcached.  Those failed sets and deletes… Restart the cache. OH NO! Load alerts on the database servers… OH RIGHT… we implemented memcached because it helped out with the db load… makes sense… guess remote updates/deletes are good but not perfect… what now?

What do you mean what now? You sit and wait for your caches to repopulate from the db, and the httpd connections to stop piling up.  You count your losses, clear everything up, and think long and hard on how to avoid this in the future.

Later on… whose fault was it? It ends up not mattering. It’s always an “upstream provider”, or a “peering partner”, or a “DoS attack”, or some farmer and his backhoe.  The point is that it’s not preventable. It will happen again. Them’s the breaks.

So what do you do?  Well, that’s the question, isn’t it… I guess it depends on how much cash you have to throw at the problem, the facilities you use, and your application.  But believe me when I give this warning: “It’s a hell of a lot harder to think failure early on, but a hell of a lot easier to deal with.”

Between replication, data conflicts, message delivery, message ordering, playing, replaying, and all the other ideas behind the various kinds of fault tolerance there is only one immutable truth:  nothing is ever foolproof.  There is always a single point of failure somewhere if you just look broadly or narrowly enough.  Plan your catastrophes, and choose your battles. Be ready to pick up the pieces.

All that being said… how do *YOU* handle multiple datacenters, disparate networks, writes, synchronization, and caching? I’d love to hear people’s takes on the issue, as it’s an endlessly fascinating subject.

The iPhone… It’s not even out yet and everyone is drooling over it

And if they aren’t, they should be!  Ajax has long been the missing link between phones as a mobile computing platform and phones as a simple messaging device.  The fact is that there is a vastly larger pool of people willing to write useful web apps than useful Java apps.  I would also argue that it’s easier to write good web apps than Java apps of the same magnitude.  So with Apple’s announcement that the iPhone will support web 2.0 standards (read: AJAX), what was once a tasty-looking new toy has become something more. It’s become a tasty toy with a good enough reason for the cost.   I’d have to pay to break my contract with Sprint, start a contract with Cingular, buy the new iPhone, buy the wife a new phone (shared Sprint plan)…. I’m probably looking at $700-$1000 to make the switch.  And I’m already thinking that it’s worth it.  I’m going to hold off though… as long as I can stand it.  I want someone to review it, I want to see how the web explosion hits Cingular’s networks… I want to see how hard they are to find at first…  Mostly I just want the damn phone really bad… But I’m gonna try to be a good boy and hold off… Maybe

nasty regex

I’m putting this here for documentation purposes… because getting it right was a very frustrating ordeal. (I’d never had to match both positively and negatively in the same regex before.)

/^(?(?!.+\.php)^.*(\.jpg|\.jpeg|\.gif|\.ico|\.png)|^$)/s

What this is essentially saying is “true if the string doesn’t match ^.+\.php and the string matches ^.*(\.jpg|\.jpeg|\.gif|\.ico|\.png)”. The last bit, “|^$”, never returns true in my case, because we’re matching on URIs, which are always at least one character long (“/”).
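For what it’s worth, the same test can be written in plain Ruby as two separate regexes, which is easier to read (conditional groups like (?(?!…)yes|no) are a PCRE feature). The method name here is mine, not anything from the original setup:

```ruby
# Equivalent of the conditional regex: true when the URI does NOT match
# ^.+\.php AND does match one of the image extensions.
def image_not_php?(uri)
  (uri !~ /^.+\.php/ && uri =~ /^.*(\.jpg|\.jpeg|\.gif|\.ico|\.png)/) ? true : false
end
```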