Writing your own shell in PHP

I’ve always wanted to write my own simple shell in PHP. Call me a glutton for punishment, but it seems like something a lot of people could use. If your web app had a command line interface for various things (looking up stats or users, suspending naughty accounts, or whatever), wouldn’t that be cool and useful? Talk about geek porn. Anyway, this morning I got around to tinkering with the idea, and here is what I came up with. It’s rough and empty, but it’s REALLY easy to extend and plug into any PHP application.

apokalyptik:~/phpshell$ ./shell.php

/home/apokalyptik/phpshell > hello

hi there

/home/apokalyptik/phpshell > hello world

hi there world

/home/apokalyptik/phpshell > cd ..

/home/apokalyptik/ > cd phpshell

/home/apokalyptik/phpshell > ls

shell.php

/home/apokalyptik//phpshell > exit

apokalyptik:~/phpshell$ ./shell.php

See the source here: shell.phps

Internally Caching Longer Than Externally Caching

We use varnish for a lot of our file caching needs, and recently we figured out how to do something rather important through a combination of technologies. Imagine you have backend servers generating dynamic content based on user input. So your users do something that fits the following categories:

  • is expensive to generate dynamically, and should be served from cache
  • many requests come in for the same objects, bandwidth should be conserved
  • doesn’t change very often
  • once changed needs to take effect quickly

Now with varnish we’ve been using the Expires header for a long time with great success, but for this we were having no luck. If we set the Expires header to 3 weeks, then clients also cache the content for 3 weeks (violating requirement #4.) We can kill the Expires header in varnish at vcl_deliver, but then clients don’t cache at all (#2.) We can add Cache-Control, overwrite the Age (otherwise the reported Age: will be greater than max-age), and kill the Expires header in the same place, but this isn’t pretty, and seems like a cheap hack. Ideally we could rewrite the Expires header in varnish, but that doesn’t seem doable.

So what we ended up doing was header rewriting at the load balancer (nginx.) Inside our location block we added the following:

proxy_hide_header Age;
proxy_hide_header Expires;
proxy_hide_header Cache-Control;
add_header Source-Age $upstream_http_Age;
expires  300s;

Now nginx sets proper Cache-Control: and Expires: headers for us, disregarding what varnish serves out. Web clients don’t check back for 5 minutes (reusing the old object), and varnish can cache until judgment day because varnish gives us wildcard invalidation.
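To make the effect of those five directives concrete, here is a rough sketch in Ruby of what happens to the headers on the way through. The rewrite_headers helper and the sample header values are made up for illustration; nginx does this internally, and this is only a model of the outcome, not of how nginx implements it.

```ruby
require 'time'

# Hypothetical helper: models what the nginx directives above do to a
# response coming back from varnish. Not real nginx code.
def rewrite_headers(upstream, ttl: 300, now: Time.now)
  hidden = %w[Age Expires Cache-Control]            # proxy_hide_header ...
  out = upstream.reject { |name, _| hidden.include?(name) }
  out['Source-Age'] = upstream['Age'] if upstream.key?('Age')  # add_header Source-Age
  out['Expires'] = (now + ttl).httpdate              # expires 300s
  out['Cache-Control'] = "max-age=#{ttl}"            # expires 300s
  out
end

# Sample values only: varnish has held the object for a day, with a 3 week TTL.
from_varnish = {
  'Content-Type'  => 'image/png',
  'Age'           => '86400',
  'Expires'       => 'Thu, 01 Jan 2026 00:00:00 GMT',
  'Cache-Control' => 'max-age=1814400'
}

to_client = rewrite_headers(from_varnish, now: Time.utc(2025, 1, 1, 12, 0, 0))
to_client.each { |name, value| puts "#{name}: #{value}" }
```

The client only ever sees the 5 minute lifetime, while Source-Age still shows how long varnish has actually been holding the object.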

Isn’t technology fun?!

Logging post data

Let’s say you have a relatively complex PHP web application, like WordPress. You have it running under Apache (which is common.) You have good control of your site via .htaccess (which is common, permalinks and all.) And something happens to your blog (e.g. someone is exploiting some unknown vulnerability to compromise your content) which you want to track down. So you want to log, for instance, HTTP POST data. Your first instinct might be to add some logging code to index.php (mine was.) But there are a lot of possible places which might be accessed directly, especially in the admin. So the trick I use (and this is probably the only time I’ve ever condoned this) is PHP’s auto_prepend_file functionality.

Make a /home/user/postlog/ directory, then a /home/user/postlog/logs/ directory (and chmod a+rw that one.) Next make a simple /home/user/postlog/postlog.php file with the following contents:

<?php
// Log the body of every POST request, one file per client IP per hour.
if ( isset($_POST) && is_array($_POST) && count($_POST) > 0 ) {
  $log_dir = dirname( __FILE__ ) . '/logs/';
  $log_name = "posts-" . $_SERVER['REMOTE_ADDR'] . "-" . date("Y-m-d-H") . ".log";
  $log_entry = gmdate('r') . "\t" . $_SERVER['REQUEST_URI'] . "\n" . serialize($_POST) . "\n\n";
  $fp = fopen( $log_dir . $log_name, 'a' );
  if ( $fp ) { // skip quietly if the log file cannot be opened
    fputs($fp, $log_entry);
    fclose($fp);
  }
}
?>

Finally add this line to the top of your .htaccess file:

php_value auto_prepend_file /home/user/postlog/postlog.php

If all went well, this should start logging any request to any PHP file with any POST data into the /home/user/postlog/logs/ directory (with a unique log per IP per hour, since the file name includes the hour.)
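Once the logs pile up, the file names alone tell you who has been posting and when. Here is a small Ruby sketch for summarizing them. The parse_log_name and files_per_ip helpers are hypothetical (they are not part of the PHP above); they only assume the posts-IP-YYYY-MM-DD-HH.log naming that postlog.php produces.

```ruby
# Matches the names written by postlog.php: posts-<ip>-<Y-m-d>-<H>.log
LOG_NAME = /\Aposts-(?<ip>.+)-(?<day>\d{4}-\d{2}-\d{2})-(?<hour>\d{2})\.log\z/

# Hypothetical helper: split a log file name into IP, day and hour.
# Returns nil for names that do not follow the pattern.
def parse_log_name(name)
  m = LOG_NAME.match(File.basename(name))
  return nil unless m
  { ip: m[:ip], day: m[:day], hour: m[:hour].to_i }
end

# Hypothetical helper: count how many hourly log files each IP produced.
def files_per_ip(names)
  names.map { |n| parse_log_name(n) }
       .compact
       .group_by { |entry| entry[:ip] }
       .transform_values(&:size)
end

# Usage (path taken from the setup above):
#   p files_per_ip(Dir.glob('/home/user/postlog/logs/*.log'))
```

An IP with a log file for nearly every hour of the day is a good place to start looking.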

So You Wanna See an Image

We’ve been asked how we manage serving files from Amazon’s very cool S3 service at WordPress.com… This is how. (This covers a request for an image already stored on S3, not the upload -> S3 process.)

A request comes into pound for a file. Pound hashes the hostname (via a custom patch which we have not, but may, release) to determine which of several backend servers the request should hit. Pound forwards the request to that server. This, of course, means that a given blog always serves from the same backend server. The only exception to the aforementioned rule is if that server is, for some reason, unavailable, in which case pound temporarily picks another server to serve that hostname from.
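The pound patch itself isn’t shown here, so the following is only a guess at the general shape of the idea, sketched in Ruby: hash the hostname to pick a "home" backend, and walk to the next one if that backend is down. The CRC32 hash, the backend names, and the pick_backend helper are all assumptions for illustration, not what the patch actually does.

```ruby
require 'zlib'

# Hypothetical backend names, for illustration only.
BACKENDS = %w[files1 files2 files3].freeze

# Pick a backend for a hostname. The same hostname always maps to the same
# backend unless that backend is listed as down, in which case the next
# available one (in list order, wrapping around) is used instead.
# CRC32 is an assumption; the real patch may hash differently.
def pick_backend(hostname, backends = BACKENDS, down = [])
  start = Zlib.crc32(hostname) % backends.size
  backends.size.times do |offset|
    candidate = backends[(start + offset) % backends.size]
    return candidate unless down.include?(candidate)
  end
  nil # every backend is down
end

home = pick_backend('example.wordpress.com')
puts "normally served by #{home}"
puts "fails over to #{pick_backend('example.wordpress.com', BACKENDS, [home])}"
```

The payoff of sticking a hostname to one backend is that each varnishd only has to cache its own slice of blogs instead of everybody’s files.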

The request then comes into varnishd on the backend servers. The varnishd daemon checks its 300GB file cache and (for the sake of this example) finds nothing (hey, new images are uploaded all the time!) Varnishd then checks with the web server (running on the same machine, just bound to a different IP/port) and that request is handled by a custom script.

So, an HTTP daemon on the same backend server runs the file request. The custom script checks the DB to gather information on the file (specifically which DCs it is in, its size, its mod time, and whether it’s deleted or not); all this info is saved in memcached for 5 minutes. The script then increments and checks the “hawtness” (term courtesy of Barry) of the file in memcached. If the file has been accessed over a certain # of times it is deemed “hawt”, and a special header is sent with the response telling varnishd to put the file into its cache. When that happens, later requests are served directly by varnishd in the previous paragraph and never hit the httpd or this script again (or at least not until the cache entry expires.) At this point, assuming the file should exist (deleted = 0 in the files db), we fetch the file from a backend source.

Which backend source depends on where the file is available. The order of preference is as follows: always fetch from Amazon S3 if the file lives there (no matter what; the following preferences only ever occur if, for some reason, s3 = 0 in the files db), and if that fails, fetch from the one files server we still have (which has larger, slower disks, and is used for archiving and fault tolerance only.)
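The hawtness check and the source preference can be sketched roughly like this in Ruby. Everything here is illustrative: the HawtnessCounter class uses a plain Hash as a stand-in for memcached, the threshold is arbitrary, and fetch_order reflects one reading of the preference order above (S3 first when s3 = 1, with the files server as the fallback). The real script is not shown, so treat the names and details as assumptions.

```ruby
# Stand-in for the memcached counter (assumption: a plain in-memory Hash).
class HawtnessCounter
  def initialize(threshold)
    @threshold = threshold
    @hits = Hash.new(0)
  end

  # Record one access and report whether the file now counts as "hawt",
  # i.e. it has been accessed more than `threshold` times.
  def hit(file_key)
    @hits[file_key] += 1
    @hits[file_key] > @threshold
  end
end

# Hypothetical helper: which sources to try, in order, for a files-db row.
# `file` is a Hash with :deleted and :s3 flags mirroring the columns above.
def fetch_order(file)
  return [] if file[:deleted]                 # deleted = 1: nothing to fetch
  file[:s3] ? [:s3, :file_server] : [:file_server]
end

counter = HawtnessCounter.new(2)
3.times do |i|
  hawt = counter.hit('blog/2009/03/cat.jpg')
  puts "request #{i + 1}: tell varnishd to cache? #{hawt}"
end
p fetch_order(deleted: false, s3: true)
```

Only counting hits in memcached (rather than caching everything) keeps the one-hit-wonder images from pushing popular files out of varnishd’s cache.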

After fetching the file from the back end… the custom script hands the data and programmatically generated headers to the http daemon, which hands the data to varnishd, varnishd hands the data to pound, pound hands the data to the requesting client, and the image appears in the web browser.

And there was much rejoicing (yay.)

For the visual people among us who like visuals and stuff… (I like visuals…) here goes…

Autumn Leaves Leaf #3: Commander

This leaf is capable of running a script on the local server in response to the !deploy channel command. For security you have to authenticate first. To do so you send it a private message with a password. It then HTTP-authenticates against a specific URL with your nickname as the username and the message text as the password. If the file fetched matches predesignated contents then you are added to the internal ACL. Anyone in the ACL can run the !deploy command. If you leave the chan, join the chan, change nicks, or quit IRC you will be removed from the ACL and have to re-authenticate. This could be adapted to any system command for any purpose. I ended up not needing this leaf, but I still wanted to put it out there since it’s functional and useful.

require 'net/http'
require 'net/https'

class Commander < AutumnLeaf
  
  before_filter :authenticate, :only => [ :reload, :sync, :quit, :deploy ]
  $authenticated = []

  def authenticate_filter(sender, channel, command, msg, options)
    return true if $authenticated.include?(sender)
    return false
  end

  def did_receive_private_message(sender, msg)
    # assumes there is a file at 
    # http://my.svnserver.com/svn/access 
    # whose contents are "granted" 
    Net::HTTP.start('my.svnserver.com') {|http|
      req = Net::HTTP::Get.new('/svn/access')
      req.basic_auth(sender, msg)
      response = http.request(req)
      $authenticated << sender if response.body == "granted"
    }
  end

  def someone_did_quit(sender, msg)
    $authenticated.delete(sender)
  end

  def someone_did_leave_channel(sender, channel)
    $authenticated.delete(sender)
  end

  def someone_did_join_channel(sender, channel)
    $authenticated.delete(sender)
  end

  def deploy_command(sender, channel, text)
    message "deploying..."
    system("sudo /usr/local/bin/deploy.sh 1>/dev/null 2>/dev/null")
  end

end

Autumn Leaves Leaf #1: Announcer

This bot is perfect for anything where you need to easily build IRC channel notifications into an existing process. It’s simple, clean, and agnostic. Quite simply, you connect to a TCP port and give it one line; the connection closes, and the line shows up in the channel. E.g.: echo 'hello' | nc -q 1 bothost 22122

require 'socket'
require 'thread'

class Announcer < AutumnLeaf

        def handle_incoming(sock)
                Thread.new {
                        line = sock.gets
                        message line
                        sock.close
                }
        end

        def did_start_up
                Thread.new {
                        listener = TCPServer.new('',22122)
                        while (new_socket = listener.accept)
                                handle_incoming(new_socket)
                        end
                }
        end

end
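
If the process doing the announcing is itself written in Ruby, you don’t need nc at all. Here is a minimal client sketch; the announce helper is my own name for it, and bothost and 22122 are just the host and port from the example above.

```ruby
require 'socket'

# Hypothetical client helper: send a single line to the Announcer leaf.
# Only the first line of `text` is sent, because the bot reads one line
# per connection and then closes the socket.
def announce(host, port, text)
  first_line = text.to_s.lines.first.to_s.chomp
  TCPSocket.open(host, port) do |sock|
    sock.puts(first_line)
  end
end

# Usage (assumes the bot is listening on bothost:22122):
#   announce('bothost', 22122, 'deploy finished')
```

Connection errors (for example the bot being down) are raised to the caller here, so wrap the call in a rescue if a failed announcement should not stop your process.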