I do a lot of things for Automattic, and many of the things I do are quite esoteric (for a php developer anyways.) Perl is not my language of choice, but I’ve never balked at a challenge…. just… did it have to be perl? Anyways. We have more than a thousand machines that we track with munin… which means a TON of graphs. munin-update is efficient, taking advantage of all cpus and getting done in the fastest time possible, but munin-graph started taking so long as to be useless (and munin-cgi-graph takes almost a minute to fully render the servers day/week summary page which is completely unacceptable when we’re trying to troubleshoot a sudden, urgent, problem.) So I got to dive in and make it faster…
Step 1: add in this function (which I borrowed from somewhere else)
sub afork (\@$&) {
my ($data, $max, $code) = @_;
my $c = 0;
foreach my $data (@$data) {
wait unless ++ $c <= $max;
die "Fork failed: $!\n" unless defined (my $pid = fork);
exit $code -> ($data) unless $pid;
}
1 until -1 == wait;
}
Step 2: replace this
for my $service (@$work_array) {
process_service ($service);
}
with this
afork(@$work_array, 16, \&process_service);
I also have munin-html and munin-graph running side-by-side
( [ -x /usr/local/munin/lib/munin-graph ] &&
nice /usr/local/munin/lib/munin-graph --cron $@ 2>&1 |
fgrep -v "*** attempt to put segment in horiz list twice" )& waitgraph=$!
( [ -x /usr/local/munin/lib/munin-html ] && nice /usr/local/munin/lib/munin-html $@; )& waithtml=$!
wait $waitgraph
wait $waithtml
I did several other, more complicated hacks as well, such as not generating month and year graphs via cron and letting those render on demand with munin-cgi-graph.
All said, we’re doing in under 2.5 minutes what previously took 7 or 8 minutes.
This is a simplistic use of the pattern that I wrote about in my last post to wait on multiple commands in bash. In essence I have a script which runs a command (like uptime or restarting a daemon) on a whole bunch of servers (think pssh). Anyways… this is how I modified the script to run the command on multiple hosts in parallel. This is a bit simplistic as it runs, say, 10 parallel ssh commands and then waits for all 10 to complete. I’m very confident that someone could easily adapt this to run at a constant concurrency level of $threads… but I didn’t need it just then so I didn’t go that far… As a side note, this is possibly the first time I’ve ever *needed* an array in a bash script… hah…
# $1 is the command to run on the remote hosts
# $2 is used for something not important for this script
# $3 is the (optional) number of concurrent connections to use
if [ ! "$3" == "" ]
then
threads=$3
else
threads=1
fi
cthreads=0;
stack=()
for s in $servers
do
if [ $cthreads -eq $threads ]; then
for job in ${stack[@]}; do
wait $job
done
stack=()
cthreads=0
fi
(
for i in $(ssh root@$s "$1" )
do
echo -e "$s:\t$i"
done
)& stack[$cthreads]=$!
let cthreads=$cthreads+1
done
for job in ${stack[@]}; do
wait $job
done
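For anyone who does want the constant concurrency level mentioned above, here is a minimal sketch. It assumes bash 4.3 or newer (for `wait -n`), and the names `run_limited` and `fake_remote` are hypothetical stand-ins, not part of my script. `fake_remote` takes the place of the ssh call so the example runs anywhere.

```shell
#!/bin/bash
# Sketch: keep at most $max jobs running at once.
# Requires bash 4.3+ for `wait -n`. run_limited is a hypothetical helper name.
run_limited() {
    local max=$1; shift
    local cmd=$1; shift
    local running=0 item
    for item in "$@"; do
        if [ "$running" -ge "$max" ]; then
            wait -n                     # block until any one background job exits
            running=$((running - 1))
        fi
        "$cmd" "$item" &                # start the next job in the background
        running=$((running + 1))
    done
    wait                                # wait for whatever is still running
}

# Stand-in for `ssh root@$s "$1"` so the example runs without any servers.
fake_remote() {
    sleep 0.1
    echo "$1: done"
}

run_limited 2 fake_remote alpha beta gamma delta
```

Unlike the batch version, a slow host only holds up its own slot here; the other slot keeps working through the list. Output order is not guaranteed.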
You know that you can run something in the background in a bash script with ( command )&, but a coworker recently wanted to run multiple commands, wait for all of them to complete, collect and decide what to do based on their return values… this proved much trickier. Luckily there is an answer
#!/bin/bash
(sleep 3; exit 1)& p1=$!
(sleep 2; exit 2)& p2=$!
(sleep 1; exit 3)& p3=$!
wait "$p1"; r1=$?
wait "$p2"; r2=$?
wait "$p3"; r3=$?
echo "$p1:$r1 $p2:$r2 $p3:$r3"
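The same idea scales to any number of jobs if you keep the PIDs in an array. This is just a sketch; the exit codes 1, 2 and 0 are made-up sample values, and the variable names are my own.

```shell
#!/bin/bash
# Sketch: start several background jobs, then collect each exit status by PID.
pids=()
for code in 1 2 0; do
    ( sleep 0.1; exit "$code" ) &
    pids+=("$!")                 # remember the PID of the job just started
done

failed=0
statuses=()
for pid in "${pids[@]}"; do
    rc=0
    wait "$pid" || rc=$?         # wait returns that job's exit status
    statuses+=("$rc")
    if [ "$rc" -ne 0 ]; then
        failed=$((failed + 1))
    fi
done

echo "statuses: ${statuses[*]} (failed: $failed)"
```

From here you can decide what to do based on `$failed`, or on the individual entries in `statuses`.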
I’ve been working, gradually, on a project using an sqlite3 database (for its convenience) and found myself missing the clean elegance of wpdb… so I implemented it. It was actually really easy to do, and I figured I would throw it up here for anyone else wishing to use it. The functionality that I built this around is obtainable here: http://php-sqlite3.sourceforge.net/pmwiki/pmwiki.php (don’t freak… it’s in apt…)
With this I can focus on the sql, which is different enough, and not fumble over function names and such… $db = new sqlite_wpdb($dbfile, 3); var_dump($db->get_results("SELECT * FROM `mytable` LIMIT 5"));
the code is below… and hopefully not too mangled…
<?php
class sqlite_wpdb {
var $version = null;
var $db = null;
var $result = null;
var $error = null;
function sqlite_wpdb($file, $version=3) { // PHP 4-style constructor; the name must match the class
return $this->__construct($file, $version);
}
function __construct($file, $version=3) {
$function = "sqlite{$version}_open";
if ( !function_exists($function) )
return false;
if ( !file_exists($file) )
return false;
if ( !$this->db = @$function($file) )
return false;
$this->version = $version;
$this->fquery = "sqlite{$this->version}_query";
$this->ferror = "sqlite{$this->version}_error";
$this->farray = "sqlite{$this->version}_fetch_array";
return $this;
}
function escape($string) {
return str_replace("'", "''", $string);
}
function query($query) {
if ( $this->result = call_user_func($this->fquery, $this->db, $query) )
return $this->result;
$this->error = call_user_func($this->ferror, $this->db);
return false;
}
function array_to_object($array) {
if ( ! is_array($array) )
return $array;
$object = new stdClass();
foreach ( $array as $idx => $val ) {
$object->$idx = $val;
}
return $object;
}
function get_results($query) {
if ( !$this->query($query) )
return false;
$rval = array();
while ( $row = $this->array_to_object(call_user_func($this->farray, $this->result)) ) {
$rval[] = $row;
}
return $rval;
}
function get_row($query) {
if ( ! $results = $this->get_results($query) )
return false;
return array_shift($results);
}
function get_var($query) {
return $this->get_val($query);
}
function get_val($query) {
if ( !$row = $this->get_row($query) )
return false;
$row = get_object_vars($row);
if ( !count($row) )
return false;
return array_shift($row);
}
function get_col($query) {
if ( !$results = $this->get_results($query) )
return false;
$column = array();
foreach ( $results as $row ) {
$row = get_object_vars($row);
if ( !count($row) )
continue;
$column[] = array_shift($row);
}
return $column;
}
}
?>
If you’re running any moderately busy mail server you’re probably using SpamAssassin’s spamc/spamd to check for spam, because it’s tons more efficient than piping the mail through the spamassassin cli. Assuming that you do, and that you plan on adding DKIM proxy to the mix to verify and sign emails, you need to put things in the right order. To save you some headache, here’s what I did:
- smtp|smtps => -o smtpd_proxy_filter=127.0.0.1:10035 # outgoing dkim verify port
- 127.0.0.1:10036 => -o content_filter=spamassassin
- spamassassin => pipe user=nobody argv=/usr/bin/spamc -f -e /usr/sbin/sendmail -oi -f ${sender} ${recipient} # this delivers to the “pickup” service
- pickup => -o content_filter=dksign:127.0.0.1:10037 # outgoing dkim signing port
- 127.0.0.1:10038 => -o content_filter= # the buck stops here
If you aren’t careful with these (which I wasn’t) you’ll end up causing an infinite loop between your filters (which I did). Thus concludes our public service announcement.
I’ve always wanted to write my own simple shell in php. Call me a glutton for punishment, but it seems like something that a lot of people could use… If your web app had a command line interface for various things… like looking up stats, or users, or suspending naughty accounts, or whatever…. wouldn’t that be cool and useful? Talk about geek porn. Anyways, this morning I got around to tinkering with the idea, and here is what I came up with… It’s rough, and empty, but it’s REALLY easy to extend and plug into any php application.
apokalyptik:~/phpshell$ ./shell.php
/home/apokalyptik/phpshell > hello
hi there
/home/apokalyptik/phpshell > hello world
hi there world
/home/apokalyptik/phpshell > cd ..
/home/apokalyptik/ > cd phpshell
/home/apokalyptik/phpshell > ls
shell.php
/home/apokalyptik//phpshell > exit
apokalyptik:~/phpshell$ ./shell.php
See the source here: shell.phps
This is something that has always annoyed me about bash scripts… The fact that it’s difficult to run
/path/to/script.sh --foo=bar -v -n 10 blah -one='last arg'
So I decided to write up a bash function that let me easily (once the function was complete) access this type of information. And because I like sharing, here it is:
#!/bin/bash
function getopt() {
var=""
wantarg=0
for (( i=1; i<=$#; i+=1 )); do
lastvar=$var
var=${!i}
if [ "$var" = "" ]; then
continue
fi
echo \ $var | grep -q -- '='
if [ $? -eq 0 ]; then
## -*param=value
var=$(echo \ $var | sed -r s/'^[ ]*-*'/''/)
myvar=${var%=*}
myval=${var#*=}
eval "${myvar}"="'$myval'"
else
echo \ $var | grep -E -q -- '^[ ]*-'
if [ $? -eq 0 ]; then
# -*param$
var=$(echo \ $var | sed -r s/'^[ ]*-*'/''/)
eval "${var}"=1
wantarg=1
else
echo \ $var | grep -E -- '^[ ]*-'
if [ $? -eq 0 ]; then
# the current one has a dash, so cannot be
# the argument to the last parameter
wantarg=0
fi
if [ $wantarg -eq 1 ]; then
# parameter argument
val=$var
var=$lastvar
eval "${var}"="'${val}'"
wantarg=0
else
# parameter
if [ "${!var}" = "" ]; then
eval "${var}"=1
fi
wantarg=0
fi
fi
fi
done
}
OIFS=$IFS; IFS=$(echo -e "\n"); getopt $@; IFS=$OIFS
now at this point (assuming the above command line parameter and script) I should have access to the following variables: $foo (“bar”) $v (1) $n (10) $blah (1) $one (“last arg”), like so:
OIFS=$IFS; IFS=$(echo -e "\n"); getopt $@; IFS=$OIFS
echo -e "
foo:\t$foo
v:\t$v
n:\t$n
blah:\t$blah
one:\t$one
"
You might be curious about this line:
OIFS=$IFS; IFS=$(echo -e "\n"); getopt $@; IFS=$OIFS
IFS is the variable that tells bash how strings are separated (and mastering its use will go a long way towards enhancing your bash scripting skills). Anyhow, by default IFS contains a space, a tab, and a newline, which normally is OK, but in our case we don’t want "last arg" to be two separate strings, but one. I cannot put the IFS assignment inside the function because by that point bash has already split the arguments; it needs to be done at a level of the script in which $@ has not been touched yet. So I store the current IFS value in $OIFS (Old IFS) and change IFS so that spaces no longer split words. (Strictly speaking, command substitution strips the trailing newline, so IFS ends up empty rather than holding a newline; an empty IFS turns word splitting off entirely, which has the effect we want here.) After running the function we reassign IFS to what it was beforehand. This is because I don’t know what you might be doing with your IFS. There are lots of reasons you might have already assigned it to something else, and I wouldn’t want to break your flow. So we do the polite thing.
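For comparison, here is a small sketch of the more common way to keep arguments intact: quoting "$@". The helper names `count_args` and `demo` are made up for illustration, and the example assumes the default IFS.

```shell
#!/bin/bash
# Sketch: quoting "$@" keeps each argument intact, whatever IFS is.
# count_args and demo are hypothetical helper names.
count_args() {
    echo "$#"
}

demo() {
    local unquoted quoted
    unquoted=$(count_args $@)      # unquoted: each argument is re-split on IFS
    quoted=$(count_args "$@")      # quoted: one word per original argument
    echo "unquoted=$unquoted quoted=$quoted"
}

demo --foo=bar "-one=last arg"
# With the default IFS this prints: unquoted=3 quoted=2
```

So calling the function as getopt "$@" would probably do the same job without touching IFS at all, though I have not reworked the function above around it.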
And in case the above gets munged for some reason you can see the plain text version here: bash-getopt/getopt.sh
Anyways, hope this helps someone out. If not it's still here for me when *I* need it
We use dirname() a lot in php to make relative paths work from multiple locations, and the advantages are many. It looks like so:
require dirname( dirname( __FILE__ ) ) . '/required-file.php';
$data = file_get_contents( dirname(__FILE__).'/data/info.dat');
But in bash we often don’t do the same thing; we settle for the old standby “../”. Which is a shame, because unless your directory structure is set up exactly right, and you have proper permissions, and you run the command from the right spot, it doesn’t work as planned. I think part of the reason is that it’s not obvious how to reliably get a full path to the script from inside itself. Another reason is that ../ is shorter to type and easier to remember. Finally, there are always one-time scripts for which this methodology is overkill. But if you’re planning to write a script which other people will (or might) be using, I think it’s good practice to do it right. Googling for things you’d think to search for on this subject yields either uninformative results or incomplete (incorrect) methods… so… here’s how to do the above php in bash:
source "$(dirname "$(dirname "$(readlink -f "$0")")")/required-file.sh"
data=$(cat "$(dirname "$(readlink -f "$0")")/data/info.dat")
Hope this helps someone
As a side note, the OSX readlink binary functions differently. You’ll want to use a package manager to install gnu coreutils, and either use greadlink, or link greadlink to a higher precedence location on your $PATH (I have /opt/local/bin:/opt/local/sbin: at the beginning of my $PATH)
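If installing coreutils is not an option, a partial fallback is to cd into the script’s directory and ask for the physical path. This is only a sketch, and `dir_of` is a name I made up. It resolves symlinked directories, but unlike readlink -f it does not follow a symlink that points at the script file itself.

```shell
#!/bin/bash
# Sketch: find the directory holding a file without `readlink -f`.
# dir_of is a hypothetical helper name.
# Limitation: symlinked directories are resolved (pwd -P), but a symlink
# pointing at the script file itself is NOT followed.
dir_of() {
    ( cd "$(dirname "$1")" && pwd -P )   # subshell, so the caller's cwd is untouched
}

script_dir=$(dir_of "$0")
echo "$script_dir"
```

From there, "$script_dir/data/info.dat" works the same way as the dirname(__FILE__) version.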
Using some stuff I’ve covered in the past on my blog, here’s a simple way to put up a daytime server (well, to put any service onto a tcp port). I haven’t looked into its bi-directional capabilities yet; this was just sort of a proof of concept…
$ apt-get install ipsvd
$ wget http://blog.apokalyptik.com/files/daemonize/daemonize.c
$ cc daemonize.c -o daemonize
$ ./daemonize /var/run/daytime.pid /var/log/daytime.log 'tcpsvd 0 13 date'
A start/stop and/or monit script is an extremely short jump from there… And kind of trivial/menial… so I leave that as an exercise to you… if you care
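For the impatient, a start/stop wrapper could look roughly like this. It is a sketch only: it assumes (I have not checked this here) that daemonize writes the daemon’s PID into the pid file, and the PIDFILE/LOGFILE variables and the is_running helper are names invented for the example.

```shell
#!/bin/bash
# Sketch of a start/stop wrapper for the daytime service.
# Assumption: daemonize writes the daemon's PID into the pid file.
PIDFILE=${PIDFILE:-/var/run/daytime.pid}
LOGFILE=${LOGFILE:-/var/log/daytime.log}

# True if the pid file exists and names a live process.
is_running() {
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

case "${1:-}" in
    start)
        is_running || ./daemonize "$PIDFILE" "$LOGFILE" 'tcpsvd 0 13 date'
        ;;
    stop)
        is_running && kill "$(cat "$PIDFILE")" && rm -f "$PIDFILE"
        ;;
    status)
        if is_running; then echo "running"; else echo "stopped"; fi
        ;;
    *)
        echo "usage: $0 {start|stop|status}" >&2
        ;;
esac
```

A monit check could then watch the same pid file and call this script with start or stop.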
Andy blogged a piece of advice that I gave him, which I got from Barry… and if you want to know how to get the true absolute path to the real location of the current script from inside of it (like php’s realpath and __FILE__), I suggest you check it out