Idealism is a good thing, don't kill it. At the same time, over-indulgence makes petty tyrants of the noblest people. So too with intentions.

Some learn by making their own mistakes, and the experience gained can outweigh the cost. It pains us to watch, but that is our hard lesson.

Give developers time to think. They are your warriors. Thought is their blade. You want them sharp and practiced, both warrior and weapon.

For a good developer, a specific goal makes paradise of a vacuum. Without such a goal, the best we can do is randomly thrash about with code.

Sometimes we developers need to be told to do the simpler thing. Not because we are too dumb, but because we are too smart for our own good.

Today’s lesson: Check your assumptions

It’s easy to make assumptions about what we think is the problem with the speed of our code. Let’s say you have a loop like so:

foreach( $big_list_of_things as $thing ) {
	$res1 = somefunc1( $thing );
	$res2 = somefunc2( $thing );
	$res3 = somefunc3( $thing );
	$res4 = somefunc4( $thing );
}

Since it’s your code, you’re pretty sure that almost all of the time is being spent inside somefunc2: everything seems pretty fast except somefunc2, which is noticeably slow. Maybe you did something like this to get a kind of anecdotal feel for how long things are taking (I do this kind of thing a lot…)

foreach( $big_list_of_things as $thing ) {
	$res1 = somefunc1( $thing ); echo "1";
	$res2 = somefunc2( $thing ); echo "2";
	$res3 = somefunc3( $thing ); echo "3";
	$res4 = somefunc4( $thing ); echo "4 ";
}

So you even see some slowness on func2. Solved! Well… maybe… Just because func2 is obviously a problem doesn’t mean that there aren’t other problems going on in your code that you weren’t expecting. And it’s not a huge amount of work to make sure, so we might as well. Here, I’ll show you how.

$big_list_of_things = range( 1, 100 );

// Stand-ins for real work: each one sleeps a random number of microseconds.
function somefunc1( $thing ) { usleep( mt_rand(0, 100) ); }
function somefunc2( $thing ) { usleep( mt_rand(0, 30000) ); }
function somefunc3( $thing ) { usleep( mt_rand(0, 14000) ); }
function somefunc4( $thing ) { usleep( mt_rand(0, 150) ); }

// Accumulate the wall-clock seconds spent in each function over the whole loop.
$t1 = $t2 = $t3 = $t4 = 0;
foreach( $big_list_of_things as $thing ) {
        $s = microtime( true );
        $res1 = somefunc1( $thing );
        $t1 += ( microtime( true ) - $s );
        $s = microtime( true );
        $res2 = somefunc2( $thing );
        $t2 += ( microtime( true ) - $s );
        $s = microtime( true );
        $res3 = somefunc3( $thing );
        $t3 += ( microtime( true ) - $s );
        $s = microtime( true );
        $res4 = somefunc4( $thing );
        $t4 += ( microtime( true ) - $s );
}
echo "1: $t1, 2: $t2, 3: $t3, 4: $t4\n";

This gives me the following output… and oh look… 2 is a problem, but maybe 3 needs to be looked into as well. Would I have known how much time 3 was taking? Probably not, since it “seemed” so fast…

php ./test.php
1: 0.0077567100524902, 2: 1.4149680137634, 3: 0.70513272285461, 4: 0.013875484466553
php ./test.php
1: 0.0079605579376221, 2: 1.5631670951843, 3: 0.62554883956909, 4: 0.013876676559448
php ./test.php
1: 0.0080087184906006, 2: 1.4883117675781, 3: 0.66886830329895, 4: 0.013206481933594
php ./test.php
1: 0.0074846744537354, 2: 1.5641448497772, 3: 0.64763903617859, 4: 0.012674331665039

Just last night, by exactly this method, I found and fixed a bug that must have been there for ages and that I had no idea about. And in case you were wondering whether or not it’s worth it… Trust me… Sometimes it is…
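If you find yourself pasting those microtime lines around a lot, the same idea can be wrapped up in a little helper. This is only a sketch of one way to do it: the timed() function, the $totals array and the 'fast' / 'slow' labels are all made up for this example, and it assumes PHP 7 or newer.

```php
<?php
// Hypothetical helper (not from the original test script): run $fn with
// $args and add the wall-clock seconds it took to $totals[$label].
function timed( array &$totals, string $label, callable $fn, ...$args ) {
	$start  = microtime( true );
	$result = $fn( ...$args );
	$totals[$label] = ( $totals[$label] ?? 0.0 ) + ( microtime( true ) - $start );
	return $result;
}

$totals = array();
foreach ( range( 1, 20 ) as $thing ) {
	// Stand-ins for real work, sleeping 100 and 2000 microseconds.
	timed( $totals, 'fast', function ( $x ) { usleep( 100 ); }, $thing );
	timed( $totals, 'slow', function ( $x ) { usleep( 2000 ); }, $thing );
}

// Biggest time sink first.
arsort( $totals );
foreach ( $totals as $label => $seconds ) {
	printf( "%s: %.4f\n", $label, $seconds );
}
```

The win is that every call site gets measured the same way, so the thing you weren’t suspicious of shows up in the list right next to the thing you were.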

elockd [more] publicly available

I’d call this a 0.5 release. It’s now in the code repo that we at Automattic use to put out little open source tools.

I’ve fixed several bugs with the code since the first time that I posted about it. It can handle at least 2k active connections and at least 100k shared and exclusive locks (split 50/50), and it can handle every single connection orphaning its locks at the same time, for both shared and exclusive locks.

It’s a pretty good bit of code… not bad for under a week in my spare time.

I suppose that I should explain what it is and does. And why I care.

The idea behind lockd was to build a network-oriented daemon which would handle arbitrary locking of… things. The idea was to mix the ease of Memcached with the atomic chewy goodness of MySQL’s named locks. It’s really handy to have something like this that can be depended upon when you have a very large distributed environment, to keep things from stepping on each other’s toes.

Only one client can have an exclusive lock, and only the lock owner can release it. Any client can inspect that exclusive lock to see if it’s held or not. If the owner disconnects from the server then all of the owned locks are released.

Any number of clients can acquire a shared lock, and the result of the locking action includes the number of owners now sharing that lock. The response for the first lock request would be 1 lock acquired, while the second lock request for the same lock would be 2 lock acquired (i.e. two people have this lock). Likewise, releasing the lock decrements the count, and inspecting the lock shows the number of open handles to that lock. All of an owner’s shared locks are also released on disconnect.
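To make those rules concrete, here is a toy in-memory model of them in PHP. It is not elockd’s code (that’s Erlang). The LockTable class and its method names are invented for this sketch. It also makes two assumptions: an exclusive get on an already-held key simply fails, and a client counts once per shared lock.

```php
<?php
// Toy model of the locking rules described above. NOT elockd itself;
// the class and method names are hypothetical.
class LockTable {
	private $exclusive = array(); // key => owning client id
	private $shared    = array(); // key => array( client id => true )

	// Exclusive get: assumed to fail if anybody already holds the key.
	public function get( $client, $key ) {
		if ( isset( $this->exclusive[$key] ) ) {
			return false;
		}
		$this->exclusive[$key] = $client;
		return true;
	}

	// Only the owner can release an exclusive lock.
	public function release( $client, $key ) {
		if ( ( $this->exclusive[$key] ?? null ) !== $client ) {
			return false;
		}
		unset( $this->exclusive[$key] );
		return true;
	}

	// Any client can check whether an exclusive lock is held.
	public function inspect( $key ) {
		return isset( $this->exclusive[$key] );
	}

	// Shared get always succeeds and reports the current holder count.
	public function sharedGet( $client, $key ) {
		$this->shared[$key][$client] = true;
		return count( $this->shared[$key] );
	}

	// Shared release drops this client and reports who is left.
	public function sharedRelease( $client, $key ) {
		unset( $this->shared[$key][$client] );
		return $this->sharedInspect( $key );
	}

	public function sharedInspect( $key ) {
		return count( $this->shared[$key] ?? array() );
	}

	// A disconnect orphans every lock the client held, of both kinds.
	public function disconnect( $client ) {
		foreach ( $this->exclusive as $key => $owner ) {
			if ( $owner === $client ) {
				unset( $this->exclusive[$key] );
			}
		}
		foreach ( array_keys( $this->shared ) as $key ) {
			unset( $this->shared[$key][$client] );
		}
	}
}

$locks = new LockTable();
var_dump( $locks->get( 'a', 'job' ) );      // bool(true)
var_dump( $locks->get( 'b', 'job' ) );      // bool(false)
echo $locks->sharedGet( 'a', 'api' ), "\n"; // 1
echo $locks->sharedGet( 'b', 'api' ), "\n"; // 2
$locks->disconnect( 'a' );
var_dump( $locks->inspect( 'job' ) );       // bool(false)
echo $locks->sharedInspect( 'api' ), "\n";  // 1
```

The real daemon ties all of this to network connections, so “disconnect” there means the socket going away rather than a method call.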

Oh, did I mention that it also keeps stats for you to use in your munin graphs? Yea. That too.

So… some obvious questions I’m sure you’re wondering:

1. Why not just use Memcached? Well, Memcached has no knowledge of the state of your connection. I want a lock that goes away if the parent process disconnects for any reason. You could do this with timed keys in Memcached, but you run two risks: the first is that you might not get around to updating the lock and resetting its TTL before it expires, leaving another process able to lock erroneously; the second is that, given enough data flowing through the cache, your lock might simply be pushed off the back end of the cache by newer data, something that I don’t want. Also, shared locks would be difficult here.

2. Why not just use MySQL named locks? You can only hold one named lock per connection, and there is no concept of shared named locks.

3. Why not use filesystem locks? Those tend to be difficult for people to code properly, depend on various standards implementations, can’t do counted shared locks, and, most importantly, aren’t network enabled.

4. What’s the big deal about shared locks? They’re super powerful: great for rate limiting, etc.

5. Wasn’t there already something out there for this? I’m not going to say “no,” but I will say “not that I saw when I looked”.

6. Why did you rewrite it in Erlang? Was the PHP version bad? Yes, sort of. The PHP version played some interesting tricks to achieve a thread-like operational state, but I believe that those tricks cause timing issues that make it crash in as-yet-unknown circumstances at high load. The PHP version is also slow when there are a very high number of clients or locks. Erlang was, essentially, born for this particular purpose: it sports great concurrency models and immutable variables, and the way that the gen_* things work out, I get atomicity built in even with all these concurrent clients grabbing at stuff.

7. What’s the API look like? It looks nice and clean…

“g $key\n” - get exclusive $key
“r $key\n” - release exclusive $key
“i $key\n” - inspect exclusive $key
“sg $key\n” - get shared $key
“sr $key\n” - release shared $key
“si $key\n” - inspect shared $key
“q\n” - show stats
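For the curious, a PHP client for the commands listed above could start out something like this. Treat it as a sketch under assumptions: the elockd_command() and elockd_send() names are invented, the rule that keys contain no whitespace is my own caution, and the one-line reply format is a guess since the responses aren’t spelled out here. The host and port in the comment are placeholders too.

```php
<?php
// Build one request line for the protocol listed above.
// ASSUMPTION: keys are non-empty and contain no whitespace.
function elockd_command( string $verb, string $key = '' ): string {
	$verbs = array( 'g', 'r', 'i', 'sg', 'sr', 'si', 'q' );
	if ( ! in_array( $verb, $verbs, true ) ) {
		throw new InvalidArgumentException( "unknown verb: $verb" );
	}
	if ( $verb === 'q' ) {
		return "q\n"; // stats takes no key
	}
	if ( $key === '' || preg_match( '/\s/', $key ) ) {
		throw new InvalidArgumentException( 'bad key' );
	}
	return "$verb $key\n";
}

// Send one command over an already-open stream and return the raw reply.
// ASSUMPTION: the server answers with a single newline-terminated line.
function elockd_send( $stream, string $verb, string $key = '' ): string {
	fwrite( $stream, elockd_command( $verb, $key ) );
	$reply = fgets( $stream );
	return $reply === false ? '' : rtrim( $reply, "\r\n" );
}

// With a running server it would be used roughly like this
// ($host and $port are placeholders, not real defaults):
//   $conn = fsockopen( $host, $port );
//   echo elockd_send( $conn, 'sg', 'api:calls' ), "\n";

// Without a server we can at least look at what goes over the wire.
$examples = array( array( 'g', 'job:42' ), array( 'sg', 'api:calls' ), array( 'q', '' ) );
foreach ( $examples as $pair ) {
	echo json_encode( elockd_command( $pair[0], $pair[1] ) ), "\n";
}
```

Because the locks die with the connection, a real client would want to keep that one stream open for as long as it needs the lock.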

video of a simple stress test of elockd

In case anyone is curious… elockd-in-action

elockd

I’ve been working on an Erlang port of my PHP locking daemon (Erlang being a more appropriate language for this kind of thing), and I have it all tricked out (OK, partially tricked out, but hey, it’s my first real Erlang project and I’ve only spent two afternoons on it).

The API is completely the same between the two (read: simple), and it works great (in my tests). It supports both exclusive and shared locks, orphaning on disconnect works great for both, stats are working, and it’s all running under a supervisor to restart anything that stops. I *think* I’ve done the code well enough that hot code swapping should work as expected. I know there’s a lot of “how an Erlang application is packaged” stuff that I don’t know yet.

If I had to describe in a one-liner what this does, I would say that lockd is MySQL named locks meets Memcached.

I’m kind of annoyed, however, that “erl -s esuper” doesn’t run the stupid thing; I have to actually run esuper:start(). to get it going. I’ll have to figure that out. You would think that running some precompiled beams/modules/functions/args would be super easy from the outside, a-la-init-script, and it probably is, but I’m missing something.

Comments on the code are welcome. It’s a pretty cool little thing, my [lack of] mad Erlang skills aside.

When I’m ready I’ll be testing it in production, and putting it in a public repo.

CodeWord: Apokalyptik

The random things that spew forth from my brain…

Author: apokalyptik