Category Archives: PHP

Longest Common Prefix Between Two Strings

While working on a “for fun” side project I needed to get the longest common prefix of two arbitrary strings. Since I didn’t find what I was looking for in the PHP string functions I made my own. And for some perverse reason i decided to see what the fewest number of lines I could do it in was without displaying warnings…

I got down to a one line function, and managed to avoid using an @ to silence anything…


function str_common_prefix( $s1, $s2, $i=0 ) {
return ( !empty($s1{$i}) && !empty($s2{$i}) && $s1{$i} == $s2{$i} ) ? str_common_prefix( $s1, $s2, ++$i ) : $i;
}

I’d be interested to see what others come up with.

Using PHP and OpenSSH with username/password auth

It turns out that this is actually a tricky problem. It’s super easy to use the OpenSSH command line stuff via PHP when you have key based authentication set up, but it’s not at all easy to use when you want to go the user/pass route. This is for a couple of reasons:

First you cannot specify the password on the command line. Second you cannot use the php process controls directly to give the password (well this isn’t 100% true, if you want to recompile your PHP binary with pty support then you probably could bypass everything I’m about to say and just use proc_open straight). And there’s a third reason that I’ll get to in a bit.

OpenSSH supports getting a password from an executable program via the SSH_ASKPASS environment variable — with two notable gotchas. First this only works if you also specify a DISPLAY environment variable, and second it does NOT work if the controlling process has a tty or pty.

The code below works… but if you just paste it into a script file and run it directly with php ./myscript.php it fails. Why?

The answer is that you are running it, and you’re running it from a shell which has a tty or pty attached, and the script inherits those and OpenSSH sees this and then ignores the SSH_ASKPASS variable completely. The trick to testing/using this code is to execute it without running it from a terminal. If you execute the command below (roughly) it looks like it hangs. but if you open another terminal you should be able to use the port forward just created.

ssh myserver.com “php /path/to/myscript.php”

This is because since you’re sshing for the sole purpose of running that command and not using a shell the system doesn’t allocate you a pty like it would normally. I’m guessing that executing it via an http request would do something similar.

Since I previously posted some hard-won advice for working with ssh2_* in php I thought I would share this equally tricky bit that I’d also figured out in the process.

Remember that there is ALWAYS more than one way to skin a problem, and every problem can be skinned with enough effort.

Again, this is itch-scratch-ware, YMMV, this is meant as a starting point on a journey to a solution and not a drop-in-works-everywhere bit of code. It’s just the hard stuff. The useful stuff is still up to you ;)

PHP SSH2 code

I’ve had a need to use the PHP SSH2 PECL recently (working on making a product, at work, more efficient) And thought I would share some of the preliminary code. You can find it here: vpssh.phps

The most interesting thing is not vpssh_core or it’s exec (though it’s good code) the really interesting thing is the vpssh_tunnel class and the accompanying examples at the top of the file. This really shows some advanced usage of ssh2_tunnel that you can’t really find anywhere else.

It’s just the beginnings of some useful code, but it’s probably a huge jumping off point for anyone seriously looking into the ssh2 pecl functionality. Oh, it also works with both password and key based authentication.

This code is less than 12 hours old, and it works for me so far, YMMV. Feedback welcome … or not… Whatever. Hope it helps someone out. And I hope it helps me out later when I need this kind of thing again.

php debugging the really really hard way

If you’re ever in a situation where something is only happening intermittently, and only on a live server, and only while it’s under load… Lets say its not generating any error_log or stderr output, and you cant run it manually to reproduce… (we’ve all been in this situation) How do you get any debugging output at all?

Step 1: add this to the top of your entry point php file

Step 2: use curl on the localhost to make the request

Step 3: (this assumes your error log is /tmp/php-error-output) run the following command in a second (root) terminal window

Good luck…

Quality Time With Your JPEGs

When working with user provided images in PHP you run into a problem. Lets say that you want to generate thumbnails of uploaded JPEGs for users. This is a fairly common use case where you would employ PHP and GD (the most prevalent php image extension.) But when you generate the new, smaller, image what quality setting do you use? If your quality setting is too low then the image is distorted unacceptably. Likewise if your quality setting is too high then you produce a dimensionally smaller image with a larger size in bytes than the original. So what do you do when you want to satisfy all cases? Well the obvious answer is that you should use the same JPEG quality setting that the image had when it was uploading. Now… Using PHP and GD tell me how you accomplish this.

Go ahead

I’ll wait

You can’t, can you? If you’re really sneaky you might be thinking about just pulling the data out of the binary stream, and if you’re a linux nut you’re probably sitting there muttering “just use identify (a la imagemagick)”. Of course if you try that under heavy traffic you’ll soon discover that it kills your servers. Just for reference I’ll share that code with you (not everyone is serving 2g/sec in dynamic image traffic after all). We already have the binary data in ram in the $rval variable, in case you were curious.

So, since this consumes too much of our available resources (especially ram and cpu usage since identify fully decompresses and reads the image into a raw state for processing… hundreds of MB of ram, which you can limit but then it becomes unbearably slow…) that’s out. What now? If you’re really extra sneaky you’re thinking that you should be able to read the setting out of the binary data… there should be a header after all, right? Well… Yes there is a header but “quality” is not a “setting” its more a measure of how compressed the image is… which isn’t exactly recorded either… at least… not as an integer value. The compression matrix used to preform JPEGs lossy compression *IS* stored in a header.. and it turns out this is what the ImageMagick code uses to give us that quality setting. So I set out to reproduce this in PHP.

I know… I’m a masochist.

Thanks to some serious google-fu (and possibly a note in a php online doc manual relating to IPC, I don’t remember which of the two led me to the package first) I found that there’s already some code out there which deals with the nasty bits of reading raw JPEG headers (though not doing what I want with the header that I want) in the PHP JPEG Metadata Toolkit And the instructions for evaluating the header we can then pull is in the ImageMagick source code (coders/jpeg.c)

When we put the two together and modify it a bit to suite our needs (i.e. reading from the in-memory buffer, pulling just the right header from the jpeg file, etc) we get this code… and finally the ability to call $quality = get_jpeg_quality( $rval ); In my test (yea just one or two… very scientific like), this over 100 times faster than using the proc_open and executable method, uses less ram (a lot lot lot lot lot less ram) and generally just doesn’t suck as much.

Right… serious suckage. Which is why I’m sharing it here so that you don’t have to go through all that trouble. You can just steal my stolen code. Aren’t GPL compatible licenses fun?

Ok, finally, you… in the back… stop jumping up and down screaming about the Imagick PECL extension… In my testing getCompressionQuality() didnt work and getImageCompression() was unavailable (though I hear there is a newer version of the extension now… YMMV)

php open locking daemon

Don’t you hate that… When it’s 2:00am… and you really should be in bed… But your mind has hold of a problem, and wont let it go. I have a project where it would be really handy for a process to be able to lock (arbitrary string identifier) and for another process to be able to check whether (arbitrary string identifier) is still locked. And the processes that do the locking can die… so the lock really needs to expire when they do. I could use MySQLs get_lock but I’m already abusing the hell out of that for more distributed things (and since you cannot have more than one mysql named lock at a time per connection, i don’t think it would work here…) in the originating processes, and these locks are machine wide, not network wide…

I don’t like flock because you have to actually create a file to try and lock it leaving race conditions and the possibility of orphaned files on the file system which just sucks… I thought about Memcached but I really need something that can be held open for long periods of inactivity and released if the client dies which precludes the infinite and the timed method of memcached value storage…

After some searching I found old — Open Lock Daemon which looked like a super good fit… Until I dug into the communication protocol… What a nightmare for wanting to lock a string… srsly. So not being able to find anything (and apparently not being able to sleep until I had a satisfactory answer) I decided to write one. In PHP, naturally. Weighing in at 180 lines I think it’s a pretty acceptable/workable first pass.

[ edit: code available here ]

Continue reading

Bigger Aint Always Better

Recently I was troubleshooting some inefficiencies with the jobs systems locking and fetching queries at work. Like a good little boy I, originally, came up with one index which satisfied all the queries that I needed to run against this particular critical table.

the query looked like this:

This worked really well as long as the number of jobs in the table remained small.. But there was an event horizon of sorts… where after a certain number of rows (say 10k-25k) the query above took so long to get rows that the system could no longer catch up, but would fall invariably behind… Apparently, the problem was that there was too much data in the key. Because the requirements for these queries had changed slightly (due to other scaling improvements) I was able to simplify and break the indexes into parts which vastly outperform the previous index.

This works because workerpid is enough to tell us whether a job definition is in process (or was, and has not been cleaned yet.) and makes the following query run against a super small index…

If you’re interested in what dark monstrosity could possibly require queries like these (and are a brave PHP developer) I invite you to check out the jobs system

Just what you need to know to write a CouchDB reduce function

Lets say you have the CouchDB classes (located here) all compiled together and included into your test.php script. Lets also say that you have created a database with the built-in web ui called “testing”. Finally let us say that your test.php has the following code in it, which would add a record to the db every time it is run. (i know that the data in the document serves no useful purpose… but really I just want to figure out this map/reduce thing so that I can make awesome views… so this suffices sufficiently.)

After running the code a bunch of times you would end up with a bunch of documents which look more or less like this:

picture-1(click for full size)

Now lets say you want to write a view that told you what the first characters of the _id were and how many documents share that first letter. This is analogous to the following in MySQL

Your map function is easy, because you dont have any selection criteria, so we process all rows

The reduce function is where the actual programming comes in… And it seems there aren’t many well explained examples of exactly how to do this (I just brute forced it by trial and error)

and in code, using a temporary view, ( if you used this view all the time you would want to make it permanent… but this is about how to lay out a reduce function, nothing more ) so request code that looks like this

would give you output that looks like this:

I hope this helps somebody out.

random php… a multi-channel chat rooom class using memcached for persistence

why? i dunno… just because… just a toy…

no sql, no flat file, no write permissions required anywhere, no fuss