Quality Time With Your JPEGs

When working with user provided images in PHP you run into a problem. Lets say that you want to generate thumbnails of uploaded JPEGs for users. This is a fairly common use case where you would employ PHP and GD (the most prevalent php image extension.) But when you generate the new, smaller, image what quality setting do you use? If your quality setting is too low then the image is distorted unacceptably. Likewise if your quality setting is too high then you produce a dimensionally smaller image with a larger size in bytes than the original. So what do you do when you want to satisfy all cases? Well the obvious answer is that you should use the same JPEG quality setting that the image had when it was uploading. Now… Using PHP and GD tell me how you accomplish this.

Go ahead

I’ll wait

You can’t, can you? If you’re really sneaky you might be thinking about just pulling the data out of the binary stream, and if you’re a linux nut you’re probably sitting there muttering “just use identify (a la imagemagick)”. Of course if you try that under heavy traffic you’ll soon discover that it kills your servers. Just for reference I’ll share that code with you (not everyone is serving 2g/sec in dynamic image traffic after all). We already have the binary data in ram in the $rval variable, in case you were curious.

So, since this consumes too much of our available resources (especially ram and cpu usage since identify fully decompresses and reads the image into a raw state for processing… hundreds of MB of ram, which you can limit but then it becomes unbearably slow…) that’s out. What now? If you’re really extra sneaky you’re thinking that you should be able to read the setting out of the binary data… there should be a header after all, right? Well… Yes there is a header but “quality” is not a “setting” its more a measure of how compressed the image is… which isn’t exactly recorded either… at least… not as an integer value. The compression matrix used to preform JPEGs lossy compression *IS* stored in a header.. and it turns out this is what the ImageMagick code uses to give us that quality setting. So I set out to reproduce this in PHP.

I know… I’m a masochist.

Thanks to some serious google-fu (and possibly a note in a php online doc manual relating to IPC, I don’t remember which of the two led me to the package first) I found that there’s already some code out there which deals with the nasty bits of reading raw JPEG headers (though not doing what I want with the header that I want) in the PHP JPEG Metadata Toolkit And the instructions for evaluating the header we can then pull is in the ImageMagick source code (coders/jpeg.c)

When we put the two together and modify it a bit to suite our needs (i.e. reading from the in-memory buffer, pulling just the right header from the jpeg file, etc) we get this code… and finally the ability to call $quality = get_jpeg_quality( $rval ); In my test (yea just one or two… very scientific like), this over 100 times faster than using the proc_open and executable method, uses less ram (a lot lot lot lot lot less ram) and generally just doesn’t suck as much.

Right… serious suckage. Which is why I’m sharing it here so that you don’t have to go through all that trouble. You can just steal my stolen code. Aren’t GPL compatible licenses fun?

Ok, finally, you… in the back… stop jumping up and down screaming about the Imagick PECL extension… In my testing getCompressionQuality() didnt work and getImageCompression() was unavailable (though I hear there is a newer version of the extension now… YMMV)

6 thoughts on “Quality Time With Your JPEGs

  1. Pingback: Stephane Daury (stephdau) 's status on Thursday, 17-Sep-09 12:33:49 UTC - Identi.ca

  2. Pingback: Weekly Digest for September 19th | The WebZappr

  3. Lusche

    // error_log( "$table >> $qvalue && $sum >> " . ( $i + 1 ) );

    return $i + 1;

    Plus one is wrong. Tested a couple of images and programs with their quality settings and even ImageMagick IDENTIFY always gives back a value one less than yours. Additionally your arrays don't need 101 elements. ;-)

    Great stuff, tho. Converted it to Pascal.

    Reply

Leave a Reply