Bigger Aint Always Better

Recently I was troubleshooting some inefficiencies with the jobs systems locking and fetching queries at work. Like a good little boy I, originally, came up with one index which satisfied all the queries that I needed to run against this particular critical table.

 (`completed`,`heartbeat`,`priority`,`datacenter`,`worker`) 
-- datetime,datetime,tinyint,varchar(16),varchar(255)

the query looked like this:

SELECT `id`,GET_LOCK(CONCAT('foo_',`id`),0) as mylock FROM `foo_jobs` 
WHERE `completed` = '0000-00-00 00:00:00' AND `worker` = '' AND `dirty` = 0 
AND IS_FREE_LOCK(CONCAT('foo_',`id`)) = 1 AND `priority` = '0' LIMIT 1

This worked really well as long as the number of jobs in the table remained small.. But there was an event horizon of sorts… where after a certain number of rows (say 10k-25k) the query above took so long to get rows that the system could no longer catch up, but would fall invariably behind… Apparently, the problem was that there was too much data in the key. Because the requirements for these queries had changed slightly (due to other scaling improvements) I was able to simplify and break the indexes into parts which vastly outperform the previous index.

KEY `get_a_job` (`priority`,`workerpid`)
-- tinyint, int
KEY `janitor` (`completed`,`heartbeat`,`worker`,`dirty`)
-- datetime,datetime,varchar(255),tinyint

This works because workerpid is enough to tell us whether a job definition is in process (or was, and has not been cleaned yet.) and makes the following query run against a super small index…

SELECT `id`,GET_LOCK(CONCAT('foo_',`id`),0) as mylock FROM `foo_jobs`
WHERE `priority` = '0' AND `workerpid` IS NULL 
AND IS_FREE_LOCK(CONCAT('foo_,`id`)) = 1 LIMIT 1       

If you’re interested in what dark monstrosity could possibly require queries like these (and are a brave PHP developer) I invite you to check out the jobs system

Leave a Reply