Let's say you want to run some command, such as /bin/long-command, on a set of directories. And you have a lot of directories. You know it'll take forever to complete serially, so you want to cook up a way to run these commands in parallel. You know the server CAN handle more than one command at once, but you have no idea how many it can handle without keeling over, and you have thousands of commands to run. Running them all at once in the background will kill the system for sure. You COULD try to stagger them and let the overlap delay act as a natural throttle, but sometimes the command completes in one minute and sometimes in ten, so that's not a good idea either.

So you decide it would be best to set a process concurrency limit. But what if you set that limit too low? Too high? Restarting in the middle would be bad… you COULD keep some sort of completion log and build a skip for finished files into your script, but why? That doesn't seem very elegant. Your car is good at handling variable speed allowances… it goes fast when you say and slow when you say… maybe we can give a simple bash script a gas pedal? That just might work!
echo '5' > /tmp/threads
for i in fileroot/*; do
    # wait until the number of running long-command processes drops below the limit
    while [ "$(pgrep -c long-command)" -ge "$(cat /tmp/threads)" ]; do
        sleep 1
    done
    ( /bin/long-command "$i" ) &
    sleep 1
done
Now you can speed it up and throttle it back by adjusting the integer value inside /tmp/threads.
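For example, while the loop is chewing through directories in one terminal, you might work the pedal from another like this (the limits 20 and 2 are just illustrative values):

echo '20' > /tmp/threads    # floor it: allow up to 20 concurrent long-command processes
echo '2' > /tmp/threads     # ease off: stop launching new jobs until only 2 remain
pgrep -c long-command       # peek at how many are currently running

The loop re-reads /tmp/threads on every pass, so a new limit takes effect within a second or two. Lowering the number never kills jobs already in flight; it just holds off on starting new ones until the count drains below the limit.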
“It was the little old server from Pasadena…”