Scripting without killing system load

Let us pretend for a moment that you have a critical system which can *just* handle the strain that it’s under. (I’m sure all of you have workloads well under your system capabilities, of course; still, for the sake of argument…) You also have a job to do which will induce more load. The job has to be done, and the system has to remain responsive. The classic response to this problem is to add a delay, for example:

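     #!/bin/bash
     cd /foo || exit 1
     # Delete up to 500 old top-level directories per pass, sleeping between passes.
     # -mindepth 1 keeps ./ itself out of the list; -maxdepth goes before the tests.
     # With no input, GNU xargs still runs rm, which fails and ends the loop.
     find ./ -mindepth 1 -maxdepth 1 -type d -daystart -ctime +1 | head -n 500 | xargs -- rm -rv
     while [ $? -eq 0 ]; do
          sleep 60
          find ./ -mindepth 1 -maxdepth 1 -type d -daystart -ctime +1 | head -n 500 | xargs -- rm -rv
     done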

Of course this is a fairly simplistic example. Still, it illustrates my point. The problem with this solution is that the machine you’re working on is likely to have a variable workload, where its main use comes in surges. By defining a fixed sleep time you have to either sleep so long that the job takes forever to finish, or flirt with high loads and slow response times. Ideally you would be able to let her rip while the load is low and throttle her back while the load is high, right? Well, we can! Like so:

     #!/bin/bash
     # Block until the integer part of the 1-minute load average is <= $1.
     function waitonload() {
          loadAvg=$(cut -f1 -d'.' /proc/loadavg)
          while [ "$loadAvg" -gt "$1" ]; do
               sleep 1
               echo -n .
               loadAvg=$(cut -f1 -d'.' /proc/loadavg)
               # Finish the line of dots once the load has dropped.
               if [ "$loadAvg" -le "$1" ]; then echo; fi
          done
     }

     cd /foo || exit 1
     waitonload 1
     find ./ -mindepth 1 -maxdepth 1 -type d -daystart -ctime +1 | head -n 500 | xargs -- rm -rv
     while [ $? -eq 0 ]; do
          waitonload 1
          find ./ -mindepth 1 -maxdepth 1 -type d -daystart -ctime +1 | head -n 500 | xargs -- rm -rv
     done

This modification only runs the desired commands while the 1-minute load average is below 2 (the function compares the integer part of the load against the limit of 1). Otherwise it waits, checking once a second, until that condition is met before continuing the loop. This can be very handy for very large jobs that need to run on loaded systems, especially jobs which can be subdivided into small tasks!
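One limitation of the cut trick is that it throws away the fraction, so the limit can only be a whole number. Here is a rough sketch of a variant that takes a fractional limit, assuming awk is available. The names load_above and waitonload_frac, and the LOADAVG_FILE override (only there so the check can be pointed at a test file), are made up for illustration and are not part of the script above.

```shell
#!/bin/bash
# Sketch only: fractional load threshold using awk for the comparison.
# LOADAVG_FILE is a hypothetical override; it defaults to the real file.
LOADAVG_FILE=${LOADAVG_FILE:-/proc/loadavg}

# Returns 0 (true) when the 1-minute load average is above the limit $1.
# An empty or unreadable file counts as "not above", so callers never hang.
load_above() {
     awk -v max="$1" '
          NR == 1 { above = ($1 + 0 > max + 0) }
          END     { exit !above }
     ' "$LOADAVG_FILE" 2>/dev/null
}

# Block, polling once per second, until the load is at or below the limit $1.
waitonload_frac() {
     while load_above "$1"; do
          sleep 1
     done
}
```

Calling `waitonload_frac 1.5` before each batch would then hold the job back whenever the 1-minute load average is above 1.5. Treat it as a starting point rather than a drop-in replacement.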


Posted on : Aug 20 2007
Posted under Random Thoughts, cli, linux

4 People have left comments on this post

Aug 20, 2007 - 02:08:41
Joseph Scott said:

Doesn’t batch(1) do roughly the same thing?

Aug 21, 2007 - 02:08:05
apokalyptik said:

I find this solution less complicated. But, yes. It’s doable

Aug 22, 2007 - 09:08:47
Lloyd Budd said:

Awesome insights!

An essential assumption here seems to be that the unit of work is small enough that it is unlikely to be caught in a surge.

Aug 22, 2007 - 09:08:25
apokalyptik said:

Yeah, if the work can be divided into small enough tasks that we can do, check, do, check, do, check… then this works great. On processes which are opaque and just plain take a long time… there’s no help for those but to renice them and hope it all works out :D