Scripting without killing system load

Let us pretend for a moment that you have a critical system which can *just* handle the strain that it’s under. (I’m sure all of you have workloads well under your system capabilities, or capabilities well over your workload requirements, of course; still, for the sake of argument…) You also have a job to do which will induce more load. The job has to be done, and the system has to remain responsive. The classic response to this problem is to add a delay, for example:

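     #!/bin/bash
     cd /foo || exit 1   # never run rm from the wrong directory
     # Remove up to 500 top-level directories per pass. -mindepth 1 keeps ./
     # itself off the list. With no input, GNU xargs still runs rm once;
     # rm then fails, and that non-zero status ends the loop.
     find ./ -mindepth 1 -maxdepth 1 -type d -daystart -ctime +1 | head -n 500 | xargs -- rm -rv
     while [ $? -eq 0 ]; do
          sleep 60
          find ./ -mindepth 1 -maxdepth 1 -type d -daystart -ctime +1 | head -n 500 | xargs -- rm -rv
     done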

Of course, this is a fairly simplistic example, but it illustrates my point. The problem with this solution is that the machine you’re working on is likely to have a variable workload where its main use comes in surges. By defining a fixed sleep time you have to either sleep so long that the job takes forever to finish, or flirt with high loads and slow response times. Ideally you would be able to let her rip while the load is low and throttle her back while the load is high, right? Well, we can! Like so:

     #!/bin/bash
     cd /foo || exit 1
     # Block while the integer part of the 1-minute load average exceeds $1.
     function waitonload() {
          local loadAvg
          loadAvg=$(cut -d. -f1 /proc/loadavg)
          while [ "$loadAvg" -gt "$1" ]; do
               sleep 1
               echo -n .
               loadAvg=$(cut -d. -f1 /proc/loadavg)
               # Finish the line of progress dots once the load has dropped.
               if [ "$loadAvg" -le "$1" ]; then echo; fi
          done
     }

     waitonload 1
     find ./ -mindepth 1 -maxdepth 1 -type d -daystart -ctime +1 | head -n 500 | xargs -- rm -rv
     while [ $? -eq 0 ]; do
          waitonload 1
          find ./ -mindepth 1 -maxdepth 1 -type d -daystart -ctime +1 | head -n 500 | xargs -- rm -rv
     done

With this modification, the desired commands run only while the 1-minute load average is below 2 (the integer part of the load must not exceed the argument, 1). Otherwise the script waits, checking once per second, until that condition holds before continuing the loop. This can be very handy for very large jobs that need to run on loaded systems, especially jobs which can be subdivided into small tasks!
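
Because the function above keeps only the integer part of the load, its threshold can only be a whole number. If you want something like 1.5, a minimal sketch along the following lines might do. Treat it as an illustration rather than a drop-in replacement: `load_above`, `waitonload_frac` and the `LOADAVG_FILE` variable are names made up for this example (the variable exists only so the function can be pointed at a test file), and it assumes `awk` is installed and that the load lives in the Linux-style `/proc/loadavg`.

```shell
#!/bin/bash
# Hypothetical variant of waitonload that accepts a fractional threshold.
# LOADAVG_FILE is an assumption for this sketch: it defaults to the real
# /proc/loadavg (Linux only) and can be overridden for testing.
LOADAVG_FILE=${LOADAVG_FILE:-/proc/loadavg}

# Succeeds (exit status 0) when the 1-minute load average, the first field
# of the file, is strictly greater than the threshold given as $1.
load_above() {
     awk -v max="$1" '
          NR == 1 { above = ($1 + 0 > max + 0) }
          END     { exit !above }
     ' "$LOADAVG_FILE"
}

# Blocks, checking once per second, until the load is at or below $1.
waitonload_frac() {
     while load_above "$1"; do
          sleep 1
     done
}
```

In the script above you would then call `waitonload_frac 1.5` wherever `waitonload 1` appears. The comparison is handed to awk because bash's own `[ -gt ]` test only understands integers.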

4 thoughts on “Scripting without killing system load”

  1. Awesome insights!

    An essential assumption here seems to be that the unit of work is small enough that it is unlikely to be caught in a surge.

  2. Yeah, if the work can be divided into small enough tasks that we can do, check, do, check, do, check… then this works great. For processes which are opaque and just plain take a long time… there's no help for those but to renice them and hope it all works out 😀
