Topics: AIX, Backup & restore, Performance

Using a pipeline

This tip describes a scenario in which you want to search a file system for all of its directories and start a backup session for each directory found, with no more than 20 backup sessions running at once. Usually you would use the "find" command to locate those directories, with the "-exec" parameter to run the backup command. But if those backups are started in the background, nothing limits their number, so you could end up with more than 20 active backup sessions at once, which might overload the system.

So, you could create a script that runs "find" and writes its output to a file first, and then reads that file and runs up to 20 backups in parallel. But then the backups can't start before the "find" command completes, which may take quite a long time, especially on a file system with a large number of files. So how do you run the "find" command and the backups in parallel? Solve this problem with a named pipe.

Create a named pipe:

# rm -f /tmp/pipe
# mknod /tmp/pipe p
Issue the find command:
# find [/filesystem] -type d -print > /tmp/pipe
So now you have a command that writes to the pipe, but blocks until some other process starts reading from the pipe.
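
The blocking behavior described above can be observed with a small experiment. This is only a minimal sketch: it assumes a POSIX shell where "mkfifo" is available (equivalent to "mknod ... p"), and the pipe path and the variable names are made up for illustration.

```shell
#!/bin/sh
# Demonstrates that a writer on a named pipe blocks until a reader opens it.
# The pipe path below is a hypothetical scratch location.
fifo=/tmp/demo_pipe.$$
rm -f "$fifo"
mkfifo "$fifo"

# Start the writer in the background; it blocks when opening the pipe
# because no process is reading from it yet.
( echo "hello from writer" > "$fifo" ) &
writer_pid=$!

sleep 1
# After one second the writer is still alive: it is waiting for a reader.
if kill -0 "$writer_pid" 2>/dev/null; then
    writer_state="blocked"
else
    writer_state="finished"
fi
echo "writer is $writer_state before the reader starts"

# Opening the pipe for reading releases the writer.
message=$(cat "$fifo")
wait "$writer_pid"
echo "reader received: $message"
rm -f "$fifo"
```

The same thing happens with the "find" command above: it produces no output until the reading script opens /tmp/pipe.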

Create another script that reads from the pipe and issues the backup sessions:
# Read from the pipe in the current shell, so that "jobs" and "wait" see the background sessions
while read entry
do
   # Wait until fewer than 20 backup sessions are active
   while [ $(jobs -p|wc -l|awk '{print $1}') -ge 20 ]
   do
      sleep 5
   done

   # Start a backup session in the background
   [backup-command] &
   echo Started backup of "$entry" at `date`
done < /tmp/pipe
# Wait for all backup sessions to end
wait
echo `date`: Backup complete
This way, backup sessions are already being started while the "find" command is still running, which saves the time you would otherwise spend waiting for "find" to complete.
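
The two pieces can also be combined into a single script. The following is a rough sketch rather than a finished backup tool: "run_backup", "parallel_backup", "MAX_JOBS", the log file argument and the pipe path are all hypothetical names introduced here, and the placeholder function only sleeps and records the directory name where a real backup command would go. It assumes a shell such as ksh or bash, in which "jobs -p" reports the background jobs of the current shell.

```shell
#!/bin/sh
# Sketch: run "find" and a limited number of background jobs at the same time.
MAX_JOBS=${MAX_JOBS:-20}

run_backup() {
    # Hypothetical placeholder for the real backup command:
    # pretend to back up directory "$1", then record it in log file "$2".
    sleep 1
    echo "$1" >> "$2"
}

parallel_backup() {
    # $1 = directory tree to scan, $2 = log file listing finished directories
    top=$1
    log=$2
    pipe=/tmp/backup_pipe.$$      # hypothetical scratch path
    rm -f "$pipe"
    mkfifo "$pipe" || return 1
    : > "$log"

    # Producer: write directory names into the pipe while the consumer runs.
    # Note that this background "find" is itself listed by "jobs -p".
    find "$top" -type d -print > "$pipe" &

    # Consumer: read from the pipe in the current shell, so that the
    # background jobs belong to this shell and "wait" can see them.
    while read -r entry
    do
        # Hold off while the job limit is reached
        while [ $(jobs -p | wc -l) -ge "$MAX_JOBS" ]
        do
            sleep 1
        done
        run_backup "$entry" "$log" &
    done < "$pipe"

    # Wait for all remaining background jobs to end
    wait
    rm -f "$pipe"
}
```

It could be called as, for example, parallel_backup /some/filesystem /tmp/backup.log, where both paths are placeholders. Because the running "find" counts as one job, slightly fewer than MAX_JOBS backups run until the scan has finished.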
