Incron is an interesting piece of software for Linux that can monitor a specific folder for file changes, and act upon those changes. For example, it's possible to wait for files to be written to a folder, and have a command run to process these files.
Incron is not installed by default and is part of the EPEL repository. For Red Hat and CentOS 7, it's also possible to just download the RPM package from https://dl.fedoraproject.org/pub/epel/7/x86_64/Packages/i/incron-0.5.12-11.el7.x86_64.rpm, for example using wget.
To install incron, run:
# yum -y install /path/to/incron*rpm
Four files are important for incron:
- /etc/incron.conf - The main configuration file for incron; this file can usually be left at its defaults.
- /usr/sbin/incrond - This is the incron daemon, which has to run for incron to work. You can start it by simply executing /usr/sbin/incrond, and it will automatically run in the background. When it's no longer needed, you can simply kill the incrond process. However, it's better to enable the service at system boot time and start the service:
# systemctl enable incrond.service
# service incrond start
- /var/log/cron - This is the default location where the incron daemon will log its activities (through rsyslog). The file is also used by the cron daemon, so you may see other messages in this file. By using the tail command on this file, you can monitor what the incron daemon is doing. For example:
# tail -f /var/log/cron
- The incrontab file - You can edit this file by running:
# incrontab -e
This command will automatically load the incrontab file in an editor such as vi, and you can add/modify/remove entries this way. Once you save the file, its contents will be automatically activated by the incron daemon. To list the entries in the incrontab file, run:
# incrontab -l
Entries in the incrontab file follow a specific format, which looks like this:
[path] [mask] [command]
Where:
- [path] is the folder that the incron daemon will be monitoring for any new files (only in the folder itself, not in any sub-folders).
- [mask] is the activity that the incron daemon should respond to. There are several different activities to choose from; for a list of options, see https://linux.die.net/man/5/incrontab. One option that can be used is "IN_CLOSE_WRITE", which triggers when a file opened for writing is closed, meaning that writing to a file in the folder has completed.
- [command] is the command to be run by the incron daemon when a file activity takes place in the monitored path. For this command you can use available wildcards, such as:
- $@ : watched filesystem path
- $# : event-related file name
An example of the incrontab file can be:
/path/to/my/folder IN_CLOSE_WRITE /path/to/script.bash $@ $#
You can have multiple entries in the incrontab file, each on a separate line. In the example above, the incron daemon will start script /path/to/script.bash with two parameters (the path of the monitored folder, and the name of the file that was written to the folder), for each file that has been closed for writing in folder /path/to/my/folder.
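For illustration, a minimal version of the /path/to/script.bash handler could look like this (the log file, destination folder, and processing step are hypothetical; note the file is moved out of the watched folder, so the handling itself doesn't trigger new IN_CLOSE_WRITE events):
#!/bin/bash
# Hypothetical incron handler: $1 is the watched folder ($@), $2 the file name ($#).
dir="$1"
file="$2"
echo "$(date): processing ${dir}/${file}" >> /var/log/incron-handler.log
# Move the file out of the watched folder before doing any further work,
# so the processing itself doesn't generate more events in that folder.
mv "${dir}/${file}" /data/processed/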
To monitor the status of the incron daemon, run:
# service incrond status
To restart the incron daemon, run:
# service incrond stop
# service incrond start
Or shorter:
# service incrond restart
There is a downside to using incron: there is no way to limit the number of processes that the incron daemon can start. If a thousand files are written to the folder monitored by the incron daemon, it will kick off the process defined in the incrontab file for that folder a thousand times. This may place serious CPU load on a system (or even hang the system), especially if the command being run is CPU and/or memory intensive.
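One way to soften this is to serialize the expensive work inside the invoked script, for example with flock. A minimal sketch, continuing the handler example above (the lock file path is arbitrary):
#!/bin/bash
# Take an exclusive lock on a lock file via file descriptor 9;
# concurrent invocations will block here until the lock is released.
exec 9>/var/lock/incron-handler.lock
flock 9
# ... CPU/memory intensive processing of "$1/$2" goes here ...
Note that this doesn't reduce the number of processes incron starts; it only prevents them from doing the heavy lifting simultaneously.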
IBM has implemented a new feature for JFS2 filesystems to prevent simultaneous mounting within PowerHA clusters.
While PowerHA can give concurrent access to volume groups to multiple systems, mounting a JFS2 filesystem on multiple nodes simultaneously will cause filesystem corruption. These simultaneous mount events can also cause a system crash, when the system detects a conflict between data or metadata in the filesystem and the in-memory state of the filesystem. The only exception to this is mounting the filesystem read-only, in which case files and directories can't be changed.
In AIX 7100-01 and 6100-07 a new feature called "Mount Guard" has been added to prevent simultaneous or concurrent mounts. If a filesystem appears to be mounted on another server, and the feature is enabled, AIX will prevent mounting on any other server. Mount Guard is not enabled by default, but is configurable by the system administrator. The option is not allowed to be set on base OS filesystems such as /, /usr, /var etc.
To turn on Mount Guard on a filesystem you can permanently enable it via /usr/sbin/chfs:
# chfs -a mountguard=yes /mountpoint
/mountpoint is now guarded against concurrent mounts.
The same option is used with crfs when creating a filesystem.
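For example, creating a new guarded filesystem could look like this (the volume group, size, and mount point are illustrative):
# crfs -v jfs2 -g datavg -m /mountpoint -a size=1G -a mountguard=yes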
To turn off mount guard:
# chfs -a mountguard=no /mountpoint
/mountpoint is no longer guarded against concurrent mounts.
To determine the mount guard state of a filesystem:
# lsfs -q /mountpoint
Name Nodename Mount Pt VFS Size Options Auto Accounting
/dev/fslv -- /mountpoint jfs2 4194304 rw no no
(lv size: 4194304, fs size: 4194304, block size: 4096, sparse files: yes,
inline log: no, inline log size: 0, EAformat: v1, Quota: no, DMAPI:
no, VIX: yes, EFS: no, ISNAPSHOT: no, MAXEXT: 0, MountGuard: yes)
The /usr/sbin/mount command will not show the mount guard state.
When a filesystem is protected against concurrent mounting, and a second mount attempt is made, you will see this error:
# mount /mountpoint
mount: /dev/fslv on /mountpoint:
Cannot mount guarded filesystem.
The filesystem is potentially mounted on another node
After a system crash the filesystem may still have mount flags enabled and refuse to be mounted. In this case the guard state can be temporarily overridden by the "noguard" option to the mount command:
# mount -o noguard /mountpoint
mount: /dev/fslv on /mountpoint:
Mount guard override for filesystem.
The filesystem is potentially mounted on another node.
Reference:
http://www-01.ibm.com/support/docview.wss?uid=isg3T1018853
On Linux, you can use the watch command to run a specific command repeatedly, and monitor the output.
Watch is a command-line tool, part of the Linux procps and procps-ng packages, that runs the specified command repeatedly and displays the results on standard output, so you can watch the output change over time. You may need to enclose the command in quotes for it to run correctly.
For example, you can run:
# watch "ps -ef | grep bash"
The "-d" argument can be used to highlight the differences between each iteration, for example to highlight the time changes in the ntptime command:
# watch -d ntptime
By default, the command is run every two seconds, although this is adjustable with the "-n" argument. For example, to run the uptime command every second:
# watch -n 1 uptime
Iperf is a command-line tool that can be used to diagnose network speed related issues, or just simply determine the available network throughput.
Iperf measures the maximum network throughput a server can handle. It is particularly useful when experiencing network speed issues, as you can use Iperf to determine what the maximum throughput is for a server.
First, you'll need to install iperf.
For AIX:
Iperf is available from http://www.perzl.org/aix/index.php?n=Main.iperf. Download the RPM file, for example iperf-2.0.9-1.aix5.1.ppc.rpm to your AIX system. Next install it:
# rpm -ihv iperf-2.0.9-1.aix5.1.ppc.rpm
For Red Hat Enterprise Linux:
You'll first need to install EPEL, as Iperf is not available in the standard Red Hat repositories. For example for Red Hat 7 systems:
# yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
Next, you'll have to install Iperf itself:
# yum -y install iperf
Now that you have Iperf installed, you can start testing the connection between two servers. So, you'll need to have at least two servers with Iperf installed.
On the server you wish to test, launch Iperf in server mode:
# iperf -s
That will put the server in listening mode; besides that, nothing happens yet. Once a client connects and runs a test, the output will look something like this:
# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[ 4] local 198.51.100.5 port 5001 connected with 198.51.100.6 port 59700
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 9.76 GBytes 8.38 Gbits/sec
On the other server, connect to the first server. For example, if your first server is at IP address 198.51.100.5, run:
# iperf -c 198.51.100.5
After about 10 seconds, you'll see output on your screen showing the amount of data transferred, and the available bandwidth. The output may look something like this:
# iperf -c 198.51.100.5
------------------------------------------------------------
Client connecting to 198.51.100.5, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 3] local 198.51.100.6 port 59700 connected with 198.51.100.5 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 9.76 GBytes 8.38 Gbits/sec
You can run multiple tests while the Iperf server process is listening on the first server. When you've completed your tests, you can stop the listening Iperf process with Ctrl-C.
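The client also accepts options to vary the test. For example, "-t" sets the test duration in seconds (instead of the default 10), and "-P" runs multiple parallel streams, which can help saturate faster links:
# iperf -c 198.51.100.5 -t 30 -P 4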
For more information, see the official Iperf site at iperf.fr.
The standard tool for cluster monitoring is clstat, which comes along with PowerHA SystemMirror/HACMP. Clstat is rather slow with its updates, and sometimes the required clinfo daemon needs restarting to get it operational, so it is, well, not perfect. There's an alternative script, qha, written by PowerHA/HACMP guru Alex Abderrazag, which is also easy to use. This script shows you the correct PowerHA/HACMP status, along with adapter and volume group information. It works fine on HACMP 5.2 through 7.2. The version described here is 9.06; for the latest version, check www.lpar.co.uk.
This tiny but effective tool accepts the following flags:
- -n (show network interface info)
- -N (show interface info and active HBOD)
- -v (show shared online volume group info)
- -l (log to /tmp/qha.out)
- -e (show running events if cluster is unstable)
- -m (show status of monitor app servers if present)
- -1 (exit after first iteration)
- -c (CAA SAN / Disk Comms)
For example, run:
# qha -nev
It's useful to put "qha" in /usr/es/sbin/cluster/utilities, as that path is usually already defined in $PATH, and thus you can run qha from anywhere.
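For example, assuming the downloaded script is in the current directory:
# cp qha /usr/es/sbin/cluster/utilities/qha
# chmod 755 /usr/es/sbin/cluster/utilities/qha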
A description of the possible cluster states:
- ST_INIT: cluster configured and down
- ST_JOINING: node joining the cluster
- ST_VOTING: Inter-node decision state for an event
- ST_RP_RUNNING: cluster running recovery program
- ST_BARRIER: clstrmgr waiting at the barrier statement
- ST_CBARRIER: clstrmgr is exiting recovery program
- ST_UNSTABLE: cluster unstable
- NOT_CONFIGURED: HA installed but not configured
- RP_FAILED: event script failed
- ST_STABLE: cluster services are running with managed resources (stable cluster) or cluster services have been "forced" down with resource groups potentially in the UNMANAGED state (HACMP 5.4 only)
This is a quick and dirty method of setting up an LPP source and SPOT of AIX 5.3 TL10 SP2, without having to swap DVDs into the AIX host machine. What you basically need is the actual AIX 5.3 TL10 SP2 DVDs from IBM, a Windows host, and access to your NIM server. This process works for basically every AIX level, and has been tested with versions up to AIX 7.2.
If you have actual AIX DVDs that IBM sent to you, create ISO images of the DVDs on Windows, e.g. by using MagicISO. Or, go to Entitled Software Support and download the ISO images there.
SCP these ISO image files over to the AIX NIM server, e.g. by using WinSCP.
We need a way to access the data in the ISO images on the NIM server, and to extract the filesets from it (see IBM Wiki).
For AIX 5 systems and older:
Create a logical volume that is big enough to hold the data of one DVD. Check with "lsvg rootvg" whether you have enough space in rootvg and what the PP size is. In our example it is 64 MB. Thus, to hold an ISO image of roughly 4.7 GB, we need roughly 80 LPs of 64 MB.
# /usr/sbin/mklv -y testiso -t jfs rootvg 80
Create filesystem on it:
# /usr/sbin/crfs -v jfs -d testiso -m /testiso -An -pro -tn -a frag=4096 -a nbpi=4096 -a ag=8
Create a location where to store all of the AIX filesets on the server:
# mkdir /sw_depot/5300-10-02-0943-full
Copy the ISO image to the logical volume, and change the filesystem type to cdrfs, so the ISO contents can be mounted:
# /usr/bin/dd if=/tmp/aix53-tl10-sp2-dvd1.iso of=/dev/rtestiso bs=1m
# chfs -a vfs=cdrfs /testiso
Mount the testiso filesystem and copy the data:
# mount /testiso
# bffcreate -d /testiso -t /sw_depot/5300-10-02-0943-full all
# umount /testiso
Repeat the above 5 steps for both DVDs. You'll end up with a folder of at least 4 GB.
Delete the iso logical volume:
# rmfs -r /testiso
# rmlv testiso
When you're using AIX 7 / AIX 6.1:
Significant changes have been made in AIX 7 and AIX 6.1 that add new support for NIM. In particular, there is now the capability to use the loopmount command to mount ISO images into filesystems. As an example:
# loopmount -i aixv7-base.iso -m /aix -o "-V cdrfs -o ro"
The above mounts the AIX 7 base ISO as a filesystem called /aix.
So instead of going through the trouble of creating a logical volume, creating a file system, copying the ISO image to the logical volume, and mounting it (which is what you would have done on AIX 5 and before), you can do all of this with a single loopmount command.
Make sure to delete any left-over ISO images:
# rm -rf /tmp/aix53-tl10-sp2-dvd*iso
Define the LPP source (from the NIM A to Z redbook):
# mkdir /export/lpp_source/LPPaix53tl10sp2
# nim -o define -t lpp_source -a server=master -a location=/export/lpp_source/LPPaix53tl10sp2 -a source=/sw_depot/5300-10-02-0943-full LPPaix53tl10sp2
Check with:
# lsnim -l LPPaix53tl10sp2
Rebuild the .toc:
# nim -Fo check LPPaix53tl10sp2
For newer AIX releases, e.g. AIX 7.1 and AIX 7.2, you may get a warning like:
Warning: 0042-354 c_mk_lpp_source: The lpp_source is missing a
bos.vendor.profile which is needed for the simages attribute. To add
a bos.vendor.profile to the lpp_source run the "update" operation
with "-a recover=yes" and specify a "source" that contains a
bos.vendor.profile such as the installation CD. If your master is not
at level 5.2.0.0 or higher, then manually copy the bos.vendor.profile
into the installp/ppc directory of the lpp_source.
If this happens, you can either do exactly what it says: copy the installp/ppc/bos.vendor.profile file from your source DVD ISO image into the installp/ppc directory of the LPP source. Or, you can remove the entire LPP source, copy the installp/ppc/bos.vendor.profile from the DVD ISO image into the directory that contains the full AIX software set (in the example above: /sw_depot/5300-10-02-0943-full), and then re-create the LPP source. That should avoid the warning.
If you ignore this warning, you'll notice that the next step (creating a SPOT from the LPP source) will fail.
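Alternatively, following the warning text literally, an update operation with recover=yes should also restore the profile, for example (assuming the base ISO is still loopmounted on /aix, as shown earlier):
# nim -o update -a recover=yes -a source=/aix LPPaix53tl10sp2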
Define a SPOT from the LPP source:
# nim -o define -t spot -a server=master -a location=/export/spot/SPOTaix53tl10sp2 -a source=LPPaix53tl10sp2 -a installp_flags=-aQg SPOTaix53tl10sp2
Check the SPOT:
# nim -o check SPOTaix53tl10sp2
# nim -o lppchk -a show_progress=yes SPOTaix53tl10sp2
Within TSM (nowadays known as IBM Spectrum Protect), filespaces may exist that are no longer backed up: file systems that were once backed up, but aren't anymore.
This may occur if someone deletes a file system from a client, so it is no longer backed up, or if a file system was added to the exclude list, so it's no longer included in any backup runs.
These old filespaces may use up quite a lot of storage, and because they're never backed up anymore, their data remains on the TSM server for restore purposes.
It's good practice to review these filespaces and to determine if they can be deleted from TSM to free up storage space. It's therefore a good idea to put this check in a script, and have that script run automatically from time to time, for example by scheduling it in the crontab on a weekly or monthly basis (an example crontab entry follows the script below).
Here's a sample script. You can run it on your AIX TSM server. It assumes you have a "readonly" user account configured within TSM, with a password of "readonly". It will send out an email if any obsolete filespaces are present. Update the email variable at the beginning of the script to whatever email address the report should be sent to.
#!/bin/ksh
# Email address that will receive the report.
email="my@emailaddress.com"
# Yesterday's date in mm/dd/yy format, as used in the dsmadmc output.
y=$(perl -MPOSIX -le 'print strftime "%D",localtime(time-(60*60*24))')
mytempfile=/tmp/myadmintempfile.$$
rm -f ${mytempfile}
# Query all filespaces in detailed, comma-separated output; filter out
# lines mentioning today's or yesterday's date, and keep only lines
# containing an empty field (",,").
dsmadmc -comma -id=readonly -password=readonly q filespace \* \* f=d | \
 grep -v $(date +"%m/%d/%y") | grep -v "${y}" | grep ",," > ${mytempfile}
# If anything is left over, mail the list.
if [ -s ${mytempfile} ] ; then
   cat ${mytempfile} | mailx -s "Filespaces not backed up during last 24 \
hours." ${email} >/dev/null 2>&1
fi
rm -f ${mytempfile}
exit 0
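To run this automatically, as mentioned above, save the script somewhere (for example as /usr/local/bin/check_filespaces.ksh, a path chosen just for illustration), make it executable, and add an entry to root's crontab (crontab -e), e.g. to run it every Monday at 6 AM:
0 6 * * 1 /usr/local/bin/check_filespaces.ksh >/dev/null 2>&1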
The script will send an email with a list of any filespaces not backed up in the last 24 hours, if any are found.
The next thing you'll have to do is to investigate why the file system is not backed up. If you've determined that the filespace is no longer required in TSM, then you can move forward by deleting the filespace from TSM.
For example, for a UNIX file system:
delete filespace hostname /file/system
For file systems of Windows clients, deleting a filespace may be a bit more challenging. TSM might not allow you to remove a filespace, and may exit with an error message like "No matching file space found".
You might attempt to delete the filespace like this:
delete filespace nodename \\nodename\d$ nametype=uni
Or, you can remove Windows filespaces in TSM by filespace number. In that case, first list the known filespaces in TSM of a specific nodename, for example:
q filespace nodename *
This will list all the filespaces known in TSM for host "nodename". Replace nodename with whatever client you're searching for. In the output you'll see the hostname, the filespace name, and a filespace number, for example, 1.
To delete filespace with filespace number "1" for host "nodename", you can run:
delete filespace nodename 1 nametype=fsid
An easy way to check on the Spectrum Protect / TSM server what the backup status of all the Spectrum Protect / TSM clients is, is by using the "q event" command. For example:
q event * * begind=-1 begint=09:00 endd=today endt=09:00
The command above will display the status of all the backup jobs in the last 24 hours, between 9 AM yesterday and 9 AM today.
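To limit the output to just the problem cases, such as missed or failed backups, the exceptionsonly option can be added:
q event * * begind=-1 begint=09:00 endd=today endt=09:00 exceptionsonly=yes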
There are numerous show commands available for IBM Spectrum Protect / TSM that will display information about the environment. Many of them aren't well documented, probably because IBM intends these commands for its own support use.
Quite a lot of these commands have been documented by Spectrum Protect / TSM users, and an example can be found on the following web site: http://www.mm-it.at/de/TSM_Show_Commands.html.
A very interesting show command, which can be used to display the amount of duplicate bytes pending removal, is the following:
tsm: TSM>show deduppending file_disk
ANR1015I Storage pool FILE_DISK has 7,733,543,532,121 duplicate bytes pending removal.
The command above shows the number of bytes for storage pool "FILE_DISK" still to be removed by the dedupe processes.
The command may take quite some time to run, up to 10 minutes, so please be patient when issuing this command.
Creating a snapshot of a logical volume is an easy way to create a point-in-time backup of a file system, while still allowing changes to occur to the file system. Basically, by creating a snapshot, you get a frozen (snapshot) file system that can be backed up without having to worry about any changes to the file system.
Many applications these days allow for options to "freeze" and "thaw" the application (as in, telling the application to not make any changes to the file system while frozen, and also telling it to continue normal operations when thawed). This functionality of an application can be really useful for creating snapshot backups. One can freeze the application, create a snapshot file system (literally in just seconds), and thaw the application again, allowing the application to continue. Then, the snapshot can be backed up, and once the backup has been completed, the snapshot can be removed.
Let's give this a try.
In the following process, we'll create a file system /original, using a logical volume called originallv, in volume group "extern". We'll keep it relatively small (just 1 Gigabyte - or 1G), as it is just a test:
# lvcreate -L 1G -n originallv extern
Logical volume "originallv" created.
Next, we'll create a file system of type XFS on it, and we'll mount it.
# mkfs.xfs /dev/mapper/extern-originallv
# mkdir /original
# mount /dev/mapper/extern-originallv /original
# df -h | grep original
/dev/mapper/extern-originallv 1014M 33M 982M 4% /original
At this point, we have a file system /original available, and we can start creating a snapshot of it. For the purpose of testing, first, create a couple of files in the /original file system:
# touch /original/file1 /original/file2 /original/file3
# ls /original
file1 file2 file3
Creating a snapshot of a logical volume is done using the "-s" option of lvcreate:
# lvcreate -s -L 1G -n originalsnapshotlv /dev/mapper/extern-originallv
In the command example above, a size of 1 GB is specified (-L 1G). The snapshot logical volume doesn't have to be the same size as the original logical volume: it only needs to hold the changes made to the original logical volume while the snapshot exists. So, if there are very few changes to the original logical volume, the snapshot can be quite small; it's not uncommon for a snapshot to be just 10% of the size of the original logical volume. If a lot of changes are made to the original logical volume while the snapshot exists, you may need to specify a larger size. Please note that large databases, in which lots of changes are being made, are generally not good candidates for snapshot-style backups. You'll probably have to test in your environment whether this works for your application, and what a good size for the snapshot logical volume is.
The name of the snapshot logical volume in the command example above is set to originalsnapshotlv, using the -n option. And "/dev/mapper/extern-originallv" specifies the device name of the original logical volume.
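While the snapshot exists, you can check how much of its space is in use with the lvs command; the Data% column shows the fill percentage of the snapshot. If a snapshot fills up completely, it becomes invalid, so this is worth monitoring during long-running backups:
# lvs extern/originalsnapshotlv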
We can now mount the snapshot:
# mkdir /snapshot
# mount -o nouuid /dev/mapper/extern-originalsnapshotlv /snapshot
# df -h | grep snapshot
/dev/mapper/extern-originalsnapshotlv 1014M 33M 982M 4% /snapshot
And at this point, we can see the same files in the /snapshot folder, as in the /original folder:
# ls /snapshot
file1 file2 file3
To prove that the /snapshot file system remains untouched, even when the /original file system is being changed, let's create a file in the /original file system:
# touch /original/file4
# ls /original
file1 file2 file3 file4
# ls /snapshot
file1 file2 file3
As you can see, the /original file system now holds 4 files, while the /snapshot file system only holds the original 3 files. The snapshot file system remains untouched.
To remove the snapshot, a simple umount and lvremove will do:
# umount /snapshot
# lvremove -y /dev/mapper/extern-originalsnapshotlv
So, if you want to run backups of your file systems, while ensuring no changes are being made, here's the logical order of steps that can be scripted (a sketch follows this list):
- Freeze the application
- Create the snapshot (lvcreate -s ...)
- Thaw the application
- Mount the snapshot (mkdir ... ; mount ...)
- Run the backup of the snapshot file system
- Remove the snapshot (umount ... ; lvremove ... ; rmdir ...)
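Put together, a backup along these lines could be scripted as follows. This is just a sketch: the freeze/thaw commands ("myapp freeze" / "myapp thaw") and the backup destination are placeholders for whatever your application and environment provide.
#!/bin/bash
# Sketch of a snapshot-based backup, using the volume names from this article.
myapp freeze                                  # placeholder: tell the application to quiesce
lvcreate -s -L 1G -n originalsnapshotlv /dev/mapper/extern-originallv
myapp thaw                                    # placeholder: let the application continue
mkdir -p /snapshot
mount -o nouuid /dev/mapper/extern-originalsnapshotlv /snapshot
# Back up the frozen snapshot; a tar archive is used here as an example.
tar -czf /backup/original-$(date +%Y%m%d).tar.gz -C /snapshot .
umount /snapshot
lvremove -y /dev/mapper/extern-originalsnapshotlv
rmdir /snapshot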