Topics: AIX, Monitoring, Networking, Red Hat, Security, System Admin

Determining type of system remotely

If you run into a system that you can't access, but that is available on the network, and you have no idea what type of system it is, there are a few tricks you can use to determine the type of system remotely.

The first one is to look at the TTL (Time To Live) value when pinging the system's IP address. For example, a ping to an AIX system may look like this:

# ping
PING ( 56(84) bytes of data.
64 bytes from ( icmp_seq=1 ttl=253 time=0.394 ms
TTL (Time To Live) is a value included in packets sent over IP networks that limits the lifetime of a packet: each router that forwards the packet decrements the TTL by one, and the packet is discarded when the TTL reaches zero. Operating systems use different initial TTL values, so you can often determine the OS from the TTL in a ping reply. A detailed list of operating systems and their TTL values can be found here. Basically, a UNIX/Linux system has an initial TTL of 64, Windows uses 128, and AIX/Solaris uses 254.

Now, in the example above, you can see "ttl=253". It's still an AIX system, but there's most likely a router in between, decreasing the TTL by one.
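This classification is easy to automate. A minimal sketch is shown below; the helper function name and the threshold logic are ours, not a standard tool, and the mapping simply rounds up to the nearest common initial TTL value:

```shell
# Hypothetical helper: classify the OS family from the TTL seen in a
# ping reply. Each router hop decrements the TTL by one, so we compare
# against the nearest common initial value at or above the observed TTL.
guess_os_by_ttl() {
  ttl=$1
  if [ "$ttl" -le 64 ] ; then
    echo "UNIX/Linux (initial TTL 64)"
  elif [ "$ttl" -le 128 ] ; then
    echo "Windows (initial TTL 128)"
  else
    echo "AIX/Solaris (initial TTL 254/255)"
  fi
}

# Extract the TTL from a single ping reply, then classify it:
# ttl=`ping -c 1 $host | sed -n 's/.*ttl=\([0-9]*\).*/\1/p' | head -1`
guess_os_by_ttl 253   # prints: AIX/Solaris (initial TTL 254/255)
```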

Another good method is by using nmap. The nmap utility has a -O option that allows for OS detection:
# nmap -O -v | grep OS
Initiating OS detection (try #1) against (
OS details: IBM AIX 5.3
OS detection performed.
Okay, so it isn't a perfect method either. We ran the nmap command above against an AIX 7.1 system, and it came back as AIX 5.3 instead. And sometimes you'll have to run nmap a couple of times before it successfully detects the OS type. But still, we now know there's an AIX system behind that IP.

Another option you may use is to query SNMP information. If the device is SNMP-enabled (it is running an SNMP daemon that allows you to query SNMP information), then you may be able to run a command like this:
# snmpinfo -h -m get -v sysDescr.0
sysDescr.0 = "IBM PowerPC CHRP Computer
Machine Type: 0x0800004c Processor id: 0000962CG400
Base Operating System Runtime AIX version: 06.01.0008.0015
TCP/IP Client Support  version: 06.01.0008.0015"
By the way, the example for SNMP above is exactly why AIX Health Check generally recommends disabling SNMP, or at least disallowing the disclosure of such system information through SNMP by updating the /etc/snmpdv3.conf file appropriately, because this information can be really useful to hackers. On the other hand, your organization may use monitoring that relies on SNMP, in which case it needs to be enabled. But even then you still have the opportunity to change the SNMP community name to something else (the default is "public"), which also limits the remote information-gathering possibilities.
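For reference, the community definition in /etc/snmpdv3.conf looks roughly like the lines below; the exact defaults vary per AIX level, so check your own file. Changing "public" to a non-default name (the "n0tpubl1c" value here is just an illustration) is the change referred to above:

```
# Default-style entries in /etc/snmpdv3.conf (illustrative):
VACM_GROUP group1 SNMPv1  public  -
COMMUNITY  public  public  noAuthNoPriv  0.0.0.0  0.0.0.0  -

# Changed to a non-default community name:
VACM_GROUP group1 SNMPv1  n0tpubl1c  -
COMMUNITY  n0tpubl1c  n0tpubl1c  noAuthNoPriv  0.0.0.0  0.0.0.0  -
```

Restart the SNMP daemon afterwards (stopsrc -s snmpd; startsrc -s snmpd) for the change to take effect.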

Topics: Monitoring, PowerHA / HACMP

Cluster status webpage

How do you monitor multiple HACMP/PowerHA clusters? You're probably familiar with the clstat or xclstat commands. These are nice, but not sufficient when you have more than 8 HACMP/PowerHA clusters, as they can't be configured to monitor more than 8 clusters. It's also difficult to get an overview of ALL clusters in a SINGLE look with clstat. IBM included a clstat.cgi in HACMP 5 to show the cluster status on a webpage. This still doesn't provide an overview in a single look, as clstat.cgi shows a long listing of all clusters, and, just like clstat, it is limited to monitoring only 8 clusters.

The HACMP/PowerHA cluster status can be retrieved via SNMP (this is actually what clstat does too). Using the IP addresses of a cluster and the snmpinfo command, you can remotely retrieve cluster status information, and use that information to build a webpage. We've written a script for this purpose. By using colors for the status of the clusters and the nodes (green = ok, yellow = something is happening, red = error), you can get a quick overview of the status of all the HACMP/PowerHA clusters.

Per cluster you can see: the cluster name, the cluster ID, the HACMP version, and the status of the cluster and all its nodes. It also shows where any resource groups are active.
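For the webpage colors, the numeric state retrieved via SNMP has to be mapped to green/yellow/red. A minimal sketch follows; the state values shown (2 = up, 4 = down) and the example query are assumptions you should verify against the HACMP MIB definitions on your own cluster:

```shell
# Map a numeric HACMP clusterState (as retrieved via snmpinfo) to a
# traffic-light color for the status webpage.
state_color() {
  case "$1" in
    2) echo green  ;;   # cluster up
    4) echo red    ;;   # cluster down
    *) echo yellow ;;   # joining, leaving, reconfiguring, unknown
  esac
}

# Example remote query (hypothetical node name "node1"):
# state=`snmpinfo -m get -v -h node1 \
#   -o /usr/es/sbin/cluster/hacmp.defs clusterState.0`
state_color 2   # prints: green
```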

You can download the script here. This is version 1.6. Untar the downloaded file. There is a README in the package that explains how to configure the script. This script has been tested with HACMP versions 4, 5, and 6, and up to PowerHA version

Topics: AIX, Monitoring, System Admin

NMON recordings

One can set up NMON recordings from smit via:

# smitty topas -> Start New Recording -> Start local recording -> nmon
However, the smit panel doesn't list the option needed to get disk IO service times: specifically, the -d option to collect disk IO service and wait times. Thus, it's better to use the nmon command from the command line to collect and report these statistics. Here's one set of options for collecting the data:
# nmon -AdfKLMNOPVY^ -w 4 -s 300 -c 288 -m /var/adm/nmon
The key options here include:
  • -d Collect and report IO service time and wait time statistics.
  • -f Specifies that the output is in spreadsheet format. By default, the command takes 288 snapshots of system data with an interval of 300 seconds between each snapshot. The name of the output file is in the format of hostname_YYMMDD_HHMM.nmon.
  • -O Includes the Shared Ethernet adapter (SEA) VIOS sections in the recording file.
  • -V Includes the disk volume group section.
  • -^ Includes the FC adapter section (which also measures NPIV traffic on VIOS FC adapters).
  • -s Specifies the interval in seconds between 2 consecutive recording snapshots.
  • -c Specifies the number of snapshots that must be taken by the command.
Running nmon using this command will ensure it runs for a full day. And it is therefore useful to start nmon daily using a crontab entry in the root crontab file. For example, using the following script:
# cat /usr/local/collect_nmon.ksh
#!/usr/bin/ksh

# LOGDIR is the directory in which the nmon recordings are stored.
# LOGRET determines the number of days to retain nmon logs.
# Adjust both to your environment.
LOGDIR=/var/adm/nmon
LOGRET=60

PARAMS="-fTNAdKLMOPVY^ -w 4 -s 300 -c 288 -m $LOGDIR"

# Create the nmon folder.
if [ ! -d $LOGDIR ] ; then
        mkdir -p $LOGDIR
fi

# Compress previous daily logs.
find $LOGDIR -name '*.nmon' -type f -mtime +1 -exec gzip '{}' \;

# Clean up old logs.
find $LOGDIR -name '*nmon.gz' -type f -mtime +$LOGRET -exec rm '{}' \;

# Start nmon.
/usr/bin/nmon $PARAMS
Then add the following crontab entry to the root crontab file:
0 0 * * * /usr/local/collect_nmon.ksh >/tmp/collect_nmon.ksh.log 2>&1
To process the recordings with the NMON Analyser tool (a spreadsheet tool that runs on PCs, generates performance graphs and other output, and is available here), it's recommended to keep the number of intervals below 300.
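A quick bit of shell arithmetic, using the values from the daily recording above, confirms the snapshot count stays below that limit:

```shell
# Snapshot count for a one-day recording at a 300-second interval:
interval=300                    # seconds between snapshots (-s)
count=$((24 * 3600 / interval)) # snapshots per day (-c)
echo $count                     # prints: 288, safely below the 300 limit
```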

Topics: AIX, Monitoring, System Admin

Boxes and lines in NMON

With the default settings of NMON, in combination with PuTTY on a Windows system, you may notice that the boxes and lines in NMON are not displayed correctly. It may look something like this:

An easy fix for this issue is to change the character set translation within PuTTY. In the upper left corner of your PuTTY window, click the icon and select "Change Settings". Then navigate to Window -> Translation. In the "Remote character set" field, change "UTF-8" to "ISO-8859-1".

Once changed, restart PuTTY and it should look something like this:

Another option is to stop using boxes and lines altogether. You can do this by starting nmon with the -B option:

# nmon -B
Or you can set the NMON environment variable to the same:
# export NMON=B
# nmon

Topics: AIX, Monitoring, Red Hat, Security, System Admin

Sudosh

Sudosh is designed specifically to be used in conjunction with sudo, or by itself as a login shell. Sudosh allows the execution of a root or user shell with logging: every command the user types within the root shell is logged, as well as its output.

This is different from "sudo -s" or "sudo /bin/sh": when you use one of these instead of sudosh to start a new shell, the new shell does not log the commands typed in it to syslog; only the fact that a new shell was started is logged.

If this newly started shell supports command-line history, you can still find the commands run in the shell in a file such as .sh_history, but if you use a shell such as csh that does not support command-line logging, you are out of luck.

Sudosh fills this gap. No matter what shell you use, all of the command lines are logged to syslog (including vi keystrokes). In fact, sudosh uses the script command to log all keystrokes and output.
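Because sudosh builds on the script command, the underlying mechanism is easy to demonstrate by hand. The sketch below assumes a Linux system with the util-linux version of script, which supports the -c and -q flags (the AIX script command does not); the file name is just an example:

```shell
# Run a single command under script(1): all output (and, in an
# interactive session, all keystrokes) is captured in the log file.
script -q -c 'echo hello from a logged shell' /tmp/demo_session.log

# The session log now contains the command's output.
grep 'hello from a logged shell' /tmp/demo_session.log
```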

Setting up sudosh is fairly easy. For a Linux system, first download the sudosh RPM and install it on your Linux server:

# rpm -ihv sudosh-1.8.2-1.2.el4.rf.i386.rpm
Preparing...  ########################################### [100%]
   1:sudosh   ########################################### [100%]
Then, open /etc/sudosh.conf. Here you can adjust the default shell that is started, and the location of the log files. By default, the log directory is /var/log/sudosh. Make sure this directory exists on your server, or change it to another existing directory in the sudosh.conf file. The following command will set the correct permissions on the log directory:
# sudosh -i
[info]: chmod 0733 directory /var/log/sudosh
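The 0733 mode is deliberate: users need to create their session logs in that directory, but should not be able to list or read each other's logs. A quick sketch with a hypothetical demo directory (using GNU stat, so Linux only):

```shell
# 0733: rwx for the owner, wx (no read) for group and others, so any
# user can create a log file there but cannot list the directory.
mkdir -p /tmp/sudosh_demo_logs
chmod 0733 /tmp/sudosh_demo_logs
stat -c '%a' /tmp/sudosh_demo_logs   # prints: 733
```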
Then, if you want to assign a user sudosh access, edit the /etc/sudoers file by running visudo, and add the following line:
username ALL=PASSWD:/usr/bin/sudosh
Now, the user can login, and run the following command to gain root access:
$ sudo sudosh
# whoami
root
Now, as a sys admin, you can view the log files created in /var/log/sudosh, but it is much cooler to use the sudosh-replay command to replay (like a VCR) an actual session, as run by the user with sudosh access.

First, run sudosh-replay without any parameters to get a list of sessions that took place using sudosh:
# sudosh-replay
Date       Duration From To   ID
====       ======== ==== ==   ==
09/16/2010 6s       root root root-root-1284653707-GCw26NSq

Usage: sudosh-replay ID [MULTIPLIER] [MAXWAIT]
See 'sudosh-replay -h' for more help.
Example: sudosh-replay root-root-1284653707-GCw26NSq 1 2
Now, you can actually replay the session, by (for example) running:
# sudosh-replay root-root-1284653707-GCw26NSq 1 5
The first parameter is the session ID, and the second parameter is the multiplier. Use a higher multiplier value to speed up the replay; "1" is the actual speed. The third parameter is the max-wait: where there were pauses in the actual session, this parameter limits each pause during replay to a maximum of max-wait seconds, in the example above, 5 seconds.
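The combined effect of the multiplier and max-wait on replay timing can be sketched as follows; the formula is our interpretation of the documented behavior, not sudosh's actual code:

```shell
# Approximate replay delay for a recorded pause: divide by the
# multiplier (higher = faster replay), then cap at max-wait seconds.
replay_delay() {
  pause=$1 multiplier=$2 maxwait=$3
  d=$((pause / multiplier))
  [ "$d" -gt "$maxwait" ] && d=$maxwait
  echo $d
}

replay_delay 12 1 5   # a 12-second pause is capped at 5 seconds
replay_delay 12 4 5   # with multiplier 4, the pause shrinks to 3 seconds
```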

For AIX, you can find the necessary RPM here. It is slightly different, because it installs in /opt/freeware/bin, and the sudosh.conf file is also located in this directory. Of course, both Linux and AIX require sudo to be installed before you can install and use sudosh.

Topics: AIX, Monitoring, System Admin

Cec Monitor

To monitor all LPARs within one frame, use:

# topas -C

Topics: Monitoring, PowerHA / HACMP

HACMP auto-verification

HACMP automatically runs a verification every night, usually around midnight. With a very simple command you can check the status of this verification run:

# tail -10 /var/hacmp/log/clutils.log 2>/dev/null|grep detected|tail -1
If this shows a return code of 0, the cluster verification ran without any errors. Anything else, and you'll have to investigate. You can use this command on all your HACMP clusters, allowing you to verify your HACMP cluster status every day.

With the following smitty menu you can change the time at which the auto-verification runs, and whether it should produce debug output:
# smitty clautover.dialog
You can check with:
# odmget HACMPcluster
# odmget HACMPtimersvc
Be aware that if you change the runtime of the auto-verification, you have to synchronize the cluster afterwards to update the other nodes in the cluster.

Topics: Monitoring, PowerHA / HACMP

HACMP Event generation

HACMP provides events, which can be used to monitor the cluster status very accurately, for example via the Tivoli Enterprise Console. Each change in the cluster status is the result of an HACMP event, and each HACMP event has an accompanying notify method that can be used to handle the kind of notification we want.

Interesting Cluster Events to monitor are:

  • node_up
  • node_down
  • network_up
  • network_down
  • join_standby
  • fail_standby
  • swap_adapter
  • config_too_long
  • event_error
You can set the notify method via:
# smitty hacmp
Cluster Configuration
Cluster Resources
Cluster Events
Change/Show Cluster Events
You can also query the ODM:
# odmget HACMPevent

Topics: AIX, Monitoring, System Admin

"Bootpd: Received short packet" messages on console

If you're receiving messages like these on your console:

Mar 9 11:47:29 daemon:notice bootpd[192990]: received short packet
Mar 9 11:47:31 daemon:notice bootpd[192990]: received short packet
Mar 9 11:47:38 daemon:notice bootpd[192990]: hardware address not found: E41F132E3D6C
Then it means that you have bootpd enabled on your server. There's nothing wrong with that; in fact, a NIM server requires it to be enabled. However, these messages on the console can be annoying. There are systems on your network that are sending bootp (broadcast) requests. Your system is listening to these requests and trying to answer them: it looks in the bootp configuration file (/etc/bootptab) to see if the requesting MAC addresses are defined. When they aren't, you get these messages.

To solve this, either disable the bootpd daemon, or change the syslog configuration. If you don't need the bootpd daemon, then edit the /etc/inetd.conf file and comment the entry for bootps. Then run:
# refresh -s inetd
If you do have a requirement for bootpd, then update the /etc/syslog.conf file and look for the entry that starts with daemon.notice:
#daemon.notice /dev/console
daemon.notice /nsr/logs/messages
By commenting out the daemon.notice entry to /dev/console, and instead adding an entry that logs to a file, you can avoid seeing these messages on the console. Now all you have to do is refresh the syslogd daemon:
# refresh -s syslogd

Topics: AIX, Backup & restore, Monitoring, Red Hat, Spectrum Protect

Report the end result of a TSM backup

A very easy way of getting a report from a backup is by using the POSTSchedulecmd entry in the dsm.sys file. Add the following entry to your dsm.sys file (which is usually located in /usr/tivoli/tsm/client/ba/bin or /opt/tivoli/tsm/client/ba/bin):

POSTSchedulecmd "/usr/local/bin/RunTsmReport"
This entry tells the TSM client to run the script /usr/local/bin/RunTsmReport as soon as it has completed its scheduled command. Now all you need is a script that creates a report from the dsmsched.log file, the file that is written to by the TSM scheduler:
#!/usr/bin/ksh

# Adjust these three values to your environment.
WRKDIR=/tmp
TSMLOG=/usr/tivoli/tsm/client/ba/bin/dsmsched.log
MAILTO=root

echo "TSM Report from `hostname`" > ${WRKDIR}/tsmc

# Find the final statistics block in the scheduler log: locate the
# last "Elapsed processing time:" line and keep the 14 lines before
# it and 1 line after it.
tail -100 ${TSMLOG} > ${WRKDIR}/tsma
grep -n "Elapsed processing time:" ${WRKDIR}/tsma > ${WRKDIR}/tsmb
CT2=`tail -1 ${WRKDIR}/tsmb | awk -F":" '{print $1}'`
((CT3 = CT2 - 14))
((CT5 = CT2 + 1))
CT4=0
while read Line1 ; do
   ((CT4 = CT4 + 1))
   if [ ${CT4} -ge ${CT3} -a ${CT4} -le ${CT5} ] ; then
      echo "${Line1}" >> ${WRKDIR}/tsmc
   fi
done < ${WRKDIR}/tsma

# Mail the report and clean up.
mail -s "`hostname` Backup" ${MAILTO} < ${WRKDIR}/tsmc
rm ${WRKDIR}/tsma ${WRKDIR}/tsmb ${WRKDIR}/tsmc
