When you set up a new user account and assign a password to it, you'll want to make sure it is a password that cannot be easily guessed. Setting the initial password to something easy like "changeme" only gives attackers easy access to your system.
The best way to avoid this is to generate a fully random password, which can easily be done using the /dev/urandom device.
Here's an easy command to generate a random password:
# dd if=/dev/urandom bs=16 count=1 2>/dev/null | openssl base64 | sed "s/[=O/\]//g" | cut -b1-8
This will create passwords like:
ej9yTaaD
Ux9FYusx
QR0TSAZC
...
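If you need a batch of passwords, the same idea can be wrapped in a loop. The variant below skips openssl and reads alphanumeric characters straight from /dev/urandom with tr; the character class and the count of five are arbitrary choices, not requirements:

```shell
# Generate five random 8-character alphanumeric passwords,
# reading directly from /dev/urandom (no openssl needed):
for i in 1 2 3 4 5
do
    tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 8
    echo
done
```

Because tr drops every character outside the class, there is no need for the sed cleanup step used in the one-liner above.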
Security guidelines nowadays can be annoying. Within many companies, people have to comply with strict requirements regarding password expiration, password complexity, and system security settings. These settings and regulations more often than not result in people getting locked out of their accounts on AIX systems, and getting frustrated at the same time.
To help your users, you can't go and change the default security settings on the AIX systems; your auditor will make sure of that. But there are some "tricks" you can use to ensure that a user account is (and stays) available to your end user. We've put all those tricks together in one simple script that can fix a user account, and we called it fixuser.ksh. It will fix 99% of all user-related login issues.
You can run this script as often as you like, for any user you like. It ensures that a user account is not locked, that AIX won't bug the user to change their password, that the user doesn't have a failed login count (from typing too many wrong passwords), and a bunch of other things that usually keep your users from logging in and getting pesky "Access Denied" messages.
The script does not alter any default security settings, and it can easily be adjusted to run for several user accounts, or be run from a crontab so user accounts stay enabled for your users. The script is a win-win for everyone: your auditor is happy, because security settings remain strict on your system; your users are happy to just log in without any hassle; and the sysadmin is happy not to have to resolve login issues manually anymore.
The script can be run by entering a specific user account:
# fixuser.ksh username
The script:
#!/usr/bin/ksh
fixit()
{
    user=${1}

    # Unlock the account
    printf "Unlocking account for ${user}..."
    chuser account_locked=false ${user}
    echo " Done."

    # Reset the failed login count
    printf "Reset failed login count for ${user}..."
    chuser unsuccessful_login_count=0 ${user}
    echo " Done."

    # Reset the expiration date
    printf "Reset expiration date for ${user}..."
    chuser expires=0 ${user}
    echo " Done."

    # Allow the user to log in
    printf "Enable login for ${user}..."
    chuser login=true ${user}
    echo " Done."

    # Allow the user to log in remotely
    printf "Enable remote login for ${user}..."
    chuser rlogin=true ${user}
    echo " Done."

    # Reset maxage to the system default
    printf "Reset the maxage for ${user}..."
    m=`lssec -f /etc/security/user -s default -a maxage | cut -f2 -d=`
    chuser maxage=${m} ${user}
    echo " Done."

    # Clear the password change requirement
    printf "Clear password change requirement for ${user}..."
    pwdadm -c ${user}
    echo " Done."

    # Reset the password last update time, so minage (measured in
    # weeks, hence the *7*24*60*60) won't block a password change
    printf "Reset the password last update for ${user}..."
    let sinceepoch=`perl -e 'printf(time)'`
    n=`lssec -f /etc/security/user -s default -a minage | cut -f2 -d=`
    let myminsecs="${n}*7*24*60*60"
    let myminsecs="${myminsecs}+1000"
    let newdate="${sinceepoch}-${myminsecs}"
    chsec -f /etc/security/passwd -s ${user} -a lastupdate=${newdate}
    echo " Done."
}
unset user
if [ ! -z "${1}" ] ; then
    user=${1}
fi

# If a username was provided and exists, fix that user account
unset myid
myid=`id ${user} 2>/dev/null`
if [ ! -z "${myid}" ] ; then
    echo "Fixing account ${user}..."
    fixit ${user}
    # Note: this clears the password history for ALL users on the system
    printf "Remove password history..."
    cp /dev/null /etc/security/pwdhist.pag 2>/dev/null
    cp /dev/null /etc/security/pwdhist.dir 2>/dev/null
    echo " Done."
else
    echo "User ${user} does not exist."
fi
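As mentioned, the script can also be run from a crontab so accounts stay enabled without manual intervention. A hypothetical crontab entry (the installation path and the account name are assumptions, adjust them for your environment) could look like this:

```
# Fix the appuser account at the top of every hour:
0 * * * * /usr/local/bin/fixuser.ksh appuser >/dev/null 2>&1
```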
Sometimes, when password rules are very strict, a user may have problems creating a new password that is both easy to remember and still adheres to the password rules. To aid the user, it can be useful to clear the password history for his or her account, so a password that has been used in the past can be re-used. The password history is stored in /etc/security/pwdhist.pag and /etc/security/pwdhist.dir. The command to disable the password history check for a user is:
# chuser histsize=0 username
Actually, this command does not remove the password history in /etc/security/pwdhist.dir and /etc/security/pwdhist.pag; it only changes the histsize setting for the account to zero, meaning the user is no longer checked for re-using old passwords. After the user has changed his or her password, you may want to set it back to the default value:
# grep -p ^default /etc/security/user | grep histsize
histsize = 20
# chuser histsize=20 username
In older AIX levels, running chuser histsize=0 would actually clear out the user's password history. In later AIX levels, it no longer does.
So, if you truly wish to delete the password history for a user, here's another way to clear it: zero out the pwdhist.pag and pwdhist.dir files. Note, however, that this deletes the password history for all users on the system:
# cp /dev/null /etc/security/pwdhist.pag
# cp /dev/null /etc/security/pwdhist.dir
Please note that this is a temporary measure. Once these files are zeroed out, as soon as a user changes his or her password again, the old password is stored in these files once more and cannot be reused (unless the histsize attribute for the user is set to 0).
With the default settings in PuTTY on a Windows system, you may notice that the boxes and lines in NMON are not displayed correctly, showing stray characters instead of straight lines.
An easy fix for this issue is to change the character set translation within PuTTY. In the upper left corner of your PuTTY window, click the icon and select "Change Settings". Then navigate to Window -> Translation. In the "Remote character set" field, change "UTF-8" to "ISO-8859-1".
Once changed, restart your PuTTY session and the NMON boxes and lines should display correctly.
Another option is to stop using boxes and lines altogether. You can do this by starting nmon with the -B option:
# nmon -B
Or you can set the NMON environment variable to the same:
# export NMON=B
# nmon
Here's a procedure for adding additional swap space to a running RHEL system.
This procedure assumes you will want to add 8 Gigabytes of swap space, and we will be using LVM to do so. To get information from Red Hat on recommended swap space sizes, take a look here: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/5/html/Deployment_Guide/ch-swapspace.html.
First start by checking what the current swap space size is, by using the free command:
# free -m -t
             total       used       free     shared    buffers     cached
Mem:        129013     124325       4688          9        173      97460
-/+ buffers/cache:      26691     102322
Swap:        16383       8057       8326
Total:      145397     132382      13015
This particular system has 16 GB of swap space (look in the "total" column next to "Swap:"). Using the -m option with the free command displays the memory values in megabytes. Using the -t option will provide the totals.
You can also see that the system has used 8057 MB of its swap space, almost half of the swap space available.
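If you want to script this check, for example to alert before swap runs out, the same numbers can be read from /proc/meminfo (the output format of the echo line is my own choice, not something free produces):

```shell
# Report current swap usage in kilobytes, straight from /proc/meminfo:
total=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
free=$(awk '/^SwapFree:/ {print $2}' /proc/meminfo)
used=$((total - free))
echo "swap used: ${used} kB of ${total} kB"
```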
Then, figure out how the current swap spaces are configured:
# cat /proc/swaps
Filename        Type            Size     Used     Priority
/dev/dm-1       partition       8388604  8262740  -1
/dev/dm-8       partition       8388604  0        -2
This shows that there are two swap spaces of 8 GB each. To increase the swap space on the system, we'll add another swap space of 8 GB, so the total swap space will go up to 24 GB.
To get a view of what logical volumes exist on the system, use the dmsetup command:
# dmsetup ls
rootvg00-optlv00 (253:7)
rootvg00-tmplv00 (253:3)
rootvg00-varlv00 (253:2)
rootvg00-homelv00 (253:6)
rootvg00-rootlv00 (253:0)
rootvg00-usrlocallv00 (253:5)
rootvg00-swaplv01 (253:8)
rootvg00-usrlv00 (253:4)
rootvg00-swaplv00 (253:1)
This shows that there are two swap logical volumes, swaplv00 and swaplv01. We'll create swaplv02 as the third swap space on the system.
Another good way to see the same information, is by using the lvs command:
# lvs 2>/dev/null
  LV            VG        Attr        LSize
  homelv00      rootvg00  -wi-ao----  10.00g
  optlv00       rootvg00  -wi-ao----   8.00g
  rootlv00      rootvg00  -wi-ao----   2.00g
  swaplv00      rootvg00  -wi-ao----   8.00g
  swaplv01      rootvg00  -wi-ao----   8.00g
  tmplv00       rootvg00  -wi-ao----   5.00g
  usrlocallv00  rootvg00  -wi-ao----   1.00g
  usrlv00       rootvg00  -wi-ao----   5.00g
  varlv00       rootvg00  -wi-ao----   4.00g
This gives you the information that the logical volumes have been created in the rootvg00 volume group. We'll create the new swap space in the same volume group, using the lvcreate command:
# lvcreate -n swaplv02 -L 8G rootvg00
Logical volume "swaplv02" created
Using the -n option of the lvcreate command, you can specify the name of the logical volume. The -L option specifies the size (in this case 8G), and you end the command with the volume group name.
Next, you'll have to tell RHEL that the new logical volume is to be formatted for swap space usage:
# mkswap /dev/rootvg00/swaplv02
Setting up swapspace version 1, size = 8388604 KiB
no label, UUID=c9be43f7-c473-45ae-ba13-c1e09af2d95e
Then, you'll have to add an entry to /etc/fstab, so the system knows to re-use the swap space after a system reboot:
# grep swap /etc/fstab
/dev/mapper/rootvg00-swaplv00 swap swap defaults 0 0
/dev/mapper/rootvg00-swaplv01 swap swap defaults 0 0
/dev/mapper/rootvg00-swaplv02 swap swap defaults 0 0
Finally, activate the new swap space using the swapon command:
# swapon -v /dev/rootvg00/swaplv02
swapon on /dev/rootvg00/swaplv02
swapon: /dev/mapper/rootvg00-swaplv02: found swap signature: version 1, page-size 4, same byte order
swapon: /dev/mapper/rootvg00-swaplv02: pagesize=4096, swapsize=8589934592, devsize=8589934592
To validate that the new swap space is available on the system, use the free command again, and you may also review /proc/swaps:
# free -m -t
             total       used       free     shared    buffers     cached
Mem:        129013     121344       7669          9        175      95575
-/+ buffers/cache:      25593     103420
Swap:        24575       8109      16466
Total:      153589     129453      24136
# cat /proc/swaps
Filename        Type            Size     Used     Priority
/dev/dm-1       partition       8388604  8303856  -1
/dev/dm-8       partition       8388604  0        -2
/dev/dm-9       partition       8388604  0        -3
That's it; you're done!
The chrctcp command is not documented in AIX, but you can still use it to do nice things, especially when you are scripting. Some examples are:
To enable xntpd in /etc/rc.tcpip, and to start xntpd:
# chrctcp -S -a xntpd
To disable xntpd in /etc/rc.tcpip, and to stop xntpd:
# chrctcp -S -d xntpd
To enable xntpd in /etc/rc.tcpip, but not start xntpd:
# chrctcp -a xntpd
To disable xntpd in /etc/rc.tcpip, but to not stop xntpd:
# chrctcp -d xntpd
So, instead of manually editing /etc/rc.tcpip, you can use chrctcp to enable (uncomment) or disable (comment out) services, and start or stop them, in a single command.
As an AIX admin, you may not always know what switches a certain server is connected to. If you have Cisco switches, here's an interesting method to identify the switch your server is connected to.
First, run ifconfig to look up the interfaces that are in use:
# ifconfig -a | grep en | grep UP | cut -f1 -d:
en0
en4
en8
Okay, so on this system, you have interfaces en0, en4 and en8 active. So, if you want to determine the switch en4 is connected to, run this command:
# tcpdump -nn -v -i en4 -s 1500 -c 1 'ether[20:2] == 0x2000'
tcpdump: listening on en4, link-type 1, capture size 1500 bytes
After a while, it will display the following information:
11:40:14.176810 CDP v2, ttl: 180s, checksum: 692 (unverified)
Device-ID (0x01), length: 22 bytes: 'switch1.host.com'
Version String (0x05), length: 263 bytes:
Cisco IOS Software, Catalyst 4500 L3 Switch Software
(cat4500e-IPBASEK9-M), Version 12.2(52)XO, RELEASE SOFTWARE
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2009 by Cisco Systems, Inc.
Compiled Sun 17-May-09 18:51 by prod_rel_team
Platform (0x06), length: 16 bytes: 'cisco WS-C4506-E'
Address (0x02), length: 13 bytes: IPv4 (1) 111.22.33.44
Port-ID (0x03), length: 18 bytes: 'GigabitEthernet2/7'
Capability (0x04), length: 4 bytes: (0x00000029):
Router, L2 Switch, IGMP snooping
VTP Management Domain (0x09), length: 2 bytes: ''
Native VLAN ID (0x0a), length: 2 bytes: 970
Duplex (0x0b), length: 1 byte: full
Management Addresses (0x16), length: 13 bytes: IPv4 (1)
111.22.33.44
unknown field type (0x1a), length: 12 bytes:
0x0000: 0000 0001 0000 0000 ffff ffff
47 packets received by filter
0 packets dropped by kernel
Note here that this will only work on Cisco switches, as it uses the Cisco Discovery Protocol (CDP).
The output above tells you that en4 is connected to a network switch called 'switch1.host.com', with IP address '111.22.33.44', on port 'GigabitEthernet2/7' (most likely port 7 on blade 2 of this switch).
If you run the same command on an EtherChannel interface, keep in mind that it will only display the information of the active adapter in the EtherChannel configuration. You may have to fail over the EtherChannel to the backup adapter to determine the switch information for that adapter.
If your LPAR has virtual Ethernet adapters, this will not work (the command will just hang); run the command on the VIOS instead.
Also note that you may need to run the command a couple of times, for tcpdump to discover the necessary information.
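If you save the tcpdump output to a file, the interesting fields can be pulled out with awk. This is a sketch that runs on a saved copy of output like the above; the file name /tmp/cdp.out is hypothetical, and here it is created inline purely for illustration:

```shell
# Save two lines of sample CDP output (normally: tcpdump ... > /tmp/cdp.out):
cat > /tmp/cdp.out <<'EOF'
Device-ID (0x01), length: 22 bytes: 'switch1.host.com'
Port-ID (0x03), length: 18 bytes: 'GigabitEthernet2/7'
EOF

# Extract the switch name and port, using the single quotes as field separators:
awk -F"'" '/Device-ID/ {print "switch: " $2}
           /Port-ID/   {print "port:   " $2}' /tmp/cdp.out

rm /tmp/cdp.out
```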
Another interesting use of tcpdump is discovering what VLAN a network interface is connected to. For example, if you have two interfaces on an AIX system and want to configure them in an EtherChannel, or use one as a production interface and the other as a standby interface, it is important to know that both interfaces are in the same VLAN. Obviously, you can ask your network team to validate this, but it is good to be able to validate it on the host side as well. You could also just configure an IP address on the interface and see if it works, but on production systems that may not always be possible.
The trick is to run tcpdump on an interface and check what network traffic can be seen. For example, if you have two network interfaces, like these:
# netstat -ni | grep en[0,1]
en0 1500 link#2 0.21.5e.c0.d0.12 1426632806 0 86513680 0 0
en0 1500 10.27.18 10.27.18.64 1426632806 0 86513680 0 0
en1 1500 link#3 0.21.5e.c0.d0.13 20198022 0 7426576 0 0
en1 1500 10.27.130 10.27.130.10 20198022 0 7426576 0 0
In this case, interface en0 uses IP address 10.27.18.64, and is within the 10.27.18.x subnet. Interface en1 uses IP address 10.27.130.10, and is within the 10.27.130.x subnet (assuming both interfaces use a subnet mask of 255.255.255.0).
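The subnet membership stated above can be verified with a bit of shell arithmetic: AND each octet of the address with the matching octet of the netmask to get the network address. The addresses below are the ones from the example:

```shell
# Compute the network address for 10.27.18.64 with mask 255.255.255.0:
ip=10.27.18.64
mask=255.255.255.0
IFS=. read -r a b c d <<EOF
$ip
EOF
IFS=. read -r m1 m2 m3 m4 <<EOF
$mask
EOF
# Bitwise AND each octet pair:
echo "network: $((a & m1)).$((b & m2)).$((c & m3)).$((d & m4))"
# prints: network: 10.27.18.0
```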
Now, if en0 is a production interface, and you would like to confirm that en1, the standby interface, can be used to fail over the production interface to, then you need to know that both of the interfaces are within the same VLAN. To determine that, for en1, run tcpdump, and check if any network traffic in the 10.27.18 subnet (used by en0) can be seen (press CTRL-C after seeing any such network traffic, to cancel the tcpdump command):
# tcpdump -i en1 -qn net 10.27.18
tcpdump: verbose output suppressed,
use -v or -vv for full protocol decode
listening on en1, link-type 1, capture size 96 bytes
07:27:25.842887 ARP, Request who-has 10.27.18.136
(ff:ff:ff:ff:ff:ff) tell 10.27.18.2, length 46
07:27:25.846134 ARP, Request who-has 10.27.18.135
(ff:ff:ff:ff:ff:ff) tell 10.27.18.2, length 46
07:27:25.917068 IP 10.27.18.2.1985 > 224.0.0.2.1985: UDP, length 20
07:27:25.931376 IP 10.27.18.3.1985 > 224.0.0.2.1985: UDP, length 20
^C
24 packets received by filter
0 packets dropped by kernel
After seeing this, you know for sure that interface en1, even though it has an IP address in the 10.27.130.x subnet, sees traffic for the 10.27.18.x subnet, and thus that failing over the production IP address from en0 to en1 should work just fine.
You will encounter them from time to time: files with weird filenames, containing spaces, escape codes, or uncommon characters. These can often be very difficult to remove.
For example, files with a space at the end:
# touch "a file "
# ls
a file
It's not such a problem if you created the file yourself and you KNOW there is a space at the end. Otherwise, it can be quite difficult to remove:
# rm "a file"
rm: a file: A file or directory in the path name does not exist.
It can be even more ugly if there is a ^M in the file name:
# touch 'a^Mfile'
# ls a*
a
file
# ls file
ls: 0653-341 The file file does not exist.
And it will quickly become horrible if there are unprintable characters in file names, or a combination of all of the above. Or how about a file called "-rf *"? Would you dare run the command "rm -rf *" on your system, not knowing whether it will wipe out all files or just remove the file named "-rf *"?
So, if you have a file with an awkward filename, or simply don't know the file name because it contains unprintable characters, escape codes, spaces or tabs, how do you safely remove it?
Well, you can remove files by inode. First, discover the inode of a file:
# ls -alsi
12294 0 -rw-r--r-- 1 root system 0 May 07 15:38 a file
In the example above, the inode number is 12294. Then simply remove the file using the find command with its -exec option:
# find . -inum 12294 -ls
12294 0 -rw-r--r-- 1 root system 0 May 7 15:38 ./a file
# find . -inum 12294 -exec rm {} \;
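Here is the inode method end to end in a scratch directory, safe to try, together with two plain rm idioms for names that merely start with a dash (mktemp -d is assumed to be available on your system):

```shell
# A scratch directory, so nothing else is at risk:
demo=$(mktemp -d)

# The inode method, on a file with a trailing space in its name:
touch "${demo}/a file "
inum=$(ls -i "${demo}" | awk '{print $1}')    # first field is the inode number
find "${demo}" -inum "${inum}" -exec rm {} \;

# For names that only begin with a dash, rm alone is enough:
cd "${demo}"
touch -- '-rf'     # "--" marks the end of options, so -rf is just a filename
rm ./-rf           # a ./ prefix also stops the name looking like an option

ls -A "${demo}"    # prints nothing: both files are gone
```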
A core dump preserves the current state of a program in a file before the program is ended because of an unexpected error.
Core dumps are usually associated with programs that have encountered an unexpected, system-detected fault, such as a segmentation fault or a severe user error. An application programmer can use the core dump to diagnose and correct the problem. The core files are binary files that are specific to the AIX operating system.
To generate a core file of a running program, you can use the gencore command. Before you do so, make sure that the system is set to allow applications to generate full core files. By default this will be disabled, to avoid applications quickly filling up file systems.
# lsattr -E -l sys0 -a fullcore
fullcore false Enable full CORE dump True
# chdev -l sys0 -a fullcore=true
sys0 changed
Also check your ulimits, to ensure that the user is allowed to generate large files, and check the available space in the file system where you want to write the core file.
Next, generate the core file of a running program, for example of a process with ID 65274068. Note that the gencore command creates a core file without terminating the process.
# gencore 65274068 /tmp/core_65274068
Once the core file has been generated, be sure to set fullcore back to false:
# chdev -l sys0 -a fullcore=false
sys0 changed
# lsattr -E -l sys0 -a fullcore
fullcore false Enable full CORE dump True
Now you can use the snapcore command to gather the core file, the program, and any libraries used by the program into one pax archive, which can be sent to a vendor for further analysis. Using the -d option of the snapcore command you specify where the archive will be written.
# file core_65274068
core_65274068: AIX core file fulldump 64-bit, user
# snapcore -d /tmp core_65274068 /path/to/the/program
Core file "core_65274068" created by "user"
pass1() in progress ....
Calculating space required .
Total space required is 4605936 kbytes ..
Checking for available space ...
Available space is 33787748 kbytes
pass1 complete.
pass2() in progress ....
Collecting fileset information .
Collecting error report of CORE_DUMP errors ..
Creating readme file ..
Creating archive file ...
Compressing archive file ....
pass2 completed.
Snapcore completed successfully. Archive created in /tmp.
Check the resulting files:
# ls -l
-rw-rw-rw- 1 root system 12183573 Mar 22 08:50 core_65274068
-rw-r--r-- 1 root system 12594032 Mar 22 08:50 snapcore_663646.pax.Z
# file snapcore_663646.pax.Z
snapcore_663646.pax.Z: compressed data block compressed 16 bit
The resulting snapcore file can then be sent to Technical Support, where it can be uncompressed and extracted (tar can read pax archives).
Core files tend to be scattered all over the server, depending on what processes are running, what their working directories are, and which of them dump a core file. That is often very annoying, and you may have to resort to the find command to locate all the core files and clean them up.
There is a way to create a centralized repository for your core files, using some lesser-known user settings.
First, create a location where you can store core files, for example, create a file system /corefiles with plenty of space:
# df -g /corefiles
Filesystem GB blocks Free %Used Iused %Iused Mounted on
/dev/corefilelv 19.92 8.56 58% 94 1% /corefiles
Now, change the default core settings to point to this location:
# chsec -f /etc/security/user -s default -a core_path=on -a core_pathname=/corefiles -a core_compress=on -a core_naming=on
The command above changes the default settings for the core files - for all users. You can obviously do the same for individual users instead, just change "default" to whatever user you want to set this for.
The four options in the chsec command above are:
core_compress - Enables or disables core file compression. Valid values are On and Off; the default is Off. Enabling compression helps you save disk space.
core_path - Enables or disables the core file path specification. Valid values are On and Off. When set to On, core files are placed in the directory specified by core_pathname; otherwise, core files are placed in the user's current working directory. The default is Off. You'll need to set this if you wish to store core files in a specific directory.
core_pathname - Specifies the location for core files when core_path is set to On. If this is not set while core_path is On, core files are placed in the user's current working directory. This attribute is limited to 256 characters. This is where you specify the directory to store core files.
core_naming - Selects the core file naming strategy. Valid values are On and Off. A value of On names core files core.pid.time (the same as what the CORE_NAMING environment variable does), where pid is the process ID and time is ddhhmmss (dd = day of the month, hh = hours, mm = minutes, ss = seconds). A value of Off uses the default name of core. You can leave out this option and instead set the environment variable CORE_NAMING to true.
After restarting any applications (or the whole server), your core files should all be stored in /corefiles, assuming any processes generate core files at all, of course.
Note: The same can be achieved with the chcore command:
# chcore -c on -p on -l /corefiles -n on -d
Validate the settings as follows:
# grep -p default /etc/security/user | grep core
core_compress = on
core_path = on
core_naming = on
core_pathname = /corefiles