If any of your cron jobs are running late, or not at all, check the cron log (/var/adm/cron/log) to see if there are any errors or other messages around the time the jobs should run.
If you see messages like this:
! c queue max run limit reached Fri Sep 20 13:15:00 2013
! rescheduling a cron job Fri Sep 20 13:15:00 2013
The reason the jobs are not running is that there are too many simultaneous jobs at the time the daemon tries to run a new job.
The cron daemon has a limit of how many jobs it will run simultaneously. By default it is 100 jobs. If a new job is scheduled to run and the limit has already been reached the job will be rescheduled at a later time (the default is 60 seconds later). Both the number of jobs and wait time are configured in the file /var/adm/cron/queuedefs.
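To see how often the limit is being hit, the log messages can be counted. The following is a minimal sketch, assuming the message format shown above; the function name count_limit_hits is made up for illustration and it reads the log text from standard input:

```shell
# count_limit_hits: count "max run limit reached" messages for one queue.
# Reads cron log text on standard input; $1 is the queue letter (default "c").
count_limit_hits() {
    queue=${1:-c}
    awk -v q="$queue" '
        index($0, "! " q " queue max run limit reached") == 1 { n++ }
        END { print n + 0 }'
}
```

It could be used as: count_limit_hits c < /var/adm/cron/log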
If it is unusual for cron to be running so many jobs, you can check the process table to view the jobs cron has created. These jobs will have parent process id (PPID) of the cron daemon.
# ps -ef | grep cron | grep -v grep
root 2097204 1 0 Dec 02 - 0:33 /usr/sbin/cron
# ps -T 2097204
PID TTY TIME CMD
2097204 - 0:33 cron
17760598 - 0:00 \--ksh
18153488 - 0:16 \--find
In the example above the cron daemon has 1 child job, which is a shell, and that shell (possibly running a script) is running the "find" command. This would count as 1 direct descendant job from cron.
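When there are many children, counting them by hand gets tedious. A small sketch, assuming the "ps -ef" column layout shown above (PPID in the third column); count_children is a hypothetical helper name:

```shell
# count_children: count processes whose PPID (third column of "ps -ef") is $1.
# Reads "ps -ef" output on standard input and skips the header line.
count_children() {
    awk -v ppid="$1" 'NR > 1 && $3 == ppid { n++ } END { print n + 0 }'
}
```

For the cron daemon in the example this would be run as: ps -ef | count_children 2097204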
If you find many of the same job stuck there may be a problem with the script or command being run. The command or script should be checked from a shell prompt to see if it completes successfully.
If the large number of jobs is a natural result of increased workload on the system, you may need to raise the values in the queuedefs file above their defaults.
To do this, add an entry to the bottom of the queuedefs file using an editor such as vi. The entry should have the form:
c.50j20n60w
Where:
c = The "c" or cron queue
Nj = The maximum number of jobs to be run simultaneously by cron.
Nn = The "nice" value of the jobs to be run (default is 2).
Nw = The time a job has to wait until the next attempt to run it.
For example:
c.200j2n60w
This example would set the cron queue to a maximum of 200 jobs, with a nice value of 2, and a wait time of 60 seconds.
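To double-check an entry before adding it, the fields can be pulled apart with a small helper. This is only a sketch: it assumes the dotted form shown earlier (c.50j20n60w) with all three fields present and in j, n, w order, and parse_queue_entry is a made-up name:

```shell
# parse_queue_entry: split a queuedefs entry such as "c.50j20n60w" into fields.
# Prints nothing if the entry does not match the expected form.
parse_queue_entry() {
    printf '%s\n' "$1" | sed -n \
        's/^\([a-z]\)\.\([0-9][0-9]*\)j\([0-9][0-9]*\)n\([0-9][0-9]*\)w$/queue=\1 jobs=\2 nice=\3 wait=\4/p'
}
```

Running parse_queue_entry c.50j20n60w prints: queue=c jobs=50 nice=20 wait=60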
It is not necessary to restart cron after modifying the queuedefs file; cron's event loop checks the file automatically.
Source: IBM Technote
http://www-01.ibm.com/support/docview.wss?uid=isg3T1020382
One can set up NMON recordings from smit via:
# smitty topas -> Start New Recording -> Start local recording -> nmon
However, the smit panel doesn't list the -d option, which is needed to collect disk IO service and wait times. Thus, it's better to use the nmon command on the command line to collect and report these statistics. Here's one set of options for collecting the data:
# nmon -AdfKLMNOPVY^ -w 4 -s 300 -c 288 -m /var/adm/nmon
The key options here include:
- -d Collect and report IO service time and wait time statistics.
- -f Specifies that the output is in spreadsheet format. By default, the command takes 288 snapshots of system data with an interval of 300 seconds between each snapshot. The name of the output file is in the format of hostname_YYMMDD_HHMM.nmon.
- -O Includes the Shared Ethernet adapter (SEA) VIOS sections in the recording file.
- -V Includes the disk volume group section.
- -^ Includes the FC adapter section (which also measures NPIV traffic on VIOS FC adapters).
- -s Specifies the interval in seconds between 2 consecutive recording snapshots.
- -c Specifies the number of snapshots that must be taken by the command.
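The -s and -c values together determine how long a recording runs. A quick way to check the arithmetic, sketched here with two made-up helper functions:

```shell
# recording_seconds: total recording length for interval $1 (-s) and count $2 (-c).
recording_seconds() {
    echo $(( $1 * $2 ))
}

# snapshots_per_day: number of snapshots (-c) that cover 24 hours at interval $1 (-s).
snapshots_per_day() {
    echo $(( 86400 / $1 ))
}
```

With the values used above, recording_seconds 300 288 prints 86400, which is exactly 24 hours.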
With these values (288 snapshots at 300-second intervals), nmon runs for a full day, so it is useful to start nmon daily using a crontab entry in the root crontab file. For example, using the following script:
# cat /usr/local/collect_nmon.ksh
#!/bin/ksh
LOGDIR="/var/adm/nmon"
PARAMS="-fTNAdKLMOPVY^ -w 4 -s 300 -c 288 -m $LOGDIR"
# LOGRET determines the number of days to retain nmon logs.
LOGRET=365
# Create the nmon folder.
if [ ! -d "$LOGDIR" ] ; then
mkdir -p "$LOGDIR"
fi
# Compress previous daily log.
find "$LOGDIR" -name "*.nmon" -type f -mtime +1 -exec gzip '{}' \;
# Clean up old logs.
find "$LOGDIR" -name "*nmon.gz" -type f -mtime +$LOGRET -exec rm '{}' \;
# Start nmon.
/usr/bin/nmon $PARAMS
Then add the following crontab entry to the root crontab file:
0 0 * * * /usr/local/collect_nmon.ksh >/tmp/collect_nmon.ksh.log 2>&1
To process the recordings with the NMON Analyser tool (a spreadsheet tool that runs on PCs and generates performance graphs and other output), it's recommended to keep the number of intervals below 300.
By default, when using sudo, the env_reset sudo option is enabled.
From the sudoers manual, about the env_reset sudo option:
This causes commands to be executed with a new, minimal environment. On AIX the environment is initialized with the contents of the /etc/environment file. The new environment contains the TERM, PATH, HOME, MAIL, SHELL, LOGNAME, USER, USERNAME and SUDO_* variables in addition to variables from the invoking process permitted by the env_check and env_keep options. This is effectively a whitelist for environment variables.
If, however, the env_reset option is disabled, any variables not explicitly denied by the env_check and env_delete options are inherited from the invoking process. In this case, env_check and env_delete behave like a blacklist. Since it is not possible to blacklist all potentially dangerous environment variables, use of the default env_reset behavior is encouraged.
In all cases, environment variables with a value beginning with () are removed as they could be interpreted as bash functions. The list of environment variables that sudo allows or denies is contained in the output of "sudo -V" when run as root.
So, what does this all mean? Well, it means that you should not use env_reset in the /etc/sudoers file.
First of all, if you would use:
Defaults env_reset
Then that would do you no good, because the default is already to reset the environment variables.
If you would use (notice the exclamation mark before env_reset):
Defaults !env_reset
Then it means you don't reset any environment variables from the invoking process, for ALL users. That is a security risk, as sudo will preserve variables such as PATH or LD_LIBRARY_PATH, and these variables can be configured with values such as "." or "/home/username", or they can be utilized by malicious software.
With the default env_reset all sudo sessions will invoke a shell with minimum shell variables, including those set in /etc/profile and some others if specified in sudoers file (using the env_keep option). So this will make a more controlled sudo access without bypassing sudo security restrictions.
Okay, so what if you need to run a command through sudo that requires a certain environment variable? A good example is the tcpdump command. When running tcpdump via sudo, you may encounter the following error message:
$ sudo tcpdump -i en12
tcpdump: bpf_load: genmajor failed: A file or directory in the path name does not exist.
In this case, tcpdump is known to require the ODMDIR environment variable to be set. One way is to use "Defaults !env_reset" in /etc/sudoers, but the sudoers manual above explains that this is discouraged. Another method is to disable env_reset in /etc/sudoers for specific users only, such as:
User_Alias UTCPDUMP = tim, john
Defaults:UTCPDUMP !env_reset
But this still allows specific users to "play" with all environment variables. So unless you trust these users very much, an even better way is to use the env_keep sudo option, to specify the environment variables that need not be reset (that is, if you know the correct environment variables that are required). In the case of the tcpdump command, we will want to retain the ODMDIR environment variable:
Defaults env_keep += ODMDIR
With the above line in /etc/sudoers, you will notice that running the tcpdump command via sudo will now work properly.
So, the bottom line is: Don't use env_reset at all in /etc/sudoers. If really necessary, disable env_reset for only specific users, or even better, specify the required environment variables using env_keep.
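A quick manual check for this can be sketched as a filter over the sudoers text. This is an illustration only: find_env_reset_off is a made-up name, it reads the file content from standard input, it skips lines starting with "#", and it does not follow included files:

```shell
# find_env_reset_off: print sudoers lines that disable env_reset.
# Reads sudoers text on standard input; lines starting with "#" are skipped.
find_env_reset_off() {
    grep -v '^[[:space:]]*#' |
        grep -E '^[[:space:]]*Defaults.*![[:space:]]*env_reset'
}
```

Run as root, for example: find_env_reset_off < /etc/sudoers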
Of course, the UNIX Health Check software will check if env_reset is used in /etc/sudoers, and if so, warn about this potential security risk.
This is probably one of the things that people mess up all the time. They both have to do with permissions on a file, but the SUID/GUID (or SETUID short for set-user-id/SETGID short for set-group-id) bit and the sticky-bit are 2 completely different things.
The SUID/GUID
The letters rwxXst select file mode bits for users:
- read (r)
- write (w)
- execute (or search for directories) (x)
- execute/search only if the file is a directory or already has execute permission for some user (X)
- set user or group ID on execution (s)
- restricted deletion flag or sticky bit (t)
The position that the x bit takes in rwxrwxrwx for the user triplet (1st group of rwx) and the group triplet (2nd group of rwx) can take an additional state where the x becomes an s. When such a file is executed (if it's a program and not just a shell script), it will run with the permissions of the owner or the group of the file. That is called the SUID, when set for the user triplet, and GUID, when set for the group triplet.
So if the file is owned by root and the SUID bit is turned on, the program will run as root, even if you execute it as a regular user. The same thing applies to the GUID bit. You can set or clear the bits with symbolic modes like u+s and g-s, and you can set (but not clear) the bits with a numeric mode.
SUID/GUID examples
No SUID/GUID: Just the bits rwxr-xr-x are set:
# ls -lt test.pl
-rwxr-xr-x 1 root root 179 Jan 9 01:01 test.pl
SUID and user's executable bit enabled (lowercase s): The bits rwsr-xr-x are set.
# chmod u+s test.pl
# ls -lt test.pl
-rwsr-xr-x 1 root root 179 Jan 9 01:01 test.pl
SUID enabled and executable bit disabled (uppercase S): The bits rwSr-xr-x are set.
# chmod u-x test.pl
# ls -lt test.pl
-rwSr-xr-x 1 root root 179 Jan 9 01:01 test.pl
GUID and group's executable bit enabled (lowercase s): The bits rwxr-sr-x are set.
# chmod g+s test.pl
# ls -lt test.pl
-rwxr-sr-x 1 root root 179 Jan 9 01:01 test.pl
GUID enabled and executable bit disabled (uppercase S): The bits rwxr-Sr-x are set.
# chmod g-x test.pl
# ls -lt test.pl
-rwxr-Sr-x 1 root root 179 Jan 9 01:01 test.pl
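The same SUID states can be produced with numeric modes, where a leading 4 sets the SUID bit. A small sketch on a scratch file, assuming the mktemp command is available; mode_of and demo_suid_modes are hypothetical names:

```shell
# mode_of: print the 10-character mode string that "ls -l" shows for $1.
mode_of() {
    ls -ld "$1" | cut -c1-10
}

# demo_suid_modes: show the lowercase s and uppercase S states on a scratch file.
demo_suid_modes() {
    f=$(mktemp) || return 1
    chmod 4755 "$f"; mode_of "$f"    # SUID plus owner execute: lowercase s
    chmod 4655 "$f"; mode_of "$f"    # SUID without owner execute: uppercase S
    rm -f "$f"
}
```

Calling demo_suid_modes prints -rwsr-xr-x followed by -rwSr-xr-x.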
The sticky bit
The sticky bit on the other hand is denoted as a t, such as with the /tmp or /var/tmp directories:
# ls -ald /tmp
drwxrwxrwt 36 bin bin 8192 Nov 27 08:40 /tmp
# ls -ald /var/tmp
drwxrwxrwt 3 bin bin 256 Nov 27 08:28 /var/tmp
This bit should have always been called the "restricted deletion bit" given that's what it really denotes. When this mode bit is enabled on a directory, users can only delete files and directories within it that they own.
For regular files, the bit was once used to keep the program on the swap device so that it would load more quickly when run; that is where the name sticky bit comes from, but it's not used anymore in AIX.
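To test for the bit in a script, the last character of the mode string can be inspected. A minimal sketch; has_sticky_bit is a made-up helper name:

```shell
# has_sticky_bit: succeed if $1 shows t or T in the last position of its mode string.
has_sticky_bit() {
    case $(ls -ld "$1" | cut -c10) in
        t|T) return 0 ;;
        *)   return 1 ;;
    esac
}
```

For example: has_sticky_bit /tmp && echo "/tmp has the restricted deletion bit set"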
More information can be found in the manual page of the chmod command or on
http://en.wikipedia.org/wiki/Sticky_bit.
When you set up a new user account and assign a password to it, you'll want to make sure that the password cannot be easily guessed. Setting the initial password to something easy like "changeme" only gives hackers easy access to your system.
So the best way you can do this, is by generating a fully random password. That can easily be achieved by using the /dev/urandom device.
Here's an easy command to generate a random password:
# dd if=/dev/urandom bs=16 count=1 2>/dev/null | openssl base64 | sed "s/[=O/\]//g" | cut -b1-8
This will create passwords like:
ej9yTaaD
Ux9FYusx
QR0TSAZC
...
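If openssl is not installed, a similar result can be had with dd and tr alone. This variant is a sketch and not the command above: gen_password is a made-up name, it keeps only letters and digits (leaving out the capital O, as the sed expression above does), and it assumes the requested length stays well below 200 characters:

```shell
# gen_password: print a random password of $1 characters (default 8),
# drawn from letters and digits with the capital "O" left out.
gen_password() {
    len=${1:-8}
    dd if=/dev/urandom bs=1024 count=1 2>/dev/null |
        LC_ALL=C tr -dc 'A-NP-Za-z0-9' |
        cut -c1-"$len"
}
```

For example, gen_password 12 prints a 12-character password.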
Security guidelines nowadays can be annoying. Within many companies people have to comply with strict security rules with regard to password expiration settings, password complexity and system security settings. More often than not, all these settings and regulations result in people getting locked out of their accounts on AIX systems, and getting frustrated at the same time.
To help your users, you can't go change default security settings on the AIX systems. Your auditor will make sure you won't do that. But instead, there are some "tricks" you can use, to ensure that a user account is (and stays) available to your end user. We've put all those tricks together in one simple script, that can fix a user account, and we called it fixuser.ksh. It will fix 99% of all user related login issues.
You can run this script as often as you like and for any user that you like. It will help you to ensure that a user account is not locked, that AIX won't bug the user to change their password, that the user doesn't have a failed login count (from typing too many wrong passwords), and a bunch of other stuff that usually will keep your users from logging in and getting pesky "Access Denied" messages.
The script will not alter any default security settings, and it can easily be adjusted to run for several user accounts, or can be run from a crontab so user accounts stay enabled for your users. The script is a win-win situation for everyone: your auditor is happy, because security settings are strict on your system; your users are happy, because they can just log in without any hassle; and the sys admin is happy for not having to resolve login issues manually anymore.
The script can be run by entering a specific user account:
# fixuser.ksh username
The script:
#!/usr/bin/ksh
fixit()
{
myid=${1}
# Unlock account
printf "Unlocking account for ${user}..."
chuser account_locked=false ${user}
echo " Done."
# Reset failed login count
printf "Reset failed login count for ${user}..."
chuser unsuccessful_login_count=0 ${user}
echo " Done."
# Reset expiration date
printf "Reset expiration date for ${user}..."
chuser expires=0 ${user}
echo " Done."
# Allow the user to login
printf "Enable login for ${user}..."
chuser login=true ${user}
echo " Done."
# Allow the user to login remotely
printf "Enable remote login for ${user}..."
chuser rlogin=true ${user}
echo " Done."
# Reset maxage
printf "Reset the maxage for ${user}..."
m=`lssec -f /etc/security/user -s default -a maxage | cut -f2 -d=`
chuser maxage=${m} ${user}
echo " Done."
# Clear password change requirement
printf "Clear password change requirement for ${user}..."
pwdadm -c ${user}
echo " Done."
# Reset password last update
printf "Reset the password last update for ${user}..."
let sinceepoch=`perl -e 'printf(time)' | awk '{print $1}'`
n=`lssec -f /etc/security/user -s default -a minage | cut -f2 -d=`
let myminsecs="${n}*7*24*60*60"
let myminsecs="${myminsecs}+1000"
let newdate="${sinceepoch}-${myminsecs}"
chsec -f /etc/security/passwd -s ${user} -a lastupdate=${newdate}
echo " Done."
}
unset user
if [ ! -z "${1}" ] ; then
user=${1}
fi
# If a username is provided, fix that user account
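unset myid
# Only look up the account when a username was given; a bare "id" would
# report the current user instead.
if [ ! -z "${user}" ] ; then
myid=`id ${user} 2>/dev/null`
fi
if [ ! -z "${myid}" ] ; then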
echo "Fixing account ${user}..."
fixit ${user}
printf "Remove password history..."
cp /dev/null /etc/security/pwdhist.pag 2>/dev/null
cp /dev/null /etc/security/pwdhist.dir 2>/dev/null
echo " Done."
else
echo "User ${user} does not exist."
fi
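The least obvious step in the script is the lastupdate calculation: it appears to backdate the last password change to just before the minage period (which is expressed in weeks), with 1000 seconds of margin, presumably so the minimum-age rule does not block the next password change. The arithmetic can be isolated as follows; backdated_lastupdate is a hypothetical name used only for this sketch:

```shell
# backdated_lastupdate: $1 = current time in seconds since the epoch,
# $2 = minage in weeks. Prints a lastupdate value just older than minage.
backdated_lastupdate() {
    now=$1
    minage_weeks=$2
    echo $(( now - (minage_weeks * 7 * 24 * 60 * 60 + 1000) ))
}
```

With a minage of 1 week, backdated_lastupdate 1700000000 1 prints 1699394200, which is 605800 seconds (one week plus 1000 seconds) earlier.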
Sometimes when password rules are very strict, a user may have problems creating a new password that is both easy to remember, and still adheres to the password rules. To aid the user, it could be useful to clear the password history for his or her account, so he or she can re-use a certain password that has been used in the past. The password history is stored in /etc/security/pwdhist.pag and /etc/security/pwdhist.dir. The command you can use to disable the password history for a user is:
# chuser histsize=0 username
Actually, this command does not clear the password history in /etc/security/pwdhist.dir and /etc/security/pwdhist.pag, but only changes the setting of histsize for the account to zero, meaning that the user is no longer checked for re-using old passwords. After the user has changed his or her password, you may want to set it back again to the default value:
# grep -p ^default /etc/security/user | grep histsize
histsize = 20
# chuser histsize=20 username
In older AIX levels, this functionality (to use chuser histsize=0) would actually have cleared out the password history of the user. In later AIX levels, this functionality has vanished.
So, if you truly wish to delete the password history for a user, here's another way to clear the password history on a system: zero out the pwdhist.pag and pwdhist.dir files. However, this results in the deletion of all password history for all users on the system:
# cp /dev/null /etc/security/pwdhist.pag
# cp /dev/null /etc/security/pwdhist.dir
Please note that this is a temporary measure. Once these files are zeroed out, as soon as a user changes his or her password again, the old password is stored again in these files and it can't be reused (unless the histsize attribute for a user is set to 0).
With the default NMON settings and PuTTY on a Windows system, you may notice that the boxes and lines in NMON are not displayed correctly.
An easy fix for this issue is to change the character set translation within PuTTY. In the upper left corner of your PuTTY window, click the icon and select "Change Settings". Then navigate to Window -> Translation. In the "Remote character set" field, change "UTF-8" to "ISO-8859-1".
Once changed, restart PuTTY, and the boxes and lines should be displayed correctly.
Another option is to stop using boxes and lines altogether. You can do this by starting nmon with the -B option:
# nmon -B
Or you can set the NMON environment variable to the same:
# export NMON=B
# nmon
Here's a procedure for adding swap space to a running RHEL system.
This procedure assumes you will want to add 8 Gigabytes of swap space, and we will be using LVM to do so. To get information from Red Hat on recommended swap space sizes, take a look here: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/5/html/Deployment_Guide/ch-swapspace.html.
First start by checking what the current swap space size is, by using the free command:
# free -m -t
total used free shared buffers cached
Mem: 129013 124325 4688 9 173 97460
-/+ buffers/cache: 26691 102322
Swap: 16383 8057 8326
Total: 145397 132382 13015
This particular system has 16 GB of swap space (look in the "total" column next to "Swap:"). Using the -m option with the free command displays the memory values in megabytes. Using the -t option will provide the totals.
You can also see that the system has used 8057 MB of its swap space, almost half of the swap space available.
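The "almost half" figure can be computed directly from the free output. A small sketch, assuming the "Swap:" line layout shown above (total in the second column, used in the third); swap_used_percent is a made-up name:

```shell
# swap_used_percent: read "free -m" output on standard input and print the
# percentage of swap in use, rounded down.
swap_used_percent() {
    awk '$1 == "Swap:" && $2 > 0 { printf "%d\n", $3 * 100 / $2 }'
}
```

For the system above, free -m -t | swap_used_percent would print 49.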
Then, figure out how the current swap spaces are configured now:
# cat /proc/swaps
Filename Type Size Used Priority
/dev/dm-1 partition 8388604 8262740 -1
/dev/dm-8 partition 8388604 0 -2
This shows that there are 2 paging spaces of 8 GB each. To increase the swap space on the system, we'll add another swap space of 8 GB, so the total swap space will go up to 24 GB.
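The total can also be added up from /proc/swaps itself, where the Size column is in KB. A minimal sketch; swap_total_kb is a hypothetical helper that reads the file content from standard input:

```shell
# swap_total_kb: read /proc/swaps content on standard input and print the
# sum of the Size column (in KB), skipping the header line.
swap_total_kb() {
    awk 'NR > 1 { total += $3 } END { printf "%d\n", total }'
}
```

With the two swap spaces listed above, swap_total_kb < /proc/swaps prints 16777208, which is just under 16 GB.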
To get a view of what logical volumes exist on the system, use the dmsetup command:
# dmsetup ls
rootvg00-optlv00 (253:7)
rootvg00-tmplv00 (253:3)
rootvg00-varlv00 (253:2)
rootvg00-homelv00 (253:6)
rootvg00-rootlv00 (253:0)
rootvg00-usrlocallv00 (253:5)
rootvg00-swaplv01 (253:8)
rootvg00-usrlv00 (253:4)
rootvg00-swaplv00 (253:1)
This shows that there are 2 swap logical volumes, swaplv00 and swaplv01. We'll create swaplv02 as the third swap space on the system.
Another good way to see the same information, is by using the lvs command:
# lvs 2>/dev/null
LV VG Attr LSize
homelv00 rootvg00 -wi-ao---- 10.00g
optlv00 rootvg00 -wi-ao---- 8.00g
rootlv00 rootvg00 -wi-ao---- 2.00g
swaplv00 rootvg00 -wi-ao---- 8.00g
swaplv01 rootvg00 -wi-ao---- 8.00g
tmplv00 rootvg00 -wi-ao---- 5.00g
usrlocallv00 rootvg00 -wi-ao---- 1.00g
usrlv00 rootvg00 -wi-ao---- 5.00g
varlv00 rootvg00 -wi-ao---- 4.00g
This gives you the information that the logical volumes have been created in the rootvg00 volume group. We'll create the new swap space in the same volume group, using the lvcreate command:
# lvcreate -n swaplv02 -L 8G rootvg00
Logical volume "swaplv02" created
Using the -n option of the lvcreate command, you can specify the name of the logical volume. The -L option specifies the size (in this case 8G), and you end the command with the volume group name.
Next, you'll have to tell RHEL that the new logical volume is to be formatted for swap space usage:
# mkswap /dev/rootvg00/swaplv02
Setting up swapspace version 1, size = 8388604 KiB
no label, UUID=c9be43f7-c473-45ae-ba13-c1e09af2d95e
Then, you'll have to add an entry to /etc/fstab, so the system knows to re-use the swap space after a system reboot:
# grep swap /etc/fstab
/dev/mapper/rootvg00-swaplv00 swap swap defaults 0 0
/dev/mapper/rootvg00-swaplv01 swap swap defaults 0 0
/dev/mapper/rootvg00-swaplv02 swap swap defaults 0 0
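When scripting this step, it helps to generate the line and to check that it is not already present before appending it. The two helpers below are a sketch with made-up names; they assume the /dev/mapper naming seen above and volume group and logical volume names without hyphens (device-mapper doubles hyphens that occur inside a name):

```shell
# fstab_swap_line: print an /etc/fstab swap entry for volume group $1 and
# logical volume $2.
fstab_swap_line() {
    printf '/dev/mapper/%s-%s swap swap defaults 0 0\n' "$1" "$2"
}

# has_fstab_entry: succeed if the fstab text on standard input already lists
# the device for volume group $1 and logical volume $2.
has_fstab_entry() {
    awk -v dev="/dev/mapper/$1-$2" '$1 == dev { found = 1 } END { exit !found }'
}
```

A possible use: has_fstab_entry rootvg00 swaplv02 < /etc/fstab || fstab_swap_line rootvg00 swaplv02 >> /etc/fstab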
Finally, activate the new swap space using the swapon command:
# swapon -v /dev/rootvg00/swaplv02
swapon on /dev/rootvg00/swaplv02
swapon: /dev/mapper/rootvg00-swaplv02: found swap signature: version 1, page-size 4, same byte order
swapon: /dev/mapper/rootvg00-swaplv02: pagesize=4096, swapsize=8589934592, devsize=8589934592
To validate that the new swap space is available on the system, use the free command again, and you may also review /proc/swaps:
# free -m -t
total used free shared buffers cached
Mem: 129013 121344 7669 9 175 95575
-/+ buffers/cache: 25593 103420
Swap: 24575 8109 16466
Total: 153589 129453 24136
# cat /proc/swaps
Filename Type Size Used Priority
/dev/dm-1 partition 8388604 8303856 -1
/dev/dm-8 partition 8388604 0 -2
/dev/dm-9 partition 8388604 0 -3
That's it; you're done!