UNIX Health Check - System Admin

Topics: AIX, Monitoring, Networking, Red Hat / Linux, Security, System Admin

Determining type of system remotely

If you run into a system that you can't access, but that is available on the network, and you have no idea what type of system it is, then there are a few tricks you can use to determine the type of system remotely.

The first one is to look at the TTL (Time To Live) when pinging the system's IP address. For example, a ping to an AIX system may look like this:

# ping 10.11.12.82
PING 10.11.12.82 (10.11.12.82) 56(84) bytes of data.
64 bytes from 10.11.12.82 (10.11.12.82): icmp_seq=1 ttl=253 time=0.394 ms
...

TTL (Time To Live) is a counter in the header of every IP packet. Each router that forwards the packet decreases the TTL by one, and the packet is discarded when the TTL reaches zero. The initial TTL value differs between operating systems, so you can often determine the OS from the TTL value in a ping reply. Lists of operating systems and their default TTL values are available online. Basically, a UNIX/Linux system has a TTL of 64, Windows uses 128, and AIX/Solaris uses 254.

Now, in the example above, you can see "ttl=253". It's still an AIX system, but there's most likely a router in between, decreasing the TTL by one.
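The TTL reasoning above can be sketched as a small shell function. This is only a rough illustration, assuming the initial TTL values mentioned in the text (64, 128 and 254). The function name guess_os_from_ttl is made up for this example, and a system can always be configured with a non-default TTL, so treat the result as a hint.

```shell
#!/bin/sh
# guess_os_from_ttl is a hypothetical helper, not a standard command.
# It assumes the initial TTL values named in the text: 64, 128 and 254.
guess_os_from_ttl() {
    ttl=$1
    if [ "$ttl" -le 64 ]; then
        initial=64;  os="UNIX/Linux"
    elif [ "$ttl" -le 128 ]; then
        initial=128; os="Windows"
    elif [ "$ttl" -le 254 ]; then
        initial=254; os="AIX/Solaris"
    else
        echo "unknown (ttl=$ttl)"
        return 1
    fi
    # Each router on the path lowers the TTL by one.
    echo "$os, about $((initial - ttl)) hop(s) away"
}

guess_os_from_ttl 253
```

For the ping output shown earlier (ttl=253), this prints "AIX/Solaris, about 1 hop(s) away".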

Another good method is by using nmap. The nmap utility has a -O option that allows for OS detection:

# nmap -O -v 10.11.12.82 | grep OS
Initiating OS detection (try #1) against 10.11.12.82 (10.11.12.82)
OS details: IBM AIX 5.3
OS detection performed.

Okay, so it isn't a perfect method either. We ran the nmap command above against an AIX 7.1 system, and it came back as AIX 5.3 instead. And sometimes, you'll have to run nmap a couple of times, before it successfully discovers the OS type. But still, we now know it's an AIX system behind that IP.

Another option is to query SNMP information. If the device is SNMP enabled (it is running an SNMP daemon and it allows you to query SNMP information), then you may be able to run a command like this:

# snmpinfo -h 10.11.12.82 -m get -v sysDescr.0
sysDescr.0 = "IBM PowerPC CHRP Computer
Machine Type: 0x0800004c Processor id: 0000962CG400
Base Operating System Runtime AIX version: 06.01.0008.0015
TCP/IP Client Support  version: 06.01.0008.0015"

By the way, the SNMP example above is exactly why UNIX Health Check generally recommends disabling SNMP, or at least disallowing it from providing such system information by updating the /etc/snmpdv3.conf file appropriately, because this information can be really useful to hackers. On the other hand, your organization may use monitoring that relies on SNMP, in which case it needs to be enabled. But then you still have the opportunity to change the SNMP community name to something else (the default is "public"), which also limits the remote information gathering possibilities.

Topics: AIX, System Admin

Resolving IBM.DRM software errors

If you see several SRC_RSTRT errors in the error report regarding IBM.DRM or IBM.AuditRM, using identifiers CB4A951F or BA431EB7, and detecting module "srchevn.c", then you probably have a system that was cloned from another system in the past, and the RSCT software is still using the keys of the original system.

The solution is this:

# /usr/sbin/rsct/bin/rmcctrl -z 
# /usr/sbin/rsct/bin/rmcctrl -d 
# /usr/sbin/rsct/install/bin/recfgct -s 
# /usr/sbin/rsct/bin/rmcctrl -A 
# /usr/sbin/rsct/bin/rmcctrl -p

This will generate new keys, and will solve the errors in the error report. Just to make sure, reboot your system, and they should no longer show up in the error report after the reboot.

Topics: Red Hat / Linux, System Admin

RHSM: Too many content sets for certificate

This describes how to fix the subscription-manager error "Too many content sets for certificate Red Hat Enterprise Linux Server" by temporarily registering with RHN, updating the packages, and then reverting back to Red Hat Subscription Management.

Step 1: Clean up the subscription-manager if needed:

# subscription-manager unsubscribe --all
# subscription-manager unregister
# subscription-manager clean

Step 2: Register to Red Hat Network (RHN) using rhn_register:

# rhn_register

Note: You will need your RH login and password to complete the wizard.

Step 3: Validate RHN registration of the system:

# yum repolist

Note: Look at Loaded plugins in the output and "rhnplugin" should be listed.

Step 4: Update subscription-manager* and python-rhsm* packages:

# yum list updates subscription-manager* python-rhsm*

Note: The output may vary depending on your system and installed packages.

Example output below:

Updated Packages
python-rhsm.x86_64 1.12.5-2.el6 rhel-x86_64-server-6
subscription-manager.x86_64 1.12.14-9.el6_6 rhel-x86_64-server-6
subscription-manager-firstboot.x86_64 1.12.14-9.el6_6 rhel-x86_64-server-6
subscription-manager-gnome.x86_64 0.99.19.4-1.el6_3 rhel-x86_64-server-6

# yum update subscription-manager* python-rhsm*

Note: Answer the questions when prompted. Validate the updates were applied successfully by examining the output.

Step 5: Unregister from RHN in preparation to register with subscription-manager:

In the online Red Hat Portal, login.
Access Subscription Management.
Access RHN Classic Management -> All Registered Systems.
Click on System Entitlements (you need to see check boxes next to systems).
Select the check box next to the system you are working on.
Click the "Unentitle" button at bottom middle of page.
Validate the entitlement has been removed for the system.
Perform the below command on the system's CLI:
# rm /etc/sysconfig/rhn/systemid

Step 6: Register system with subscription-manager:

Note: Validate that no subscriptions are showing active.

# subscription-manager list --available

Note: A message similar to below should be displayed.

This system is not yet registered. Try 'subscription-manager register --help' for more information.

# subscription-manager register --username=xxxxxx --password='xxxxxx'

Note: You will need your Red Hat Portal username and password for the account the system will be registered under. Make note of the ID that the system is registered under when this command returns.

Validate that the subscription-manager plugin is loaded:

# yum repolist

Look at Loaded plugins in the output where "subscription-manager" should be listed.

Validate that subscriptions are showing available now:

# subscription-manager list --available

Validate that the Subscription Name, SKU, Contract, Account and Pool ID are showing up correctly. Make note of the "Pool ID", which is required to subscribe in the next task.

Subscribe the system using one of the pools above:

# subscription-manager subscribe --pool='[POOL_ID_Number]'

Note: Where "[POOL_ID_Number]" should be obtained from the preceding task.

Make sure a message stating "Successfully attached a subscription for" the system is shown.

Step 7: Validate that the system is now consuming a subscription:

# subscription-manager list --consumed

Validate the Subscription Name, SKU, Contract, Account and Pool ID are correct.

# subscription-manager list

Note: The Status should show "Subscribed".

Step 8: Validate in Red Hat Portal that the new system shows up as well.

In Red Hat Portal:

In the online Red Hat Portal, login.
Access Subscription Management.
Access Red Hat Subscription Management -> Subscriber Inventory -> Click on Systems.
Examine the Systems inventory to validate the new system is now visible and shows a subscription attached.

Topics: AIX, Red Hat / Linux, Security, System Admin

System-wide separated shell history files for each user and session

Here's how you can set up your /etc/profile in order to create a separate shell history file for each user and each login session. This is very useful when you need to know who exactly ran a specific command at a point in time. For Red Hat Linux, put the updates in either /etc/profile or /etc/bashrc.

Put this in /etc/profile on all servers:

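# HISTFILE
# execute only if interactive
if [ -t 0 -a "${SHELL}" != "/bin/bsh" ]
then
    # Time and date of the login (HHMM.mmddyy).
    d=`date "+%H%M.%m%d%y"`
    # Terminal name without the /dev/ prefix (for example pts/4).
    t=`tty | cut -c6-`
    # Account that was originally used to log in.
    u=`who am i | awk '{print $1}'`
    # Host the session comes from, with the parentheses stripped.
    w=`who -ms | awk '{print $NF}' | sed "s/(//g" | sed "s/)//g"`
    # Terminal name with slashes replaced by dashes (for example pts-4).
    y=`tty | cut -c6- | sed "s/\//-/g"`
    mkdir "$HOME/.history.$USER" 2>/dev/null
    export HISTFILE="$HOME/.history.$USER/.sh_history.$USER.$u.$w.$y.$d"
    # Remove history files older than 3 months.
    find "$HOME"/.history.$USER/.s* -type f -ctime +91 -exec rm {} \; 2>/dev/null

    # Set the prompt: user@(host) directory, ending in # for root and $ otherwise.
    H=`uname -n | cut -f1 -d'.'`
    mywhoami=`whoami`
    if [ "${mywhoami}" = "root" ] ; then
        PS1='${USER}@(${H}) ${PWD##/*/} # '
    else
        PS1='${USER}@(${H}) ${PWD##/*/} $ '
    fi
fi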

# Time out after 60 minutes
# Use readonly if you don't want users to be able to change it.
# readonly TMOUT=3600
TMOUT=3600
export TMOUT

When using ksh, put this in /etc/environment, to turn on time stamped history files:

# Added for extended shell history
EXTENDED_HISTORY=ON

When using bash, put this in /etc/bashrc, to enable time-stamped output when running the "history" command:

HISTTIMEFORMAT='%F %T '; export HISTTIMEFORMAT

This way, *every* user on the system will have a separate shell history in the .history directory of their home directory. Each shell history file name shows you which account was switched to, which account was used to log in, from which host the session originated, on which tty this happened, and at what time and date this happened.
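To see how the pieces of the file name are derived, the transformations from the /etc/profile snippet can be tried on sample values. This is a small sketch only: the tty path and the host field below are assumed sample inputs, standing in for the output of tty and who -ms on a real session.

```shell
#!/bin/sh
# Sample values, assumed for illustration only.
tty_path="/dev/pts/4"          # what `tty` might print
who_host="(10.190.41.116)"     # last field of `who -ms`

# Same transformations as in the /etc/profile snippet.
y=$(echo "$tty_path" | cut -c6- | sed "s/\//-/g")
w=$(echo "$who_host" | sed "s/(//g" | sed "s/)//g")

# Account switched to, login account, host, tty, time and date.
echo ".sh_history.root.user.$w.$y.1706.120210"
```

With these inputs the tty becomes "pts-4" and the host becomes "10.190.41.116", so the resulting name reads: root, logged in as "user", from 10.190.41.116, on pts/4, at 17:06 on 12/02/10.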

Shell history files are also time-stamped internally. For AIX, you can run "fc -t" to show the shell history time-stamped. For Red Hat, you can run: "history". Old shell history files are cleaned up after 3 months, because of the find command in the example above. Plus, user accounts will log out automatically after 60 minutes (3600 seconds) of inactivity, by setting the TMOUT variable to 3600. You can avoid running into a time-out by simply typing "read" or "\" followed by ENTER on the command line, or by adding "TMOUT=0" to a user's .profile, which essentially disables the time-out for that particular user.

One issue that you may now run into on AIX is that, because a separate history file is created for each login session, it becomes difficult to use "fc -t": the fc command only lists the commands from the current session, and not those written to a different history file. To overcome this issue, you can set the HISTFILE variable to the file you want to run "fc -t" for:

# export HISTFILE=.sh_history.root.user.10.190.41.116.pts-4.1706.120210

Then, to list all the commands for this history file, make sure you start a new shell and run the "fc -t" command:

# ksh "fc -t -10"

This will list the last 10 commands for that history file.

Topics: Red Hat / Linux, System Admin

Install GNOME GUI on RHEL 7 Linux Server

If you have performed a RHEL 7 Linux Server installation and did not include the Graphical User Interface (GUI), you can add it later directly from the command line, using the yum command and selecting an appropriate installation group. To list all available installation groups on Red Hat 7 Linux use:

# yum group list

From the above list, select the "Server with GUI" installation group:

# yum groupinstall 'Server with GUI'

Because GNOME is the default desktop environment on a RHEL 7 Linux system, the above command will install GNOME. Alternatively, you can run the below command to only install the core GNOME packages:

# yum groupinstall 'X Window System' 'GNOME'

Once the installation is finished, you need to change the system's default runlevel to runlevel 5. On RHEL 7 this is done with the systemctl command, by making graphical.target (the equivalent of runlevel 5) the default target instead of multi-user.target (the equivalent of runlevel 3):

# systemctl enable graphical.target --force

Depending on your previous installations, you may need to accept the Red Hat license after you reboot your system. Once the system has booted, you can check the GNOME version using:

# gnome-shell --version

Source: http://linuxconfig.org/install-gnome-gui-on-rhel-7-linux-server.

Topics: Red Hat / Linux, System Admin

How to create Local Repositories in RHEL

This is a short procedure that tells you how to set up a local repository (repo) for use by the yum command, so that you can install packages from it onto your system. In this procedure, we assume you have the RHEL installation DVD inserted into your virtual or physical drive.

Mount the drive:

# mkdir /cdrom
# mount /dev/cdrom /cdrom

Then create the repo file in /etc/yum.repos.d, called local.repo:

# cd /etc/yum.repos.d
# vi local.repo
[local]
name=Local Repo
baseurl=file:///cdrom
enabled=1
gpgcheck=0
protect=1

From now on you can use this local repository to install software, such as wireshark:

# yum install wireshark

Topics: AIX, System Admin

Cron jobs running late or not at all

If any of your cron jobs are running late, or not at all, check the cron log (/var/adm/cron/log) to see if there are any errors or other messages around the time the jobs should run.

If you see messages like this:

! c queue max run limit reached Fri Sep 20 13:15:00 2013
! rescheduling a cron job Fri Sep 20 13:15:00 2013

The reason the jobs are not running is that there are too many simultaneous jobs at the time the daemon tries to run a new job.

The cron daemon has a limit of how many jobs it will run simultaneously. By default it is 100 jobs. If a new job is scheduled to run and the limit has already been reached the job will be rescheduled at a later time (the default is 60 seconds later). Both the number of jobs and wait time are configured in the file /var/adm/cron/queuedefs.

If it is unusual for cron to be running so many jobs, you can check the process table to view the jobs cron has created. These jobs will have parent process id (PPID) of the cron daemon.

# ps -ef | grep cron | grep -v grep
  root  2097204   1   0   Dec 02  -  0:33 /usr/sbin/cron

# ps -T 2097204
      PID    TTY  TIME CMD
  2097204      -  0:33 cron
 17760598      -  0:00     \--ksh
 18153488      -  0:16         \--find

In the example above the cron daemon has 1 child job, which is a shell, and that shell (possibly running a script) is running the "find" command. This would count as 1 direct descendent job from cron.

If you find many of the same job stuck there may be a problem with the script or command being run. The command or script should be checked from a shell prompt to see if it completes successfully.

If the large number of jobs are naturally occurring as a result of increased workload on the system, you may need to change the values in the queuedefs file and increase them from their defaults.

To do this, add an entry to the bottom of the queuedefs file using an editor such as vi. The entry should have the form:

c.50j20n60w

Where:
c = The "c" or cron queue
Nj = The maximum number of jobs to be run simultaneously by cron.
Nn = The "nice" value of the jobs to be run (default is 2).
Nw = The time a job has to wait until the next attempt to run it.

For example:

c.200j2n60w

This example would set the cron queue to a maximum of 200 jobs, with a nice value of 2, and a wait time of 60 seconds.
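The structure of a queuedefs entry can be made explicit by splitting it into its fields. The following is a minimal sketch: parse_queuedef is a hypothetical helper written for this illustration (it is not part of AIX), and it assumes an entry of the form q.NjNnNw, such as c.200j2n60w.

```shell
#!/bin/sh
# parse_queuedef is a hypothetical helper for illustration only.
# It splits an entry of the form q.NjNnNw into its fields.
parse_queuedef() {
    echo "$1" | sed -n 's/^\([a-z]\)\.\([0-9]*\)j\([0-9]*\)n\([0-9]*\)w$/queue=\1 jobs=\2 nice=\3 wait=\4/p'
}

parse_queuedef "c.200j2n60w"
```

For the entry above, this prints "queue=c jobs=200 nice=2 wait=60". An entry that does not match the expected form produces no output, which makes typing mistakes easy to spot before saving the file.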

It is not necessary to restart cron after modifying the queuedefs file; cron's event loop checks it automatically.

Source: IBM Technote http://www-01.ibm.com/support/docview.wss?uid=isg3T1020382

Topics: AIX, Monitoring, System Admin

NMON recordings

One can set up NMON recordings from smit via:

# smitty topas -> Start New Recording -> Start local recording -> nmon

However, the smit panel doesn't list the option needed to get disk IO service times. Specifically, the -d option to collect disk IO service and wait times. Thus, it's better to use the command line with the nmon command to collect and report these statistics. Here's one set of options for collecting the data:

# nmon -AdfKLMNOPVY^ -w 4 -s 300 -c 288 -m /var/adm/nmon

The key options here include:

-d Collect and report IO service time and wait time statistics.
-f Specifies that the output is in spreadsheet format. By default, the command takes 288 snapshots of system data with an interval of 300 seconds between each snapshot. The name of the output file is in the format of hostname_YYMMDD_HHMM.nmon.
-O Includes the Shared Ethernet adapter (SEA) VIOS sections in the recording file.
-V Includes the disk volume group section.
-^ Includes the FC adapter section (which also measures NPIV traffic on VIOS FC adapters).
-s Specifies the interval in seconds between 2 consecutive recording snapshots.
-c Specifies the number of snapshots that must be taken by the command.
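As a quick check on the -s and -c values used above, the total recording time is simply the interval multiplied by the number of snapshots. The variable names below are only for illustration:

```shell
#!/bin/sh
interval=300   # -s: seconds between snapshots
count=288      # -c: number of snapshots

total=$((interval * count))
echo "$total seconds = $((total / 3600)) hours"
```

This prints "86400 seconds = 24 hours". If you change the interval, adjust the count accordingly to keep covering a full day.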

Running nmon using this command will ensure it runs for a full day. And it is therefore useful to start nmon daily using a crontab entry in the root crontab file. For example, using the following script:

# cat /usr/local/collect_nmon.ksh
#!/bin/ksh

LOGDIR="/var/adm/nmon"
PARAMS="-fTNAdKLMOPVY^ -w 4 -s 300 -c 288 -m $LOGDIR"

# LOGRET determines the number of days to retain nmon logs.
LOGRET=365

# Create the nmon folder.
if [ ! -d "$LOGDIR" ] ; then
        mkdir -p "$LOGDIR"
fi

# Compress previous daily log.
# The pattern is quoted so the shell doesn't expand it before find sees it.
find "$LOGDIR" -name '*.nmon' -type f -mtime +1 -exec gzip '{}' \;

# Clean up old logs.
find "$LOGDIR" -name '*nmon.gz' -type f -mtime +$LOGRET -exec rm '{}' \;

# Start nmon.
/usr/bin/nmon $PARAMS

Then add the following crontab entry to the root crontab file:

0 0 * * * /usr/local/collect_nmon.ksh >/tmp/collect_nmon.ksh.log 2>&1

To process the recordings with the NMON Analyser tool (a spreadsheet tool that runs on PCs and generates performance graphs and other output), it's recommended to keep the number of intervals below 300.

Topics: AIX, Security, System Admin

Avoid using env_reset in sudoers file

By default, when using sudo, the env_reset sudo option is enabled.

From the sudoers manual, about the env_reset sudo option:

This causes commands to be executed with a new, minimal environment. On AIX the environment is initialized with the contents of the /etc/environment file. The new environment contains the TERM, PATH, HOME, MAIL, SHELL, LOGNAME, USER, USERNAME and SUDO_* variables in addition to variables from the invoking process permitted by the env_check and env_keep options. This is effectively a whitelist for environment variables.

If, however, the env_reset option is disabled, any variables not explicitly denied by the env_check and env_delete options are inherited from the invoking process. In this case, env_check and env_delete behave like a blacklist. Since it is not possible to blacklist all potentially dangerous environment variables, use of the default env_reset behavior is encouraged.

In all cases, environment variables with a value beginning with () are removed as they could be interpreted as bash functions. The list of environment variables that sudo allows or denies is contained in the output of "sudo -V" when run as root.

So, what does this all mean? Well, it means that you should not use env_reset in the /etc/sudoers file.

First of all, if you would use:

Defaults env_reset

Then that would do you no good, because the default is already to reset the environment variables.

If you would use (notice the exclamation mark before env_reset):

Defaults !env_reset

Then it means you don't reset any environment variables from the invoking process, for ALL users. That is a security risk, as sudo will preserve variables such as PATH or LD_LIBRARY_PATH, and these variables can be configured with values such as "." or "/home/username", or they can be utilized by malicious software.

With the default env_reset, all sudo sessions run with a minimal set of environment variables, including those set in /etc/environment and any others specified in the sudoers file (using the env_keep option). This gives more controlled sudo access, without bypassing sudo security restrictions.

Okay, so what if you need to run a command through sudo that requires a certain environment variable? A good example is the tcpdump command. When running tcpdump via sudo, you may encounter the following error message:

$ sudo tcpdump -i en12
tcpdump: bpf_load: genmajor failed: A file or directory in the path name does not exist.

In this case, tcpdump is known to require the ODMDIR environment variable to be set. One way is to use "Defaults !env_reset" in /etc/sudoers, but the sudoers manual above explains that this is discouraged. Another method is to allow only specific users in /etc/sudoers, by disabling env_reset, such as:

User_Alias           UTCPDUMP = tim, john
Defaults:UTCPDUMP    !env_reset

But this still allows specific users to "play" with all environment variables. So unless you trust these users very much, an even better way is to use the env_keep sudo option, to specify the environment variables that need not be reset (that is, if you know the correct environment variables that are required). In the case of the tcpdump command, we will want to retain the ODMDIR environment variable:

Defaults env_keep += ODMDIR

With the above line in /etc/sudoers, you will notice that running the tcpdump command via sudo will now work properly.
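The difference between a fully reset environment and one where a single variable is kept can be illustrated without sudo, by using env -i to start a command with an empty environment. This is only an analogy and not how sudo is implemented; the ODMDIR value /etc/objrepos is a sample value assumed for the example.

```shell
#!/bin/sh
ODMDIR=/etc/objrepos   # sample value, assumed for this illustration
export ODMDIR

# Environment wiped completely: ODMDIR is gone.
reset_env=$(env -i /bin/sh -c 'echo "ODMDIR=${ODMDIR:-unset}"')

# Environment wiped, but ODMDIR passed through explicitly (like env_keep).
kept_env=$(env -i ODMDIR="$ODMDIR" /bin/sh -c 'echo "ODMDIR=${ODMDIR:-unset}"')

echo "$reset_env"
echo "$kept_env"
```

The first line of output is "ODMDIR=unset" and the second is "ODMDIR=/etc/objrepos": only the variable that is explicitly passed through survives, and everything else from the invoking environment is dropped.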

So, the bottom line is: Don't use env_reset at all in /etc/sudoers. If really necessary, disable env_reset (using !env_reset) for specific users only, or even better, specify the required environment variables using env_keep.

Of course, the UNIX Health Check software will check if env_reset is used in /etc/sudoers, and if so, warn about this potential security risk.

Topics: AIX, Security, System Admin

Difference between sticky bit and SUID/GUID

This is probably one of the things that people mess up all the time. They both have to do with permissions on a file, but the SUID/GUID (or SETUID, short for set-user-id, and SETGID, short for set-group-id) bit and the sticky bit are 2 completely different things.

The SUID/GUID

The letters rwxXst select file mode bits for users:

read (r)
write (w)
execute (or search for directories) (x)
execute/search only if the file is a directory or already has execute permission for some user (X)
set user or group ID on execution (s)
restricted deletion flag or sticky bit (t)

The position that the x bit takes in rwxrwxrwx for the user triplet (1st group of rwx) and the group triplet (2nd group of rwx) can take an additional state where the x becomes an s. When such a file is executed (if it's a program and not just a shell script), it will run with the permissions of the owner or the group of the file. That is called the SUID, when set for the user triplet, and GUID, when set for the group triplet.

So if the file is owned by root and the SUID bit is turned on, the program will run as root, even if you are the one executing it. The same thing applies to the GUID bit. You can set or clear the bits with symbolic modes like u+s and g-s. With GNU chmod, on directories you can set (but not clear) the bits with a numeric mode.

SUID/GUID examples

No SUID/GUID: Just the bits rwxr-xr-x are set:

# ls -lt test.pl
-rwxr-xr-x 1 root root 179 Jan  9 01:01 test.pl

SUID and user's executable bit enabled (lowercase s): The bits rwsr-xr-x are set.

# chmod u+s test.pl
# ls -lt test.pl
-rwsr-xr-x 1 root root 179 Jan  9 01:01 test.pl

SUID enabled and executable bit disabled (uppercase S): The bits rwSr-xr-x are set.

# chmod u-x test.pl
# ls -lt test.pl 
-rwSr-xr-x 1 root root 179 Jan  9 01:01 test.pl

GUID and group's executable bit enabled (lowercase s): The bits rwxr-sr-x are set.

# chmod g+s test.pl
# ls -lt test.pl 
-rwxr-sr-x 1 root root 179 Jan  9 01:01 test.pl

GUID enabled and executable bit disabled (uppercase S): The bits rwxr-Sr-x are set.

# chmod g-x test.pl
# ls -lt test.pl 
-rwxr-Sr-x 1 root root 179 Jan  9 01:01 test.pl
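The lowercase and uppercase s can be reproduced safely on a scratch file. The sketch below assumes the mktemp command is available; it only looks at the first 10 characters of the ls -l output, which hold the file type and the mode bits.

```shell
#!/bin/sh
# Work on a scratch file so nothing real is changed.
f=$(mktemp)

chmod 0755 "$f"
before=$(ls -l "$f" | cut -c1-10)

chmod u+s "$f"      # SUID on, user execute bit still on
suid_x=$(ls -l "$f" | cut -c1-10)

chmod u-x "$f"      # SUID on, user execute bit off
suid_nox=$(ls -l "$f" | cut -c1-10)

echo "$before $suid_x $suid_nox"
rm -f "$f"
```

This prints "-rwxr-xr-x -rwsr-xr-x -rwSr-xr-x", matching the SUID examples above. The same steps with g+s and g-x show the equivalent change in the group position.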

The sticky bit

The sticky bit on the other hand is denoted as a t, such as with the /tmp or /var/tmp directories:

# ls -ald /tmp
drwxrwxrwt 36 bin bin 8192 Nov 27 08:40 /tmp
# ls -ald /var/tmp
drwxrwxrwt  3 bin bin  256 Nov 27 08:28 /var/tmp

This bit should always have been called the "restricted deletion bit", given that's what it really denotes. When this mode bit is enabled on a directory, users can only delete files and directories within it that they own. For regular files, the bit was used to keep the program on the swap device so that it would load more quickly when run; this is where the name sticky bit comes from, but it's not used anymore in AIX.
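Setting the restricted deletion bit on a directory can be tried on a scratch directory. This is a small sketch that assumes mktemp -d is available; the numeric mode 1777 is the same mode that /tmp has in the listing above.

```shell
#!/bin/sh
# Create a scratch directory and give it the same mode as /tmp.
d=$(mktemp -d)
chmod 1777 "$d"

# The first 10 characters of ls -ld hold the type and mode bits.
mode=$(ls -ld "$d" | cut -c1-10)
echo "$mode"

rmdir "$d"
```

This prints "drwxrwxrwt", with the t in the last position showing that the bit is set.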

More information can be found in the manual page of the chmod command or on http://en.wikipedia.org/wiki/Sticky_bit.
