Unix Tip: Terminating unattended processes

March 26, 2008, 03:59 PM —  ITworld.com — 

A reader recently asked how he could most easily terminate processes that were left running after his users had logged off a system. The processes in question were apparently consuming resources needed by other services and were not contributing to any particular project. What he was looking for, however, was a solution that would not require him to detect the processes and manually terminate them, even if commands such as "pkill -u username" might simplify the job.

The possibility of using some form of .logout file crossed my mind. However, .logout files don't appear to be universally effective and, by nature, belong to the individual users. Only a tool that is not subject to the discretion of the users in question would be effective.

The first step in identifying a good solution to this problem was selecting the processes to be terminated. If a user logged in to the system again, we would not want to terminate his current login. Processes that continue running after a user has logged out lose their assigned terminal. That is, they are listed as running on TTY "?" like jdoe's process in the listing below.

     UID   PID  PPID  C    STIME TTY      TIME CMD
    root     0     0  0   Jun 27 ?        0:00 sched
    jdoe 15540     1  0 16:30:17 ?        0:00 /bin/bash /home/jdoe/job

Numerous critical system processes, as also illustrated in this output, are not associated with particular TTYs. The scheduler, a critical system process, is just one of many processes that must be running for a system to function properly. If you run a ps command to count the number of such processes, you will see something like this:

# ps -ef -o tty | grep "?" | wc -l
      60

Clearly, we need to be very selective about the processes we kill.

So, let's say we want to terminate all of jdoe's processes that are not associated with a current login session. We want something that uses the logic "select PID where UID=jdoe and TTY=?".

boson# ps -ef -o user,pid,tty | grep jdoe
jdoe 15540 ?
jdoe 16421 ?
jdoe 14437 ?
jdoe 15790 pts/1
jdoe 14439 pts/1

If we further select the lines containing "?" characters and then narrow down the output to the middle column, we're almost there.

boson# ps -ef -o user,pid,tty | grep jdoe | grep "?" | awk '{print $2}'
15540
16421
14437

The only problem with this approach is that there's a small risk of including some other user's processes in the mix if the second username includes the first username. For example, mjdoe's unattended processes would also be selected, because grep matches "jdoe" anywhere in the line.
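One way to avoid the substring problem without a temporary file is to let awk compare whole fields instead of letting grep match anywhere in the line. The following is only a sketch: the function name select_unattended and the mjdoe sample line are made up for illustration, and it assumes an awk that accepts the -v option (on some older systems that means nawk rather than awk).

```shell
#!/bin/sh
# select_unattended (hypothetical name): print the PIDs owned by exactly
# the user named in $1 whose TTY column is "?".
# Expects "user pid tty" lines on standard input.
select_unattended() {
    awk -v user="$1" '$1 == user && $3 == "?" { print $2 }'
}

# Sample data modeled on the listing above, plus a made-up mjdoe line
# to show that a username containing "jdoe" is not selected.
printf '%s\n' \
    'jdoe 15540 ?' \
    'mjdoe 16001 ?' \
    'jdoe 15790 pts/1' |
    select_unattended jdoe
```

In real use, the input would come from a command such as "ps -e -o user,pid,tty" piped into the function. The header line is skipped naturally, since "USER" never equals the requested username.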

Another approach is to use a different selection process as I did in the script I eventually sent to the reader. This script, included below, tosses information on all of the user's processes into a temporary file and then parses each line, looking for an exact match on the username and TTY. In addition, it adds a line to a log file showing the date/time, username and command that was terminated.

#!/bin/ksh
#
# killprox: kill unattended processes by username

# ask for username if none was given on the command line
if [ $# != 1 ]; then
    echo "username> \c"
    read username
else
    username=$1
fi

# gather info on user procs
ps -ef | grep "$username" > /tmp/procs$$

# kills procs where TTY="?" (i.e., login session was closed)
while read line
do
    # split the ps line into fields; x collects the fields we ignore
    echo "$line" | read U P x x x T cmd
    # skip lines where the username is not an exact match
    [ "$U" = "$username" ] || continue
    if [ "$T" = "?" ]; then
        # report to the terminal only when run interactively
        if [ -t 1 ]; then
            echo killing $P
        fi
        kill $P
        echo `date` "$username $cmd" >> /var/log/killprox.log
    fi
done < /tmp/procs$$

rm /tmp/procs$$

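The first line within the while loop breaks each line in the "ps -ef" output into a series of fields. Those fields that are not of interest are assigned to "x". U is assigned the username, P the process ID, T the tty and cmd the remainder of the line (the accumulated CPU time followed by the command and any arguments). Note that this split assumes the STIME field is a single word. For a process started on an earlier day, ps may print a date such as "Jun 27", which shifts the remaining fields by one.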

We move to the next line of ps output (i.e., continue) if the username doesn't match. If the tty is "?", we kill the process. If the process is run interactively, however, we first tell the person running it what we are doing. We then make an entry in the killprox.log file.
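Before letting a script like this kill anything, it can be useful to run the same selection logic in a "dry run" form. The sketch below is an assumption-laden variation, not the script sent to the reader: the function name list_unattended is hypothetical, the second sample line is invented, it reads the fields directly in the while statement so that it also works in shells other than ksh, and it uses one extra "x" to skip the TIME column. Like the original, it assumes STIME is a single word.

```shell
#!/bin/sh
# list_unattended (hypothetical name): dry-run form of the selection loop.
# Reads "ps -ef"-style lines on standard input and prints what would be
# killed for the user named in $1, without sending any signal.
list_unattended() {
    while read -r U P x x x T x cmd
    do
        [ "$U" = "$1" ] || continue     # exact username match only
        [ "$T" = "?" ] || continue      # only processes with no terminal
        echo "would kill ${P}: $cmd"
    done
}

# Sample input: the first line is from the listing earlier in the article,
# the second is an invented login shell that must be left alone.
printf '%s\n' \
    'jdoe 15540 1 0 16:30:17 ? 0:00 /bin/bash /home/jdoe/job' \
    'jdoe 15790 15788 0 16:41:02 pts/1 0:00 -bash' |
    list_unattended jdoe
```

Feeding it live data would look like "ps -ef | list_unattended jdoe". Only once the output looks right would the echo be swapped for a kill.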

The temporary file, created early in the script to capture a list of the user's processes, is removed at the end.

A possible improvement to this script would be to examine the return code from the kill command to verify that the process was actually killed.
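A minimal sketch of that check follows. The function name kill_and_report is hypothetical, and the background sleep exists only so that the example has something harmless to signal. Keep in mind that a zero return code from kill means the signal was sent successfully; a process that traps or ignores the signal may still be running, so a later "kill -0" test on the same PID could be added if stronger confirmation is needed.

```shell
#!/bin/sh
# kill_and_report (hypothetical name): send the default TERM signal to the
# PID in $1 and report whether the kill command itself succeeded.
kill_and_report() {
    if kill "$1" 2>/dev/null; then
        echo "signalled $1"
    else
        echo "could not signal $1" >&2
        return 1
    fi
}

# Harmless demonstration against a throwaway background job.
sleep 60 &
kill_and_report $!
```

In killprox, the log entry could then be written only when the function returns zero, with failures recorded separately.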

If you want to terminate unattended processes for any or all users, you could call the killprox script from another script that creates a list of currently active users (whether logged in or not). Note that we carefully avoid running the script against a series of system users such as root, daemon and nobody since we don't want to inadvertently terminate system processes, Apache daemons and the like.

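#!/bin/ksh

# run killprox for every user that owns a process, except system accounts
for U in `ps -ef | awk '{print $1}' | sort | uniq`
do
    case $U in
        # "UID" is the ps header line; the rest are system accounts
        UID|daemon|nobody|root|smmsp) continue;;
        *) killprox $U;;
    esac
done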
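Since the exclusion list is the only thing standing between this wrapper and system processes, it may be worth previewing which accounts it would act on. The sketch below factors the same case statement into a function, here called target_users (a hypothetical name), that reads usernames on standard input and prints only those that would be handed to killprox. The sample names are illustrative.

```shell
#!/bin/sh
# target_users (hypothetical name): read usernames on standard input,
# remove duplicates and drop the ps header plus the system accounts
# excluded by the wrapper script above.
target_users() {
    sort -u | while read -r U
    do
        case $U in
            UID|daemon|nobody|root|smmsp) continue ;;
            *) echo "$U" ;;
        esac
    done
}

# Sample input standing in for the first column of "ps -ef" output.
printf '%s\n' UID root jdoe daemon jdoe mjdoe | target_users
```

With live data, the equivalent preview would be "ps -ef | awk '{print $1}' | target_users". Any account in the output that should never be touched belongs in the exclusion list.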

Now that we've looked at ways we can terminate unattended processes without accidentally killing processes that we need or that belong to current sessions, a word of caution is in order. Any legitimate user's processes should be considered valid use of system resources unless you have very good reason to conclude otherwise. Always exercise good judgment when you wield the power of root over other people's computer use.
