Analog, the free tool for web log analysis, provides considerable insight into web traffic in both tabular and graphic form, and is one of the most popular tools for evaluating the health and success of web sites today. Analog is not just free, however. It's also extremely fast, very easy to set up and use, likely to compile on any operating system you try it on, and able to produce reports in more than thirty languages. This is an astounding set of advantages, and one that very few commercial packages come close to matching.
For those of us who use Analog routinely to understand how well our web sites are doing, a little automation goes a long way. To prepare monthly reports, for example, you will want to rotate your log files, keeping each month's log data separate. Whether you retain old log files or only the Analog reports, you can set up your reports in a way that lends itself to month-by-month comparisons.
If you use Analog and would like to keep many months' worth of reports online, however, you need to be sure that you don't overwrite one month's reports with the next. So, in this week's column, we're going to look at a simple script that runs Analog against any one of a series of log files and uniquely names the resulting HTML and PNG files so that each month's reports are kept separate from the rest.
The runAnalog script was written to expect log files to include the month and year in their names and to be compressed. For example, the log file for last month would be named Oct2006.gz. At any point in time, the logs directory might contain numerous compressed monthly files and the current access and error files, as shown here:
Apr2006.gz Feb2006.gz Jun2005.gz Nov2005.gz Sep2006.gz
Aug2005.gz Jan2006.gz Jun2006.gz Oct2005.gz access_log
Aug2006.gz Jul2005.gz Mar2006.gz Oct2006.gz error_log
Dec2005.gz Jul2006.gz May2006.gz Sep2005.gz
The script uses bash's select command to create an instant menu of the log files available for analysis, then uses the selected log file's name to generate names for the report and its associated image files.
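To see the menu step in isolation, here is a minimal sketch of select at work. The choose_log function and its directory argument are hypothetical names used only for illustration; they are not part of runAnalog. Because select writes its menu and prompt to standard error, the function's standard output carries only the chosen file name.

```shell
#!/bin/bash
# choose_log DIR: show a numbered menu of the files in DIR and print the
# name the user picks. (choose_log is a hypothetical helper for illustration.)
choose_log() {
    local dir=$1 logfile
    local PS3="Log file number: "      # prompt that select displays
    select logfile in $(ls "$dir")
    do
        if [ -n "$logfile" ]; then
            echo "$logfile"            # valid choice: print it and stop
            break
        fi
        # select leaves the variable empty when the reply is not a menu number
        echo "not a valid selection" >&2
    done
}
```

A caller could capture the choice with something like logfile=$(choose_log /opt/apache/logs), assuming the logs live in that directory.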
The script then attempts to uncompress the selected log file, sending errors to /dev/null in case the file is not compressed in the first place.
The script renames each of the generated graphics files with the same month-year naming convention that the log files use. More precisely, it will duplicate any naming convention that you use, since it creates the new file names from the base name of the selected log file (its name minus the .gz extension). If your log files are named like those shown above, for example, you will end up with image files with names like orgOct2006.png, sizeOct2006.png and so on.
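The naming rule can be tried on its own before running anything against real logs. The sketch below uses a hypothetical report_names function (not part of runAnalog) that prints the report and image names a given log file name would produce. It strips the .gz suffix with shell parameter expansion rather than sed, and it does not include the script's special handling of access_log.

```shell
#!/bin/bash
# report_names LOGFILE: print the report name and the image names that the
# month-year naming rule yields for LOGFILE. (Hypothetical helper.)
report_names() {
    local base=${1%.gz}        # drop a trailing .gz, if there is one
    local image
    echo "$base.html"
    for image in code dir org req size type
    do
        echo "${image}${base}.png"
    done
}
```

Running report_names Oct2006.gz lists Oct2006.html first, followed by one PNG name for each of the six image types.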
#!/bin/bash
#
# runAnalog

# paths
APAHOME=/opt/apache
APADOCS=$APAHOME/htdocs
APAPICS=$APAHOME/htdocs/images

cd "$APAHOME" || exit 1

echo "Select the log file to process."
select logfile in $(ls logs)
do
    if [ -z "$logfile" ]; then
        echo "not a valid selection"
    else
        # uncompress file, but don't balk if it is not in compressed form
        gunzip "logs/$logfile" 2> /dev/null
        # LOG is the uncompressed file's name; MOYR labels the reports
        LOG=$(echo "$logfile" | sed 's/\.gz$//')
        MOYR=$LOG
        # run Analog
        ./analog "logs/$LOG" 2> /dev/null
        if [ "$LOG" == "access_log" ]; then    # current file
            MOYR=$(date +%b%Y)
        fi
        # rename image files, adding month and year
        for image in code dir org req size type
        do
            if [ -f "$image.png" ]; then
                mv "$image.png" "$APAPICS/${image}${MOYR}.png"
                perl -p -i -e "s,$image\.png,images/${image}${MOYR}.png," report.html
            fi
        done
        mv report.html "$APADOCS/$MOYR.html"
        break
    fi
done

# re-compress the logfile, leaving the live access and error logs alone
case "$LOG" in
    ""|access_log|error_log) ;;
    *) gzip "logs/$LOG" ;;
esac
If you use logrotate or a similar tool to separate your log data each month, you can easily create a series of monthly Analog reports. You could also take a step further toward automation by running a script similar to runAnalog that accepts an argument for the log file and runs at the end of each month through cron.
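As a rough sketch of that idea, the function below takes the log file name as an argument instead of presenting a menu. The name run_analog_for is hypothetical, and the sketch assumes the same layout as runAnalog: logs, htdocs, htdocs/images and the analog binary all sit under APAHOME (defaulting to /opt/apache), and analog writes report.html and its PNG files into the current directory. It uses sed with a temporary file where runAnalog uses perl, and it re-compresses only files that arrived compressed.

```shell
#!/bin/bash
# run_analog_for LOGFILE: hypothetical non-interactive variant of runAnalog.
# The body runs in a subshell, so the cd does not affect the caller.
run_analog_for() (
    logfile=$1
    apahome=${APAHOME:-/opt/apache}
    log=${logfile%.gz}          # name of the uncompressed log
    moyr=$log                   # label used in report and image names

    cd "$apahome" || exit 1
    if [ ! -f "logs/$logfile" ]; then
        echo "run_analog_for: logs/$logfile not found" >&2
        exit 1
    fi

    gunzip "logs/$logfile" 2> /dev/null    # harmless if not compressed
    ./analog "logs/$log" 2> /dev/null

    status=1
    if [ -f report.html ]; then
        if [ "$log" = "access_log" ]; then
            moyr=$(date +%b%Y)             # current file: use today's month
        fi
        for image in code dir org req size type
        do
            if [ -f "$image.png" ]; then
                mv "$image.png" "htdocs/images/${image}${moyr}.png"
                sed "s,$image\.png,images/${image}${moyr}.png," report.html > report.tmp &&
                    mv report.tmp report.html
            fi
        done
        mv report.html "htdocs/$moyr.html" && status=0
    fi

    # re-compress only files that were compressed to begin with
    case "$logfile" in
        *.gz) gzip "logs/$log" ;;
    esac
    exit $status
)
```

A wrapper script that ends with run_analog_for "$1" could then be scheduled through cron after each month's log has been rotated, with the rotated file's name passed as the argument.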