[plug] Bash scripting suggestions required :}

Phillip Twiss phillip.twiss at det.wa.edu.au
Wed Sep 10 13:20:13 WST 2008


G'Day All

	I've got a quick and probably really dumb question that someone may be able to help me with.

Background

	We have a whole series of httpd log files that we need to feed to webalizer. They all live in the same directory, but each has its site name at the front of the log file name, followed by an underscore ( e.g. www.det.wa.edu.au_access.log, www.det.wa.edu.au_access.log.1 and so on ).

	I have developed a quick and dirty bash script that finds the unique server names ( e.g. www.det.wa.edu.au ), then processes that list one site at a time, getting all of that site's historical files and processing them in the correct order.

My Question  -  

	The initial loop that gets all the unique instances is very slow, and I am sure there are much simpler/faster/more effective ways of achieving it.  The performance problem seems to come from searching the growing list for previous instances: every file name is compared against every site already found.

SITELIST=`ls -t1r $BASELOG/revproxy|grep access|cut -f1 -d"_"`
WORKLIST=""
for TESTSITE in $SITELIST; do
    EXISTFLAG=0
    for WORKSITE in $WORKLIST; do
        if [ $WORKSITE = $TESTSITE ]
        then
            EXISTFLAG=1
        fi
    done
    if [ $EXISTFLAG = 0 ]
    then
        WORKLIST=$WORKLIST" "$TESTSITE
    fi
done
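
For comparison, one possible way to avoid the inner loop is to let sort -u do the de-duplication in a single pass. This is only a sketch under a few assumptions: unique_sites is a hypothetical helper name, the glob assumes every log follows the <site>_access.log* naming described above, and it assumes the order of the sites themselves does not matter (the per-site files are still ordered later by the second ls).

```shell
#!/bin/bash
# Sketch only: print each site name once, using sort -u instead of a nested loop.
# unique_sites is a hypothetical helper, not part of the original script.
unique_sites() {
    local logdir=$1 f name
    for f in "$logdir"/*_access.log*; do
        [ -e "$f" ] || continue        # skip the literal pattern if nothing matched
        name=${f##*/}                  # strip the directory part
        printf '%s\n' "${name%%_*}"    # keep everything before the first underscore
    done | sort -u                     # sort and drop duplicate names
}

# Usage with the same variable names as the script (default path is the script's own).
BASELOG=${BASELOG:-/var/log/httpd}
WORKLIST=$(unique_sites "$BASELOG/revproxy")
```

The resulting WORKLIST is newline-separated rather than space-separated, which an unquoted "for WORKSITE in $WORKLIST" loop splits in the same way.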


	I am hoping someone on the list with more current bash skills than me may see something and give me some performance hints ( e.g. why don't you just use !<> or something )

	Any and all advice welcomed :}

	Reproduced below is the script in its entirety, feel free to use it however you will :}

	Regards

	Phill Twiss

Here is the script in its entirety

#!/bin/bash
# this bit runs thru the revproxy entries

BASELOG="/var/log/httpd"
BASEWEB="/var/www/usage"

#First lets get the listing of all files in the directory
SITELIST=`ls -t1r $BASELOG/revproxy|grep access|cut -f1 -d"_"`

WORKLIST=""
for TESTSITE in $SITELIST; do
    EXISTFLAG=0
    for WORKSITE in $WORKLIST; do
        if [ $WORKSITE = $TESTSITE ]
        then
            EXISTFLAG=1
        fi
    done
    if [ $EXISTFLAG = 0 ]
    then
        WORKLIST=$WORKLIST" "$TESTSITE
    fi
done

#We now have a list of all the httpd instances we wish to look at, now we need to get the file lists for those domains and analyze them
for WORKSITE in $WORKLIST; do
    LOGLIST=`ls -t1r $BASELOG/revproxy/$WORKSITE*`
    # No backticks here: they would try to run the command's output as a command.
    # -p avoids an error when the directory already exists.
    mkdir -p $BASEWEB/revproxy/$WORKSITE
    for THISLOG in $LOGLIST; do
        webalizer -Q -t $WORKSITE -o $BASEWEB/revproxy/$WORKSITE  -c /etc/webalizer.apps.conf $THISLOG
    done
done




