[plug] Bash scripting suggestions required :}

Adrian Chadd adrian at creative.net.au
Wed Sep 10 13:44:02 WST 2008


You're trying to generate a unique list of site names, correct?

There are a few hacks you can do in shell.

I suggest sort/uniq in this case.

You could use a temp file:

rm -f /tmp/foo
for TESTSITE in $SITELIST; do
	echo "$TESTSITE" >> /tmp/foo
done
sort /tmp/foo | uniq > /tmp/foo2
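
If you'd rather skip the temp file, sort -u does the sort/uniq in one pass. A rough sketch, assuming $SITELIST holds whitespace-separated names with no spaces inside them:

# one name per line, then sort and de-duplicate in a single pass
printf '%s\n' $SITELIST | sort -u > /tmp/foo2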

You could also do the whole thing inline, i.e.:

root@mirror:/opt/squid-2.7/local# ls /var/log/apache2/*access.log*
/var/log/apache2/access.log                          /var/log/apache2/mirror.waia.asn.au-access.log.3.gz  /var/log/apache2/waixgh.waia.asn.au-access.log
/var/log/apache2/access.log.1                        /var/log/apache2/mirror.waia.asn.au-access.log.4.gz  /var/log/apache2/waixgh.waia.asn.au-access.log.1
/var/log/apache2/access.log.2.gz                     /var/log/apache2/mirror.waia.asn.au-access.log.5.gz  /var/log/apache2/waixgh.waia.asn.au-access.log.2.gz
/var/log/apache2/access.log.3.gz                     /var/log/apache2/mirror.waia.asn.au-access.log.6.gz  /var/log/apache2/waixgh.waia.asn.au-access.log.3.gz
/var/log/apache2/access.log.4.gz                     /var/log/apache2/sf-access.log                       /var/log/apache2/waixgh.waia.asn.au-access.log.4.gz
/var/log/apache2/mirror.waia.asn.au-access.log       /var/log/apache2/sf-access.log.1                     /var/log/apache2/waixgh.waia.asn.au-access.log.5.gz
/var/log/apache2/mirror.waia.asn.au-access.log.1     /var/log/apache2/sf-access.log.2.gz                  /var/log/apache2/waixgh.waia.asn.au-access.log.6.gz
/var/log/apache2/mirror.waia.asn.au-access.log.2.gz  /var/log/apache2/sf-access.log.3.gz

root@mirror:/opt/squid-2.7/local# for i in `ls /var/log/apache2 | grep -- '-access.log' | sed 's@-access.log.*@@' | sort | uniq`; do echo $i; done
mirror.waia.asn.au
sf
waixgh.waia.asn.au

That should be good enough for what you're doing.
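
Plugged into your script, the whole nested-loop de-dup collapses to one pipeline - a sketch, reusing the variables from your SITELIST line:

# unique site names straight from the directory listing
WORKLIST=`ls -t1r $BASELOG/revproxy | grep access | cut -f1 -d"_" | sort | uniq`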

HTH,

(Props for writing bash scripts using #!/bin/bash and not #!/bin/sh. Good!)



Adrian

On Wed, Sep 10, 2008, Phillip Twiss wrote:
> G'Day All
> 
> 	I've got a quick and probably really dumb question someone may be able to help me with..
> 
> Background
> 
> 	We have a whole series of httpd log files that we need to feed to webalizer. They all live in the same directory but have a different sitename at the front of their log file names (e.g. www.det.wa.edu.au_access.log, www.det.wa.edu.au_access.log.1 etc., note the underscore).
> 
> 	I have developed a quick and dirty bash script that locates unique instances of the server name ( i.e. www.det.wa.edu.au ), then processes that list one site at a time getting all its historical files and processing them in the correct order.
> 
> My Question  -  
> 
> 	The initial loop that gets all the unique instances is very slow; I am sure there are much simpler/faster/more effective ways of achieving this.  The performance problem seems to come from searching the growing list for previous instances.
> 
> SITELIST=`ls -t1r $BASELOG/revproxy|grep access|cut -f1 -d"_"`
> WORKLIST=""
> for TESTSITE in $SITELIST; do
>     EXISTFLAG=0
>     for WORKSITE in $WORKLIST; do
>         if [ $WORKSITE = $TESTSITE ]
>         then
>             EXISTFLAG=1
>         fi
>     done
>     if [ $EXISTFLAG = 0 ]
>     then
>         WORKLIST=$WORKLIST" "$TESTSITE
>     fi
> done
> 
> 
> 	I am hoping someone on the list with more current bash skills than me may see something and give me some performance hints (i.e. why don't you just use !<> or something)
> 
> 	Any and all advice welcomed :}
> 
> 	Reproduced below is the script in its entirety, feel free to use it however you will :}
> 
> 	Regards
> 
> 	Phill Twiss
> 
> Here is the script in its entirety
> 
> #!/bin/bash
> # this bit runs thru the revproxy entries
> 
> BASELOG="/var/log/httpd"
> BASEWEB="/var/www/usage"
> 
> #First lets get the listing of all files in the directory
> SITELIST=`ls -t1r $BASELOG/revproxy|grep access|cut -f1 -d"_"`
> 
> WORKLIST=""
> for TESTSITE in $SITELIST; do
>     EXISTFLAG=0
>     for WORKSITE in $WORKLIST; do
>         if [ $WORKSITE = $TESTSITE ]
>         then
>             EXISTFLAG=1
>         fi
>     done
>     if [ $EXISTFLAG = 0 ]
>     then
>         WORKLIST=$WORKLIST" "$TESTSITE
>     fi
> done
> 
> #We now have a list of all the httpd instances we wish to look at, now we need to get the filelists for said domains and analyze them
> for WORKSITE in $WORKLIST; do
>     LOGLIST=`ls -t1r $BASELOG/revproxy/$WORKSITE*`
>     `mkdir $BASEWEB/revproxy/$WORKSITE`
>     for THISLOG in $LOGLIST; do
>         `webalizer -Q -t $WORKSITE -o $BASEWEB/revproxy/$WORKSITE  -c /etc/webalizer.apps.conf $THISLOG`
>     done
> done
> 
> 
> _______________________________________________
> PLUG discussion list: plug at plug.org.au
> http://www.plug.org.au/mailman/listinfo/plug
> Committee e-mail: committee at plug.linux.org.au

-- 
- Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support -
- $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -


