[plug] python head+wall issue

Cameron Patrick cameron at patrick.wattle.id.au
Mon Dec 29 21:50:24 WST 2003


On Mon, Dec 29, 2003 at 08:40:40PM +0800, James Devenish wrote:

| I have noticed that the PLUG list has been "unusually" slow at delivery
| since October at least. I haven't written a script to actually calculate
| the delay -- you have 90mins to beat me to it while I eat dinner ;-)
| Perhaps spark has had many new subscribers, other lists, or website
| stuff.

Okay, since I've had nothing better to do in the last hour: the first
column is the message's Date: header; the middle one is the delay, in
minutes, between spark receiving the message from the remote server and
spark receiving it back from the ML system; and the third is the total
delay, in minutes, before the message reached me.

Excerpt of December sample:
         Mon, 29 Dec 2003 20:06:17 +0800 *        6.2 *       11.8
              29 Dec 2003 20:06:26 +0800 *        8.8 *       13.3
              29 Dec 2003 20:11:15 +0800 *        7.0 *       10.5
         Mon, 29 Dec 2003 20:13:34 +0800 *        7.8 *       12.5
         Mon, 29 Dec 2003 20:15:38 +0800 *        8.6 *       12.4
         Mon, 29 Dec 2003 20:14:45 +0800 *       12.4 *       17.3

Excerpt of September sample:
         Tue, 30 Sep 2003 17:15:01 +0800 *        2.0 *        6.4
         Tue, 30 Sep 2003 17:55:02 +0800 *        2.4 *        6.8
         Tue, 30 Sep 2003 18:14:50 +0800 *        1.6 *        5.1
         Tue, 30 Sep 2003 18:14:53 +0800 *        3.6 *        7.1
         Tue, 30 Sep 2003 18:21:24 +0800 *        2.2 *        4.7
         Tue, 30 Sep 2003 19:24:03 +0800 *        2.3 *        5.7

After removing outliers (where the time was > 15 minutes):

 December: mean = 2.15 min; sample standard deviation = 1.42 min; n = 1281
September: mean = 1.47 min; sample standard deviation = 0.66 min; n = 756
	H_0: December mean = September mean
	H_1: December mean > September mean
P-value: 1.99e-34; so I think we can pretty safely assume that Spark's
latency is increasing, albeit gradually.
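
For anyone wanting to redo the comparison from the summary figures
alone, here is a minimal Python sketch. The post does not say which test
was run, so the pooled-variance two-sample t-test below is an
assumption, and the function names are made up for illustration. The
tail probability is a plain Simpson's-rule integration of the t density,
which is only adequate for large degrees of freedom like these. Because
the inputs are the rounded figures above, the result should land near
the quoted P-value but will not match it exactly.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, df

def t_upper_tail(t, df, upper=100.0, steps=20000):
    """Approximate P(T > t) by integrating the Student t density.

    Uses Simpson's rule on [t, upper]; 'steps' must be even. The cut-off
    at 'upper' is only harmless when df is large (thin tails).
    """
    log_c = (math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2.0 * math.log1p(x * x / df))

    h = (upper - t) / steps
    total = pdf(t) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(t + i * h)
    return total * h / 3.0

# Summary figures quoted above: December first, September second.
t_stat, df = pooled_t(2.15, 1.42, 1281, 1.47, 0.66, 756)
p_value = t_upper_tail(t_stat, df)
print("t = %.2f, df = %d, one-sided P = %.3g" % (t_stat, df, p_value))
```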

Note that the September sample is only for part of the month; it starts
when spark switched over to running mailman.

A pretty graph can be found at: http://cp.yi.org/cameron/sparkdelay.png

The X axis scale is message number (since 13 Sep 2003); the Y axis shows
the number of minutes' delay.  The interesting bit is that the delay
looks like it increases and then drops back at the end of each month -
perhaps it's archive-related?

The Python script which generated all this was:

	#! /usr/bin/python
	
	from rfc822 import Message, parsedate_tz, mktime_tz
	import sys
	
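	# Parse the message file named on the command line.
	m = Message(file(sys.argv[1], 'r'))
	time1 = None    # spark received the message from the remote server
	time2 = None    # spark received it back from the ML system
	time3 = None    # erdos, the final hop, received it
	# Each hop's timestamp follows the last ';' in its Received: header.
	for rcvd in m.getheaders('Received'):
	        if 'by erdos' in rcvd and time3 is None:
	                time3 = mktime_tz(parsedate_tz(rcvd.split(';')[-1]))
	        if 'by spark.plug' in rcvd:
	                if 'from spark.plug' in rcvd:
	                        time2 = mktime_tz(parsedate_tz(rcvd.split(';')[-1]))
	                else:
	                        time1 = mktime_tz(parsedate_tz(rcvd.split(';')[-1]))
	
	# Print the Date: header and both delays in minutes, if all hops were found.
	if time3 is not None and time2 is not None and time1 is not None:
	        print "%40s * %10.1f * %10.1f"%(m['date'], (time2-time1)/60, (time3-time1)/60)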
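
The rfc822 module used above no longer exists in Python 3. A rough
equivalent using the standard email package might look like the sketch
below. The host-name matching is the same as in the script; the function
names hop_time and delays are invented for this example, and it has only
been reasoned through against headers shaped like the ones the script
expects.

```python
#!/usr/bin/env python3
import email
import sys
from email.utils import mktime_tz, parsedate_tz

def hop_time(received):
    """Return the UTC timestamp after the last ';' of a Received header."""
    parsed = parsedate_tz(received.rsplit(";", 1)[-1].strip())
    return mktime_tz(parsed) if parsed else None

def delays(msg):
    """Return (list delay, total delay) in minutes, or None if a hop is missing."""
    t_in = t_out = t_final = None
    # Received headers are listed newest first.
    for rcvd in msg.get_all("Received", []):
        if "by erdos" in rcvd and t_final is None:
            t_final = hop_time(rcvd)
        if "by spark.plug" in rcvd:
            if "from spark.plug" in rcvd:
                t_out = hop_time(rcvd)   # handed back by the ML system
            else:
                t_in = hop_time(rcvd)    # arrived from the remote server
    if None in (t_in, t_out, t_final):
        return None
    return (t_out - t_in) / 60, (t_final - t_in) / 60

if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as fh:
        message = email.message_from_binary_file(fh)
    result = delays(message)
    if result is not None:
        print("%40s * %10.1f * %10.1f" % (message["Date"], *result))
```

It takes one message file per invocation, as the original does.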

The means and SDs were calculated using GNU Octave, and the graph
generated with gnuplot.  :-)
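
The Octave step is not shown, but the same summary could be sketched in
Python along these lines. It reads lines in the script's output format,
drops anything over the 15-minute cut-off, and reports the mean and
sample standard deviation. Which column the quoted means refer to is not
stated; the middle column is assumed here, and the function names are
invented for the example. The sample input is just the December excerpt,
so the numbers it prints are not the full-month figures.

```python
import statistics

def delays_from_output(lines, column=1):
    """Pull one delay column (1 = middle, 2 = total) from script output lines."""
    values = []
    for line in lines:
        parts = line.rsplit("*", 2)
        if len(parts) != 3:
            continue  # not a "date * delay * delay" line
        values.append(float(parts[column]))
    return values

def summarize(values, cutoff=15.0):
    """Mean, sample standard deviation and count after dropping outliers."""
    kept = [v for v in values if v <= cutoff]
    return statistics.mean(kept), statistics.stdev(kept), len(kept)

# The December excerpt quoted earlier in this message.
DECEMBER_EXCERPT = [
    "         Mon, 29 Dec 2003 20:06:17 +0800 *        6.2 *       11.8",
    "              29 Dec 2003 20:06:26 +0800 *        8.8 *       13.3",
    "              29 Dec 2003 20:11:15 +0800 *        7.0 *       10.5",
    "         Mon, 29 Dec 2003 20:13:34 +0800 *        7.8 *       12.5",
    "         Mon, 29 Dec 2003 20:15:38 +0800 *        8.6 *       12.4",
    "         Mon, 29 Dec 2003 20:14:45 +0800 *       12.4 *       17.3",
]

mean, sd, n = summarize(delays_from_output(DECEMBER_EXCERPT))
print("mean = %.2f; sample standard deviation = %.2f; n = %d" % (mean, sd, n))
```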

Cameron.




