[plug] Weird Debian Testing problem

Craig Ringer craig at postnewspapers.com.au
Sun Jun 1 16:41:10 WST 2003


> I've not ever seen any behaviour like this, and I know that I'm only
> scratching the surface of what is broken underneath, but as I said, I'm
> hoping that some of you could start asking silly questions.

OK. First: are things OK on the console? Can you "apt-get install mutt" 
and make sure you have working email on the console - might make things 
easier later. If things are working ok on the console, consider the 
possibility of breakage in your X environment. Try a different window 
manager, ideally something very simple like twm (*uggh*).

Most likely this will be it: breakage in your X environment, especially 
if you're using something "fragile" like GNOME or KDE, where a lot of 
different things have to play well together for it to work properly.

I suggest that you do a "ps aux | less" and look over the output, 
keeping an eye out for zombie or "hard" sleeping processes. A zombie 
will look like this:

craig     5859  0.0  0.2  7168 2236 pts/2    Z    13:53   0:00 bash

and a process in uninterruptible sleep:

craig     5859  0.0  0.2  7168 2236 pts/2    D    13:53   0:00 bash

If you're seeing /lots/ of these, you might be facing driver or hardware 
problems, particularly disk failures. Note that a few processes in D 
state are pretty normal.
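
If paging through the whole list is a pain, something like the following 
rough sketch can pick out just the Z and D entries. It assumes the usual 
"ps aux" column layout, where the process state (STAT) is the 8th field; 
the function name find_stuck is made up for this example.

```shell
# Keep the header line plus any process whose STAT field starts with
# Z (zombie) or D (uninterruptible sleep). Reads "ps aux" output on stdin.
# Assumes STAT is the 8th whitespace-separated field, as in "ps aux".
find_stuck() {
    awk 'NR == 1 || $8 ~ /^[ZD]/'
}
```

Use it as "ps aux | find_stuck". A handful of D entries that come and go 
is nothing to worry about; the same ones sitting there run after run are 
more interesting.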

To check the condition of the disk, try running "smartctl -a /dev/hdx" 
where hdx is your main HDD (repeat for all HDDs in the system). Look for 
logged errors at the end of the output, bad sectors, high ATA error 
counts or UDMA error counts, etc. If it reports that S.M.A.R.T. is not 
enabled, try "smartctl -e /dev/hdx" (newer smartmontools releases spell 
this "smartctl -s on /dev/hdx"), then retry the -a query. If it still 
doesn't work, you probably have old drives or an old BIOS, and won't be 
able to use the disk's self diagnostics.
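
The -a output is long, so here is a rough sketch for pulling out the 
attributes most often tied to a dying disk. It assumes the attribute 
table printed by smartmontools, where the attribute name is the 2nd 
field and the raw value is the 10th; other tools or versions may lay 
this out differently, and not every drive reports every attribute. The 
function name smart_suspects is made up for this example.

```shell
# Read "smartctl -a" output on stdin and print any of a few telltale
# attributes whose raw value is non-zero.
# Assumes the smartmontools table layout: field 2 = name, field 10 = raw value.
smart_suspects() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count)$/ && $10 + 0 > 0 { print $2 ": " $10 }'
}
```

Use it as "smartctl -a /dev/hdx | smart_suspects". No output means none 
of those counters is above zero, not that the disk is guaranteed 
healthy, so still read the error log at the end of the full output.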

Also, try doing an "strace" on a process to see what's holding it up. 
Just run "strace programname arguments" where you'd normally run 
"programname arguments". It can be useful to do something like
	strace xterm 2>&1 | tee /tmp/trace
so you can see what's going on and log it for later processing as well. 
It looks like gibberish, but I've found it an invaluable debugging tool 
in determining what's going wrong with an app, and where. Strace doesn't 
work properly on multithreaded apps like mozilla and openoffice.
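
One way to get a foothold in a saved trace is to count which system 
calls failed and why. A rough sketch, relying on strace's habit of 
ending a failed call with "= -1 ERRNAME (description)"; the function 
name trace_errors is made up for this example.

```shell
# Read an strace log on stdin and count failed calls, grouped by errno
# name (ENOENT, EACCES, ...), most frequent first.
trace_errors() {
    sed -n 's/.* = -1 \(E[A-Z0-9]*\) .*/\1/p' | sort | uniq -c | sort -rn
}
```

Use it as "trace_errors < /tmp/trace". Plenty of ENOENT is normal while 
a program searches its library and config paths, so look at what fails 
just before the app stalls or dies rather than at the raw totals.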

'luck

Craig


