[plug] sshd help (take2)

Tue Feb 8 12:23:48 WST 2005

Thanks, Craig and Tim.

Tim: no chance yet to check the aspects you mentioned.
Craig:...

At 11:33 AM 8/02/2005, Craig Ringer wrote:
>On Mon, 2005-02-07 at 21:29 +0800, Denis Brown wrote:
> > Sigh,
> >
> > Not quite as simple as I thought!   Okay, the situation appears to be that

<snip>

>That shouldn't happen. You can have many connections from the same user,
>and when the ssh client is disconnected due to network failure the sshd
>tends to hang around for a *long* time without impairing logins from
>others.

I thought as much!   In fact many times I have ssh'd into a machine, using 
the same user credentials either from additional instances of PuTTY on one 
machine, or from several individual machines.

>It sounds to me more like the "master" sshd whose job it is to spawn an
>sshd for each new connection might be having problems.

Nod.

>ensure no sshd processes are still running (kill any that are). Then, as
>root, run:
>
>sshd -d
>
>and connect to the server. It should log (and IIRC output) a lot of
>information. Examining that may give you some clues.

Thanks.  As above to Tim's suggestions, no physical machine in front of me 
now.   But what you say gels quite nicely.   Consider that using the init.d 
scripts to do the sshd start/stop/restart produces some interesting 
effects.   Namely that if the sshd process is still "healthy" then I can 
start, stop, restart it to my heart's content.   If the "master" sshd, as
you put it, becomes unhealthy though - and maybe forgets to fork off child
processes, or kill them - then attempting an sshd stop results in red !!
marks on the resulting output line (on the root console) and subsequent
attempts to start or restart it bring forth the message that sshd is
already running.   Having a look at the sshd script (in /etc/init.d) shows
that it is trying to identify the sshd (master) process through its entry 
in /var/run/sshd.pid, if that file in fact exists.

In the face of the error messages mentioned above, a ps -A shows no such
sshd process running and /var/run does not contain an sshd.pid file
either!   Somehow or other the process "engine" has gone off the rails, at
least as far as sshd is concerned.   The init.d script in fact uses
start-stop-daemon which, according to text strings within it, hails from
Debian so I'm a little loath to blame it :-)
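
For reference, the pid-file check described above can be sketched in a few
lines of plain sh.   This is only a rough illustration of the idea, not what
the init.d script or start-stop-daemon actually does: the function name
pidfile_state is made up, and it assumes the pid file holds a single
numeric PID.

```shell
#!/bin/sh
# Hypothetical helper (not part of any init script): report whether a
# daemon's pid file is missing, invalid, stale, or points at a live process.
pidfile_state() {
    pidfile=$1

    # No pid file at all: the init script has nothing to look up.
    if [ ! -f "$pidfile" ]; then
        echo "missing"
        return 0
    fi

    # Read the first line; "|| :" tolerates a file without a trailing newline.
    pid=
    read -r pid < "$pidfile" || :

    # Anything other than a plain decimal number is treated as invalid.
    case $pid in
        ''|*[!0-9]*) echo "invalid"; return 0 ;;
    esac

    # Signal 0 delivers nothing; it only tests whether the PID can be signalled.
    # Run as root, otherwise a process owned by another user reports as stale.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}
```

Running pidfile_state /var/run/sshd.pid as root would print one of
"missing", "invalid", "stale" or "running".   A "missing" result alongside
an "already running" complaint from the init script would suggest the
script is getting its idea of the daemon's state from somewhere other than
the pid file.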

It was on the basis of the observed bizarre behavior with the "already 
running" but invisible sshd that I unemerged and re-emerged sshd last thing 
last night.  But to no avail.   I'll throw this one at the Gentoo forum for 
comment.   Perhaps I should add that this is a fresh installation and I am 
reasonably sure that I have a valid kernel - no silly mistakes in config 
and that otherwise everything else on the system seems fine.

At this point I do not know if I am facing two problems or just one... if
one then the misbehaving master sshd would be the problem.   If two then
I'd pitch for some PAM-related mischief or misconfiguration.   Driving
home, the significance of the [net] and [pam] sub-processes occurred to
me...    When an incoming connection attempt is made there would
logically be a [priv] part, then a [net] part while network-related
activities took place.   This would be succeeded by a [pam] part for
authentication.   Once authentication was achieved that process (sub
process?) could die off, then the [net] sub process, leaving only the
[priv] and userland parts intact until either logout or connection
dropout.   Corrections to the above most gratefully received!
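
To watch which of those parts are alive at any moment, something like the
following sketch could tally the bracketed labels in the process list.   It
assumes the labels appear in the process title as "[priv]", "[net]" and
"[pam]" on lines containing "sshd:"; the function name sshd_roles is made
up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read process titles on stdin and count the sshd
# entries per bracketed label (e.g. priv, net, pam).
# Lines without "sshd:" or without a "[label]" are ignored.
sshd_roles() {
    sed -n '/sshd:/s/.*\[\([a-z]*\)\].*/\1/p' |
        sort |
        uniq -c |
        awk '{ print $2 "=" $1 }'
}
```

Feeding it the process list, for example ps -A -o args | sshd_roles, while
a login is in progress and again afterwards would show whether the [net]
and [pam] parts really do disappear once authentication completes.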

Thanks,
Denis