[plug] handling failed non-redundant storage in a server
Craig Ringer
craig at postnewspapers.com.au
Thu Feb 12 11:56:46 WST 2004
On Thu, 2004-02-12 at 11:43, Sham Chukoury wrote:
> On Thu, 2004-02-12 at 11:09, Craig Ringer wrote:
>
> <snip>
>
> > I was wondering if there's any way to deal with this - to remove the
> > processes I know will never recover, unmount the dead volume without
> > causing any harm to other parts of the system, etc. While I'll be able
> > to reboot this evening, surely there's a way of dealing with this sort
> > of thing without a reboot?
>
> Hmmm... Cycling init levels? :)
'fraid not. It's not that simple - these processes are in D state
(uninterruptible sleep, waiting for I/O). It's not just a matter of a
process needing restarting or a volume needing unmounting. Any attempt
to unmount the volume simply causes the umount process to block too.
Even `sync` blocks permanently.
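For anyone wanting to see which processes are stuck this way, here is a
rough sketch. It assumes a procps-style `ps` that accepts the `-eo`
format list shown; the function name `list_d_state` is just something
made up for illustration. It keeps only the lines whose STAT column
starts with D.

```shell
# Filter `ps` output (pid, stat, wchan, comm) down to processes in
# uninterruptible sleep. list_d_state is a hypothetical helper name.
list_d_state() {
    # Skip the header line; column 2 is STAT, and a leading "D" means
    # uninterruptible sleep (extra flags such as "+" may follow it).
    awk 'NR > 1 && $2 ~ /^D/ { print }'
}

# Assumes procps ps; wchan shows the kernel function the process
# is sleeping in, where the kernel makes that available.
ps -eo pid,stat,wchan:32,comm | list_d_state
```

This only lists the stuck processes. It does nothing to free them.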
> Or... have you tried killing the unrecoverable processes?
> kill[all] (-9) (pid|name)
Yes. As they're waiting for I/O they can't be killed, even by a kill -9.
The reason, as I understand it, is that they're running in kernel mode
at the moment and can't be killed until they leave the kernel I/O
routines - which they never will, because the disk is no longer there.
> As to unmounting the dead volume.. find out which processes think
> they've got open files on it, using lsof, and kill those processes, then
> try unmounting.
Processes using the volume aren't the issue. The volume can't be
unmounted because the kernel can't sync the filesystem - the device is
no longer there.
To give you an idea of what I mean - if I
`dd if=/dev/sdc of=/tmp/diskstart bs=1M count=1`
I get the error:
dd: opening `/dev/sdc': No such device or address
because the device is no longer present - it's been disabled by the RAID
controller and the driver has made the kernel aware of this. Yet this
entirely absent device has a mounted filesystem and files open on that
filesystem.
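A rough way to spot this situation is sketched below. It assumes a
Linux /proc/mounts and enough privilege (usually root) to open the
device nodes; `check_mounts` is a made-up name. It walks the mount
table and reports any /dev entry that can no longer be opened. I
haven't verified that the open always fails promptly on a dead device
as it did for dd above, so treat it as illustrative only.

```shell
# Read lines in /proc/mounts format on stdin and report mounted /dev
# entries whose device node cannot be opened for reading.
# check_mounts is a hypothetical helper name.
check_mounts() {
    while read -r dev mnt rest; do
        case "$dev" in
            /dev/*) ;;          # only look at real device nodes
            *) continue ;;      # skip proc, tmpfs, network mounts, etc.
        esac
        # Try the open in a subshell so a failed redirection cannot
        # abort the loop. A permission error is reported the same way
        # as a missing device.
        if ! ( : < "$dev" ) 2>/dev/null; then
            printf '%s (mounted on %s): cannot be opened\n' "$dev" "$mnt"
        fi
    done
}

check_mounts < /proc/mounts
```

Run without sufficient privilege it will flag healthy devices too,
since it can't tell a permission error from a vanished disk.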
> You mean something like this?
> http://www.digital-explosion.co.uk/index.php?articleID=31
Eek. No, not that well cooled - water can stay well away from my
servers. I'm talking about a 5U railmount server case with most of the
back taken up by fans - cooling by stupid amounts of airflow. As much as
I'd love a quieter cooling solution, it's just not practical.
Craig Ringer