At my office tonight, we have a hung blade server. I think the last time this happened, when we checked afterwards, it turned out a process had consumed 100% of the processor, so the server wouldn't respond to anything.
This is SLES9 Linux (non-GUI mode). The video and keyboard appear unresponsive and so we're going to reboot now.
Is there a magic keystroke, or perhaps something I can install, which will permit me to kill runaway processes or regain control of the server somehow without being forced to reboot it?
__________________
When in doubt, follow the penguins.

One program I like to use for this is an improved version of top, which has nice menus for sending terminate, kill and other signals to any process.
It's called htop. Maybe that helps.
Edit: Oh, you can use the F9 key to bring up that menu with signals.
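If you would rather script it than pick the signal from a menu, here is a rough Python sketch of the usual "ask nicely, then force" sequence (terminate first, kill if that is ignored). The function name stop_process and the grace_seconds parameter are made up for illustration, and it assumes you still have some working shell on the box and permission to signal the process.

```python
import os
import signal
import time


def stop_process(pid, grace_seconds=5.0):
    """Send SIGTERM to pid, then SIGKILL if it still exists after the grace period.

    Returns the name of the last signal sent. Note that a child of this
    script that has exited but not been waited on (a zombie) still counts
    as existing, so SIGKILL may be sent to it harmlessly.
    """
    os.kill(pid, signal.SIGTERM)  # polite request; the process may clean up
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        try:
            os.kill(pid, 0)  # signal 0 sends nothing, it only checks the PID
        except ProcessLookupError:
            return "SIGTERM"  # the process is gone
        time.sleep(0.1)
    try:
        os.kill(pid, signal.SIGKILL)  # cannot be caught or ignored
    except ProcessLookupError:
        return "SIGTERM"  # it exited just before we escalated
    return "SIGKILL"
```

Something like stop_process(12345) would then do roughly what choosing SIGTERM and later SIGKILL in the htop menu does, where 12345 stands for the PID of the runaway process.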
Libervis.com | Discover machinima
Hmm, I just realized you said the keyboard is unresponsive, so the above won't help. Hmm... I'm afraid I don't have any other ideas right now.
Edit: Except maybe if you could ssh to it from another box of course.
Libervis.com | Discover machinima
No, ssh won't work; that actually takes more CPU cycles than direct keyboard input.
The problem probably isn't just a busy CPU; it's likely also swapping like mad...
If a process often uses 100% of the CPU, you could use nice to make sure other processes get priority. I don't think that will help against grinding to a halt from swapping, though.
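To make the nice idea concrete, here is a small Python sketch of both variants: starting a command at lower priority, and lowering the priority of something that is already running. The helper names run_niced and renice are hypothetical; the calls underneath (os.nice, os.setpriority) are the standard Unix ones. As said above, this only affects CPU scheduling and does nothing about swapping.

```python
import os
import subprocess


def run_niced(command, increment=10, **popen_kwargs):
    """Start command with its niceness raised by `increment` (lower priority).

    os.nice() runs in the child just before exec, so only the new
    process is affected, not this script.
    """
    return subprocess.Popen(
        command, preexec_fn=lambda: os.nice(increment), **popen_kwargs
    )


def renice(pid, niceness=19):
    """Set the niceness of an already running process and return the new value.

    Raising niceness works for your own processes without privileges;
    lowering it again normally needs root.
    """
    os.setpriority(os.PRIO_PROCESS, pid, niceness)
    return os.getpriority(os.PRIO_PROCESS, pid)
```

For a job that is known to hog the CPU, wrapping its start-up in something like run_niced(["/path/to/job"]) keeps the login shells and sshd ahead of it in the scheduler; the path here is a placeholder.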
You might want to use a watchdog card.
CAN I HAS FIXD CAPSLOK KEE PLZ?
I don't see a problem in rebooting a server that is so over-used it's useless anyway. Rebooting takes max. 2 minutes or so...
Thomas Jollans
GnuPG key: 1024D/A6B5 9461 B60F 2C80 2399 6B1E 2698 A70E F421 434B
"I don't see a problem in rebooting a server that is so over-used it's useless anyway. Rebooting takes max. 2 minutes or so..."
Yeah, but it resets the uptime which has statistical implications.
Libervis.com | Discover machinima
Yeah, I need bragging rights for uptime. For some odd reason I still have to fend off Windows advocates for projects in the server room.
It was also a pain because I had to send in one of my employees instead of going myself, even though I was the one on call that night. I had just driven home, an hour away, when all this went down, so he had to get out of bed to reboot it. Now, they make these remote rebooters, but unfortunately some moron here decided to purchase blade servers, which I detest. Most of our other servers, and all new servers, are in pizza box format.
I guess the Linux kernel developers (and Linus) don't think it's that important to put a "magic keystroke" in the design so that, no matter what, you could type this keystroke on a severely overloaded server and get to some kind of shell prompt.
When in doubt, follow the penguins.
Maybe it can be done with kdb; it does respond to a magic keystroke (the Pause/Break key, IIRC). I don't know if it can be used to kill processes that have gone bad.
There's also the "SysRq key", which (it seems) cannot be used to kill one specific process, but it can be used to kill all processes, sync, unmount and reboot (and some other things too). So if you enable that, you can tell your minion to type some short code and at least you will have a reasonably clean reboot...
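For reference, a rough Python sketch of the SysRq route, assuming a kernel built with magic SysRq support that exposes /proc/sys/kernel/sysrq and /proc/sysrq-trigger, and assuming you are root. The function names are made up for illustration. At the console the same commands are typed as Alt+SysRq plus the letter; the script only does sync, remount read-only and reboot, because the "kill everything" letters would kill the script itself.

```python
import time

SYSRQ_ENABLE = "/proc/sys/kernel/sysrq"
SYSRQ_TRIGGER = "/proc/sysrq-trigger"


def enable_sysrq(path=SYSRQ_ENABLE):
    """Enable all SysRq functions until the next reboot (needs root)."""
    with open(path, "w") as f:
        f.write("1\n")


def sysrq(letter, path=SYSRQ_TRIGGER):
    """Send one SysRq command letter, as if typed with Alt+SysRq."""
    if len(letter) != 1 or not letter.isalnum():
        raise ValueError("SysRq commands are single characters")
    with open(path, "w") as f:
        f.write(letter)


def emergency_reboot(path=SYSRQ_TRIGGER, pause=2.0):
    """Sync disks, remount filesystems read-only, then reboot immediately.

    's' = sync, 'u' = remount read-only, 'b' = reboot without further
    shutdown steps. The pause gives the sync and remount time to finish.
    """
    for letter in "sub":
        sysrq(letter, path)
        time.sleep(pause)
```

The enable step is the part to do ahead of time, while the server is still healthy. Once it is on, the person at the keyboard can get the same reasonably clean reboot with Alt+SysRq+S, then U, then B, even when no shell will start.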
CAN I HAS FIXD CAPSLOK KEE PLZ?