When a server crashes, Mongrel (or a Mongrel cluster) doesn’t get a chance to shut down cleanly, so it leaves behind the files in which it stores its process IDs. When the server restarts, the Mongrel startup script attempts to start the daemon(s), finds the PID files already present, assumes (incorrectly) that Mongrel is already running, and cancels startup, saying, “PID file log/mongrel.pid already exists. Mongrel could be running already. Check your log/mongrel.log for errors.”
This is technically correct behavior (after all, what if Mongrel really is already running?), but it makes it nearly impossible to bring Mongrel back automatically after a server crash: someone has to delete the PID file(s) manually and then start the daemon(s).
If you don’t have systems administrators tending your websites 24/7, you need a better solution. We considered hacking the init script (found at /etc/rc.d/init.d/mongrel_cluster on our Fedora Core server) but found that the necessary logic made the script too complicated. Instead, we created a separate startup script, /etc/rc.d/init.d/mongrel_cleanup, to solve the problem.
The mongrel_cleanup script is set to run at the same run-levels as mongrel_cluster. On shutdown or restart, it does nothing, but on start, it checks for the presence of the PID files and deletes them if they’re found. It therefore has to run before mongrel_cluster at startup. Scripts with lower priority numbers run first, so mongrel_cleanup gets 84 for startup and 16 for shutdown, while mongrel_cluster has 85 and 15.
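To see why those numbers give the right order, here is a small sketch. It assumes the usual SysV convention that init runs the links in each rcN.d directory in sorted name order; the link names are what chkconfig would be expected to create from the priorities above, not output captured from a real server.

```shell
#!/bin/bash
# Sketch: the two-digit priority in each link name decides the run order,
# because the links in an rcN.d directory are run in sorted name order.
# These link names are illustrative assumptions based on the priorities
# 84/16 (mongrel_cleanup) and 85/15 (mongrel_cluster).
start_order=$(printf '%s\n' S85mongrel_cluster S84mongrel_cleanup | LC_ALL=C sort)
stop_order=$(printf '%s\n' K16mongrel_cleanup K15mongrel_cluster | LC_ALL=C sort)

echo "start order:"
echo "$start_order"
echo "stop order:"
echo "$stop_order"
```

On a live system, listing a run-level directory (for example, ls /etc/rc.d/rc3.d | grep mongrel) should show the real links once both scripts have been added with chkconfig.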
To use this script, save it in /etc/rc.d/init.d/mongrel_cleanup (or whatever the appropriate script directory is) and then put it in the startup queue with these commands:
# chkconfig --add mongrel_cleanup
# chkconfig --level 345 mongrel_cleanup on
You will also need to edit the script before using it. I’ve hardwired the paths and names of our Mongrel cluster PID files, so change them to match your own setup, or let me know if you come up with a more elegant method!
#!/bin/bash
#
# Parker Morse for Common Media, Inc., 9 November, 2007
#
# mongrel_cleanup Startup script to recover from crashes.
#
# chkconfig: - 84 16
# description: A hack to clear PID files left behind by Mongrel clusters
# after an unscheduled server crash. Checks for the presence
# of these files and deletes them if found.
#
RETVAL=0
PIDFILE_DIR=/path/to/app/current/log
# Go no further if the PID file directory is missing.
[ -d "$PIDFILE_DIR" ] || exit 0
case "$1" in
  start)
    # Remove any PID files left behind by a crash. Adjust the port list
    # to match your cluster. Testing with -f (rather than -s) also
    # catches PID files that exist but are empty.
    for port in 8000 8001 8002 8003; do
      pidfile="$PIDFILE_DIR/mongrel.$port.pid"
      if [ -f "$pidfile" ]; then
        # Record a failure from any rm, not just the last one.
        /bin/rm -f "$pidfile" || RETVAL=1
      fi
    done
    ;;
  stop|restart)
    # Nothing to do on shutdown or restart.
    exit 0
    ;;
  *)
    echo "Usage: $0 {start|stop|restart}"
    exit 1
    ;;
esac
exit $RETVAL
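As one possible step toward a more elegant method, here is a minimal sketch that finds the PID files by pattern instead of hardwiring each port, and deletes a file only when the process it names is not running. The function name cleanup_stale_pidfiles and the mongrel.*.pid pattern are assumptions for illustration, not part of Mongrel or mongrel_cluster.

```shell
#!/bin/bash
# Sketch: remove only *stale* Mongrel PID files from a directory.
# A PID file is treated as stale when it is empty, malformed, or names a
# process that no longer exists. The function name and the mongrel.*.pid
# file pattern are hypothetical; adjust them to your own setup.
cleanup_stale_pidfiles() {
    local dir=$1 pidfile pid
    for pidfile in "$dir"/mongrel.*.pid; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$pidfile" ] || continue
        pid=$(tr -d '[:space:]' < "$pidfile")
        # kill -0 sends no signal; it only tests whether the process exists.
        if [[ "$pid" =~ ^[0-9]+$ ]] && kill -0 "$pid" 2>/dev/null; then
            echo "keeping $pidfile (process $pid is running)"
        else
            rm -f -- "$pidfile" && echo "removed stale $pidfile"
        fi
    done
}
```

It would be called as cleanup_stale_pidfiles /path/to/app/current/log. One caveat: after a reboot, an old PID may have been reused by an unrelated process, in which case this check would wrongly keep the file. At boot time the unconditional delete in the script above is the safer choice; the liveness check is better suited to running the cleanup by hand or from cron on a server that has not rebooted.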