Archive for the ‘System Administration’ Category

Optimizing Apache, MySQL and PHP for content management

Tuesday, November 2nd, 2010

We don’t focus on LAMP packages like WordPress or Drupal, but sometimes we pick them up in order to work with particular design partners. While the packages are generally optimized to allow deployment on shared hosting, factors from client preference to site traffic often drive our installations to VPS hosting, where the software infrastructure is not necessarily tuned to handle them. We’ve assembled a suite of tools to help us optimize the LAMP stack on a VPS.

These scripts and steps were developed tuning an Ubuntu 10.4 LTS (“lucid”) system installed at SliceHost. Generally there are two concerns: memory use (because these tools will use as much memory as they’re given, and MySQL will often throw a tantrum if it doesn’t get what it wants) and site speed.

Apache

The default Apache install slurps quite a lot of memory. The main culprit is the MPM configuration, which is where Apache decides how many instances to keep active and how much work they should each do before they shut down: basically controlling Apache’s process spawn rate. Default installations often have too many workers open by default, and spawn more too easily, thus using more memory than is really needed.

The block in /etc/apache2/apache2.conf where this is configured looks like this:

<IfModule mpm_prefork_module>
    StartServers          1
    MinSpareServers       1
    MaxSpareServers       2
    MaxClients           20
    MaxRequestsPerChild   4000
</IfModule>

The default numbers are generally much higher; this block has been tuned. However, don’t swallow these numbers either; instead, figure your own. MinSpareServers is how many processes are sitting idle; a higher number may improve site response time, but too high a number will let Apache use more memory than it needs. MaxSpareServers caps this number.

MaxClients tells Apache when to give up because it’s about to run out of memory. To figure a good setting for this, we use a script from the great folks at RimuHosting:


#!/bin/bash
echo "This is intended as a guideline only!"
if [ -e /etc/debian_version ]; then
    APACHE="apache2"
elif [ -e /etc/redhat-release ]; then
    APACHE="httpd"
fi
RSS=`ps -aylC $APACHE |grep "$APACHE" |awk '{print $8'} |sort -n |tail -n 1`
RSS=`expr $RSS / 1024`
echo "Stopping $APACHE to calculate free memory"
/etc/init.d/$APACHE stop &> /dev/null
MEM=`free -m |head -n 2 |tail -n 1 |awk '{free=($4); print free}'`
echo "Starting $APACHE again"
/etc/init.d/$APACHE start
echo "MaxClients should be around" `expr $MEM / $RSS`

MySQL

The best tool for tuning MySQL is the MySQL Performance Primer script. Download that to your home directory on the server being tuned and chmod 755 tuning-primer.sh to make it executable. You should also run which bc to see if the bc calculation language is installed; if it isn’t, try sudo apt-get install bc to get it.

This script prefers to have at least 48 hours of continuously-running server to work with. If you let it, it will produce a my.cnf configuration file with a number of suggested directive changes, generally pointed at optimizing your memory usage and MySQL performance. It’s pretty impressive.

If you’re serious about tuning your application to get fast results from MySQL, enable the Slow Query Log while you’re at it, but don’t expect this to be too helpful with packages like WordPress or Drupal, which don’t really let you get at the query builders directly.

PHP

The best thing to speed up PHP is the APC or Alternative PHP Cache. This caches PHP’s “opcodes”, the compiled versions of the PHP scripts, so they won’t be recompiled with each request.

This is a good primer for the installation on Ubuntu. One catch is that the initial APC install (the sudo pecl install apc part) often throws an error. If it fails, try sudo pecl install apc-beta and see if that works better.

Definitely install the cache status page somewhere when you’re done. It provides some interesting data about how the cache is speeding things up.

Nginx configuration tweaks for client-side speed optimization

Thursday, May 6th, 2010

As we move more sites to an Nginx/Passenger server stack, we need to translate the server-side optimizations we use for browser-side performance from Apache configuration to Nginx. Here’s how that comes over:

Gzip compression

In Nginx, this is an optional module which should be built in to the software at compile time. If it’s present, you can activate it in site configuration with a block like this:

 gzip on;
 gzip_min_length  1100;
 gzip_buffers     4       8k;
 gzip_proxied     any;
 gzip_types       text/plain text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript;

Note that Nginx considers text/xml and text/html to be redundant MIME types.

Long expiration dates

We want to have our assets cached in the user’s browser cache as long as possible, with a few exceptions (e.g. advertising assets.) To that end, we set expiration headers:

       location ~* \.(js|css|jpg|jpeg|gif|png)$ {
                if (-f $request_filename) {
                   expires                        max;
                   break;
                }
       }

This uses the filename extension to determine which files get “far future” expiration dates. If you’re used to naming files without extensions, this might not work so well for you.

By default, Nginx doesn’t use Etags, so we can stop worrying about that.

Surviving Django traffic spikes with Nginx and memcached

Thursday, May 6th, 2010

Normally I wouldn’t categorize a multi-national organization with a seven-figure annual budget as a “small” client, but we didn’t build the site for the World Marathon Majors, a sort of mass-marathon trade group made up of the Boston, London, Berlin, Chicago and New York City marathons. Instead, we inherited their Django-based site from a previous partner, and our job is to keep it running, fix what breaks, and make modest upgrades.

The site’s traffic is flat for most of the year, but race day for any of the five events produces a spike in the traffic graph resembling a shark fin in a calm sea. Last month, with both Boston and London on the horizon, we took steps to ensure the site and its new hosting could handle the load.

Because the site does not ask users to register, we recognized that we could cache complete HTML pages as they were rendered by Django and expire them after an arbitrary time span. If we kept rendered pages in the cache for ten minutes, for example, no page would be rendered more frequently than once every ten minutes, and yet news updates would never be delayed by more than ten minutes waiting for a cache refresh.

Because we were already using Nginx as the front-end of our server stack, we followed this stellar write-up by Alex Holt describing a simple system where Django writes to the cache, and Nginx reads from it. By using memcached for our cache rather than writing to disk or a database, we realized a little extra benefit: whenever Nginx has a cache hit, the page is returned significantly faster than if there’s a cache miss and Django has to render the page. This isn’t surprising when you think about the whole stack–of course a simple read will always be faster than a dynamic page build with multiple database queries–but gave us a nice Zen server moment. By doing less work, the server was getting more done.

And the whole stack survived London with flying colors. Now we just have to figure out how to unwind the site’s designed-in assumption that UK is a legitimate language code for English.  (Hint: People who speak “UK” use the Cyrillic alphabet.)

Rejecting mail for valid local users

Friday, February 19th, 2010

Several months ago we mentioned that certain Linux distributions will, if they are running an SMTP server and accepting mail for local users, accumulate spam for “role” users who will never read their mail, e.g. mail, uucp, news, etc.

I finished by suggesting,

The best shortcut here is to bounce any email destined for the role users. This will vary depending on your MTA, so I won’t detail it here.

It’s true that it varies by MTA, but it turns out it’s really hard to find this information. MTAs aren’t set up to reject mail from specific addresses; they want a specific list of valid addresses and they’ll reject everything else.

It turns out that there’s a faster and cleaner method which is MTA-independent: lock the mailboxes for those users.

This could be as simple as putting an empty file in /var/spool/mail/uucp (for example) which is owned by root with 600 permissions, but that’s going to generate a bunch of error messages when your local delivery agent tries to write to a file it doesn’t have permissions for. A more elegant solution is to symlink those paths to /dev/null:

ln -s /dev/null /var/spool/mail/uucp

Now make sure the symlink has the correct ownership…

chown -h uucp:mail /var/spool/mail/uucp

Now all that spam will be silently delivered to the bit-bucket.

Clean out role-user mailboxes

Monday, September 14th, 2009

I suppose the subtitle to this post could be, “why you don’t want to be a mail administrator.” The fact is, spam has made running a host which accepts email a pretty unpleasant task all around, and we strongly suggest clients we host domains for run email for that domain through Google Apps rather than relying on POP/IMAP through our host.

We do still have a box which accepts incoming SMTP, though, and that means putting up with a certain amount of unsolicited commercial overhead. Every Linux box has a certain number of no-shell users set up to run daemons with limited privilege, e.g. the apache user. Oddly, some spammers either run the same username scans as the brute-force ssh hackers, or they think the apache user is actually reading its email. I found 44MB of unread email, about 5,000 messages, waiting in that inbox. uucp had the next biggest collection; mail and news were right up there as well.

That’s a lot of disk space. After a cursory glance to ensure there was nothing actually important in there, I simply used sudo cp /dev/null /var/spool/mail/apache to take out the trash.

If you are root or have appropriate sudo privileges, you can check another user’s mail using mutt or a similar command-line mail client. Just use the -f flag to feed mutt a mailbox path as an argument. In this case, I could check the apache mailbox using sudo mutt -f /var/spool/mail/apache . Naturally, you would want to have a talk with your company ethicist before doing this on a mailbox belonging to an actual user.

The best shortcut here is to bounce any email destined for the role users. This will vary depending on your MTA, so I won’t detail it here.

Moving a Rails/Mongrel site to Phusion Passenger on Ubuntu

Sunday, March 15th, 2009

As I’ve noted before, several of our Rails projects are hosted using Phusion Passenger. We’ve been helping HitFix explore this option in recent weeks because it solves several minor deployment and restart frustrations they’ve been grappling with, and in the process I wrote up this walk-through for the installation.

This process is actually really easy and the instructions on the Passenger site and those provided during the installation process are excellent. The glitch we ran in to is that recent versions of Ubuntu and/or Apache 2.2.x use a slightly different method of loading modules and module configuration. (I haven’t run in to this method often enough to have learned whether it’s new in Apache or in Ubuntu.)

Instead of using one monolithic configuration file, sometimes offloading host configuration to included files, these Apaches instead globally include all modules found in the mods_enabled subdirectory of the Apache directory. Each module has two files, a .load file and a .conf file. The .load file contains only the LoadModule configuration directive; the .conf file contains any configuration directives.

Now, the actual files live in the mods_available subdirectory of Apache, and activating a module just means adding a symlink to the relevant files in mods_enabled, then restarting Apache. Deactivating it just means removing the symlink and restarting again.

With that in mind, here’s the Passenger process:

  • Install Passenger on the server. This should be as simple as:

sudo gem install passenger
sudo passenger-install-apache2-module

  • On one system, this second command actually told me that the Apache 2 Development Headers were missing. (Passenger needs these to compile the Apache module.) I needed to run sudo apt-get install apache2-prefork-dev to get these headers. This didn’t work on the first attempt, but running sudo apt-get update once or twice finally got me a successful install of that package. Running the passenger-install-apache2-module script then worked successfully. Note that you will need a compiler on the system in order to generate the Apache module. You would think you could take this for granted, but apparently not.

At this point, the module is built. This is where we have to adapt Phusion’s original installation instructions. Essentially, the next step is to make sure the module is loaded by Apache. Here’s how I did this:

  • Create a file at /etc/apache2/mods-available/passenger.load with the following content:

LoadModule passenger_module /usr/lib/ruby/gems/1.8/gems/passenger-2.0.6/ext/apache2/mod_passenger.so

  • Create a file at /etc/apache2/mods-available/passenger.conf with the following content (I think the IfModule container may be optional, but I played safe):

<IfModule passenger_module>
PassengerRoot /usr/lib/ruby/gems/1.8/gems/passenger-2.0.6
PassengerRuby /usr/bin/ruby1.8
</IfModule>

  • Create symbolic links to both those files from /etc/apache2/mods-enabled/, i.e.

sudo ln -s /etc/apache2/mods-available/passenger.load /etc/apache2/mods-enabled/
sudo ln -s /etc/apache2/mods-available/passenger.conf /etc/apache2/mods-enabled/

We can restart Apache at this point without disturbing the existing site. Note that we can change which ruby is used (e.g. for Ruby Enterprise Edition) by altering the path at the PassengerRuby directive and restarting Apache.

To move a site to Passenger, you want to back up the file found at /etc/apache2/sites-available/yoursite, then edit it to make these changes:

  • Change the DocumentRoot to the public directory of the app, e.g. DocumentRoot /u/apps/yoursite/current/public
  • Comment out (with a #) all the lines related to the proxy, starting with the Proxy balancer definition and extending through all the ProxyPass, ProxyPassReverse and ProxyPreserveHost directives.
  • Restart Apache (/etc/init.d/apache2 restart). If something goes wrong, restore the backup you made of the configuration file and restart Apache again to restore the previous configuration.

The final trick is to update the restart task in Capistrano to handle restarting Passenger and not Mongrel. This is actually pretty easy, particularly if you’ve already overloaded the restart task in your config/deploy.rb file. Just replace the line reading run "mongrel_rails cluster::restart -C #{mongrel_config}" with one reading run "touch #{current_path}/tmp/restart.txt" and we’re good to go. The start and stop tasks probably also needs tweaking to remove the mongrel commands–probably pasting the same command in for “start” will do the trick unless we feel like putting Apache commands in.

Phusion Passenger, permissions, and Rails’ session store

Tuesday, November 25th, 2008

For various reasons, we’re moving three Rails apps hosted on one VPS to use Phusion Passenger rather than the Apache/Mongrel Cluster stack (via Coda Hale) we’ve been using since we got started.

The Passenger installation process really is as easy as it’s been described, particularly if you’re as comfortable with Apache configuration as we are, but by moving responsibility for serving your apps from Mongrel to Apache, file and process ownership becomes an issue in a way it wasn’t when Mongrel and Apache talked to each other and otherwise minded their own stores. Ben Hughes hinted at this in his walk-through of the Passenger installation, and we were not immune.

Our problem turned out to be session files. The first application we moved to Passenger (and it’s worth noting that apps running on Passenger and apps running Mongrel cluster can coexist under the same Apache) was Redmine, which uses file-based session store by default. If ownership of those session files creates a problem, and it did for us, switching to ActiveRecord session storage resolves the problem.

For those wondering about the point of Passenger, it’s memory use. A single Mongrel instance and an app running under Passenger have comparable RAM requirements, but because Passenger is an Apache module, it can spawn new instances as requests arrive; the Mongrel cluster solution requires multiple mongrel instances to be running all the time. This means Mongrel apps essentially require the RAM for peak capacity all the time. Passenger gives that RAM back at non-peak times, which allows more apps to share the same system, allowing us to get more mileage from our VPSes. (It’s worth noting that the Phusion guys claim a 33% memory savings per instance if Ruby Enterprise Edition is also used, so that will probably be our next step when all the apps are Passenger-ized.)

Update, 12/10: Another gotcha we found was file permissions. Again because of the permissions issues, Rails wasn’t writing to the shared/logs/production.log file. We didn’t notice this immediately, because the applications ran, they just didn’t log anything. The solution here was to change the ownership and permissions on the files. Specifically, we changed group ownership to “nobody” (the group Apache runs under) and the mode of the log file to 664.

Solutions for Ruby vulnerabilities

Monday, June 23rd, 2008

The recently-announced vulnerabilities in Ruby have put many of us administering Rails applications in a production space between a rock and a hard place.

To recap, there are three major “lines” of Ruby interpreters, the 1.8.6 line, the 1.8.7 line, and the 1.9 line. All of these show the vulnerability, so nearly everyone needs to update their Ruby. (I’ll get to the exceptions later.) Rails introduces a new line of complication, because the 1.8.7 only works for Rails 2.1 and newer; if the Rails apps in question aren’t ready for Rails 2.1, or already running it (unlikely), the best route would be to continue to update in the 1.8.6 line. So there’s the rock: we need to update our Ruby.

The “hard place” is that the only patches so far released by the Ruby maintainers tend to produce segmentation faults, according to many who have tried them so far. That’s a long way of saying, “They don’t work.”

As a Rails-supporting sysadmin with, you’re left with several options, none of them comfortable. You can, in order of increasing riskiness:

  • Wait out the Ruby maintainers and hope they release a working 1.8.6 patchlevel before your app is compromised.
  • Do crash-priority Rails upgrades on all your apps and switch to Rails 2.1, then move your Ruby to the 1.8.7 line.
  • Switch to an entirely different Ruby interpreter, like Rubinius or JRuby. (This hinges on the idea that Ruby is an interpreted language, and as long as the language is interpreted consistently there is room for any number of interpreters. Apparently JRuby and Rubinius are not affected by these vulnerabilities, only MRI, Matz’s Ruby Interpreter, which is the “standard” implementation.)
  • Switch to a different production stack entirely. For example, we’re currently using Apache 2.2 as a proxy for a Mongrel cluster in front of Rails 2.0.2 apps, a not-uncommon configuration. This could be the pressure we’d need to jump to Phusion Passenger, given that their Ruby Enterprise Edition claims to have applied the security patches to MRE 1.8.6 p111, and thus doesn’t introduce the segfault-inducing breaking features that the most-recent patchlevel does.

I don’t know which way we’re going yet, but I’m not interested in waiting too long, and the upgrade to Rails 2.1 is going to happen sometime anyway, so Option #2 seems most likely for us.

Update: Hongli Lai from Phusion assures us (in the comments) that Ruby Enterprise Edition can be used as a drop-in replacement for MRE without replacing the entire stack, moving REE up to the front of the line in the “different interpreter” option. Discussion seems to suggest to me that new official patches from the Ruby maintainers will not be coming in a timely fashion, but we are beginning to see “contributed” patched distributions emerge for e.g. FreeBSD and Debian.

Rails asset hosts, SCM, and bundling

Wednesday, April 9th, 2008

This afternoon we pushed another big revision to the La Cucina Italiana website, which is now sharing a very small fraction of the thousands of recipes that magazine has in its archives. There will be more recipes coming online in the months to come, but most of the puzzles we had to solve are finished now.

One in particular was handling the photos which go with some recipes. Artful photography is a hallmark of the La Cucina Italiana brand, and the editorial team needed to be able to upload their images directly to the site. These photos wouldn’t be stored in the site database, but because they wouldn’t be part of our Subversion repository for the site, either, they had to live outside the normal site root in order to avoid being blown away by any site updates we deployed.

Enter Rails’ asset hosts. Rails allows for assets (e.g. images, CSS files, or Javascript includes) to be served by a host other than that of the main site. Because the asset host definition happens at the host name level, the asset host can actually be on the same box as the main site, or it can be elsewhere in the world; you manage that at a different level of abstraction.

A “free” benefit of asset hosts is that by defining multiple asset hosts (a0 through an), you can fool a user’s browser into downloading your site through a dozen or more different connections, rather than just the two it limits itself two with any single server. Rails will make each asset link use a different asset host, and the browser will open two connections to each server, not caring that they all happen to be on the same box. This gets us extra YSlow points (of course, we lose them again by requiring another DNS lookup for each asset host).

Our hangup, though, was that some assets, specifically CSS and Javascripts, did need to stay inside the Subversion repository and the site’s file tree.

Here’s the solution we came up with:

  • Defined an Apache virtual server answering to subdomains a0 through a4. (In other words, all four subdomains are the same real server, but the browser doesn’t know that; we could make them different physical servers someday.)
  • Set the site root for that server to /var/www/assets/.
  • Made symbolic links from /var/www/assets/javascripts to /var/www/production/current/public/javascripts, and from /var/www/assets/stylesheets to /var/www/production/current/public/stylesheets. (N.B. if you’re using Capistrano, “current” in your site directory is itself a symbolic link to a directory with your last site deploy.)
  • Set the upload code for recipe photos to store the uploaded images in subdirectories of /var/www/assets/images.

This way, we get all the benefits of asset hosts for assets which are under revision control, and assets which aren’t. And the asset hosts themselves let us have assets which aren’t necessarily under revision control.

The free bonus here was that by using a symbolic link to the javascripts directory, bundle_fu doesn’t have to know or understand our asset host setup; it just stores its bundled files in that same subdirectory as always, and it just works.

Restarting Mongrel clusters with Capistrano 2

Saturday, December 8th, 2007

There’s (still) a glitch between the mongrel_cluster gem and Capistrano 2 (we’re using mongrel_cluster 1.0.5 and Capistrano 2.1.0, for reference) where the application restart at the end of a cap deploy fails with an error like this:

Couldn't find any pid file in '/var/www/[application]/current/tmp/pids' matching 'dispatch.[0-9]*.pid'

I’m not sure what’s causing this, but the solution comes at the end of this post, under the heading “Restart Mongrel.” Due to issues with Cap 2 and sudo, though, the provided script fails for us. We’re running the updates as root (bad idea, but it gets around the cap sudo issues) so I updated the task like this:

# Restart task

set :mongrel_config, "/etc/mongrel_cluster/#{application}.yml"

namespace :deploy do

	task :restart do
		run "mongrel_rails cluster::restart -C #{mongrel_config}"
	end

end

I also commented out the “:mongrel_conf” variable from our previous configuration, which it appears that Capistrano was ignoring anyway.