Common Media, Inc.



Jul
15
Rails asset hosts and SSL

We’ve previously mentioned the asset host setup we created for La Cucina Italiana. For the most recent revision, which includes user registration, we had to add another layer to the mix: SSL.

The standard operating procedure for Rails applications and SSL is to create the SSL host (which functions, in Apache, as another virtual host) and proxy it directly to the same Ruby httpd you’re using to run the rest of the site, with the added request header to indicate that yes, this one came in via SSL.

The problem is that Rails then generates a page in which all the sub-requests use SSL. That breaks asset hosts unless you’ve either paid extra for a certificate that covers your asset host subdomains (why this should cost extra is beyond me, by the way, but almost nobody does it) or are willing to force Rails to link page assets insecurely, which leads to “mixed content” warnings from many browsers (a secure page including insecure components).

The solution to this problem is in the Rails documentation where asset hosts are described in the first place: you generate the asset host using a proc. There are two gotchas here, though. First, you need to be using Rails 2.1.0 or better. (This, combined with the Ruby problems that came up last month, pushed La Cucina to Rails 2.1.) Second, the example in the documentation only allows for one asset host, and the %d solution for randomizing across four hosts doesn’t work within a proc. Here’s how to manage it, if you weren’t able to figure that out from the documentation:

ActionController::Base.asset_host = Proc.new { |source, request|
  if request.ssl?
    "#{request.protocol}#{request.host_with_port}"
  else
    "#{request.protocol}a#{rand(4)}.lacucinaitalianamagazine.com"
  end
}

Essentially, this turns off asset hosting for requests using SSL, which means those requests will come through the application itself. Fortunately for us, none of the SSL-requiring pages use content that lives outside source control!
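
If you want to see what the proc returns without booting Rails, here is a minimal sketch. FakeRequest and www.example.com are hypothetical stand-ins of ours; FakeRequest models only the three methods the proc calls on the real request object.

```ruby
# Hypothetical stand-in for the Rails request object; it models only the
# three methods the asset host proc actually calls.
FakeRequest = Struct.new(:protocol, :host_with_port, :ssl) do
  def ssl?
    ssl
  end
end

# Same logic as the proc assigned to ActionController::Base.asset_host.
asset_host = Proc.new { |source, request|
  if request.ssl?
    "#{request.protocol}#{request.host_with_port}"
  else
    "#{request.protocol}a#{rand(4)}.lacucinaitalianamagazine.com"
  end
}

# www.example.com is a placeholder host name, not the real site.
plain  = FakeRequest.new("http://", "www.example.com", false)
secure = FakeRequest.new("https://", "www.example.com", true)

puts asset_host.call("/images/logo.png", plain)   # one of a0..a3, chosen at random
puts asset_host.call("/images/logo.png", secure)  # the application host itself
```

Note that rand(4) can hand the same file a different host on each page load, so a browser may fetch it more than once; picking the host from a hash of the source path would avoid that.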

filed under: Ruby on Rails | comments (0)

Jun
23
Solutions for Ruby vulnerabilities

The recently announced vulnerabilities in Ruby have put many of us who administer Rails applications in production between a rock and a hard place.

To recap, there are three major “lines” of Ruby interpreters: the 1.8.6 line, the 1.8.7 line, and the 1.9 line. All of these show the vulnerability, so nearly everyone needs to update their Ruby. (I’ll get to the exceptions later.) Rails introduces a new complication, because the 1.8.7 line only works with Rails 2.1 and newer; unless the Rails apps in question are ready for Rails 2.1 or already running it (unlikely), the best route is to keep updating within the 1.8.6 line. So there’s the rock: we need to update our Ruby.

The “hard place” is that the only patches so far released by the Ruby maintainers tend to produce segmentation faults, according to many who have tried them so far. That’s a long way of saying, “They don’t work.”

As a Rails-supporting sysadmin, you’re left with several options, none of them comfortable. You can, in order of increasing riskiness:

I don’t know which way we’re going yet, but I’m not interested in waiting too long, and the upgrade to Rails 2.1 is going to happen sometime anyway, so Option #2 seems most likely for us.

Update: Hongli Lai from Phusion assures us (in the comments) that Ruby Enterprise Edition can be used as a drop-in replacement for MRI without replacing the entire stack, moving REE up to the front of the line in the “different interpreter” option. The discussion suggests to me that new official patches from the Ruby maintainers will not be coming in a timely fashion, but we are beginning to see “contributed” patched distributions emerge for e.g. FreeBSD and Debian.

filed under: Ruby on Rails, System Administration | comments (2)

Jun
17
Dealing with a “print Slashdotting”

If you read Wired, you may have heard of Tripletz.com. The Providence-based company allows site users to compose three-part messages on postcards, which are then mailed on consecutive days, a sort of postal Burma Shave arrangement.

When they appeared in Wired, the resulting spike in site traffic exposed some weaknesses in the site architecture. The Slashdot effect is generally an entirely-online situation, but the Tripletz idea was compelling enough (and the name memorable enough) that thousands of readers followed up on the print article and visited the site. Tripletz contacted us to put out the fires. We pushed the first revision yesterday, shifting a big data-sifting operation out of the Rails code (where it was taking so long the page requests would time out) back into the database engine, where it happened in milliseconds.

There’s a lot to be said for the rapid production allowed by Rails, but when it comes to high-traffic situations, it’s worthwhile to be able to look at the application and figure out where bottleneck operations should be taken away from the comfortable, hip tools of Rails and given back to the gritty old database, which almost always performs them faster. Being opinionated about code can lead to better results in some situations, but making the application work for the client means balancing the code’s poetry with pragmatism, a balance we’re happy to take on, even if we can’t take credit for the site’s undeniably cool idea.

filed under: Business, Ruby on Rails | comments (0)

Apr
24
Giving back, in a small way

OK, a very small way.

Like many small companies doing development on the Web, Common Media depends heavily on free and open-source software. Part of the point of open source is that a programmer using the program (or library, or plug-in) may get into the code and wrangle it around until it works best for them. The obligation that comes with that freedom is to “give back” any such changes if they may be useful to the wider community. With big projects, that can mean active participation in a coding community; for smaller packages, it may just mean sending code back to the maintainer for consideration.

We mentioned a few weeks ago how we tweaked a plug-in for Common Kitchen. Today, that code became our first checked-in contribution to an open-source project, Netphase’s acts_as_amazon_product. Hopefully it won’t be the last.

(I also took the opportunity to use a topical test case. Check the commit to see which magazine we test magazine searching with.)

filed under: Ruby on Rails | comments (0)

Apr
9
Rails asset hosts, SCM, and bundling

This afternoon we pushed another big revision to the La Cucina Italiana website, which is now sharing a very small fraction of the thousands of recipes that magazine has in its archives. There will be more recipes coming online in the months to come, but most of the puzzles we had to solve are finished now.

One in particular was handling the photos which go with some recipes. Artful photography is a hallmark of the La Cucina Italiana brand, and the editorial team needed to be able to upload their images directly to the site. These photos wouldn’t be stored in the site database, but because they wouldn’t be part of our Subversion repository for the site, either, they had to live outside the normal site root in order to avoid being blown away by any site updates we deployed.

Enter Rails’ asset hosts. Rails allows for assets (e.g. images, CSS files, or Javascript includes) to be served by a host other than that of the main site. Because the asset host definition happens at the host name level, the asset host can actually be on the same box as the main site, or it can be elsewhere in the world; you manage that at a different level of abstraction.

A “free” benefit of asset hosts is that by defining multiple asset hosts (a0 through an), you can fool a user’s browser into downloading your site through a dozen or more different connections, rather than just the two it limits itself to with any single server. Rails will make each asset link use a different asset host, and the browser will open two connections to each server, not caring that they all happen to be on the same box. This gets us extra YSlow points (of course, we lose them again by requiring another DNS lookup for each asset host).
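
To make the spreading concrete, here is a rough sketch of mapping each asset path onto one of four hosts. This is our illustration, not Rails’ own code: asset_host_for and example.com are hypothetical, and we use a CRC32 checksum simply because it gives the same answer for the same path on every run.

```ruby
require "zlib"

# Hypothetical helper: map an asset path to one of `count` numbered hosts.
# The same path always lands on the same host, so the browser's cache
# stays useful across page loads.
def asset_host_for(source, count = 4, domain = "example.com")
  index = Zlib.crc32(source) % count
  "http://a#{index}.#{domain}"
end

%w[/images/logo.png /stylesheets/site.css /javascripts/application.js].each do |path|
  puts "#{path} -> #{asset_host_for(path)}"
end
```

All four host names can point at the same box; the browser only cares that the names differ.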

Our hangup, though, was that some assets, specifically CSS and Javascripts, did need to stay inside the Subversion repository and the site’s file tree.

Here’s the solution we came up with:

This way, we get all the benefits of asset hosts for assets which are under revision control, and assets which aren’t. And the asset hosts themselves let us have assets which aren’t necessarily under revision control.

The free bonus here was that by using a symbolic link to the javascripts directory, bundle_fu doesn’t have to know or understand our asset host setup; it just stores its bundled files in that same subdirectory as always, and it just works.

filed under: Ruby on Rails, System Administration | comments (0)

Mar
27
Y Be Slow?

When Yahoo! released its YSlow application last year as a plugin for the Firebug Firefox extension (because really, what web developer doesn’t have Firefox installed, even if it isn’t their primary browser?), nearly everyone installed the tool and started going down the list of rules Yahoo! laid down for improving “front-end performance” on websites. Several people wrote up suggestions for using the output to improve Rails apps, including a good summary for Nginx, but we’re using Apache.

“Front-end performance” means attacking speed as a problem between the browser and the server, and not as a problem which exists solely behind the server. We can optimize an application as much as we want, but if the browser makes thirty-five round-trips to the server fetching CSS, image files, and JavaScripts, application optimization isn’t helping much. In addition to grading apps on these fourteen points, YSlow gives a load time (in milliseconds!), and as your grade improves, you can also see the load time improving by perceptible intervals.

This afternoon I ran YSlow on the current development version of the La Cucina Italiana site. The initial grade was 52, an F. When I was done, it was 88, a B, and if I circumvent a dubious aspect of YSlow, it becomes a 98 A. I made only four changes: three Apache configuration tweaks and one Rails change. If you’re a Rails developer with Apache 2.2 in your stack, here are the low-hanging fruit for a better YSlow score. You can make these changes in your site configuration or within an .htaccess file.

ETags: The Yahoo! explanation of how ETags are important is a little confusing. The summary is this: you want to turn ETags off. They’re ineffective with multiple servers (i.e. an asset-server setup) and even if everything is on one host, there are other, just as useful means of avoiding downloads of cached files. So spare yourself that many bytes per connection and turn them off. It’s a one-line fix:

FileETag none

GZip components: The time saved by sending down compressed files is greater than the time spent compressing them, but only for text-based file types. (Images and PDFs are already compressed, so re-compression won’t help and might hurt.) If you’re using Coda Hale’s configuration for Apache and Mongrel (and many of us are) the code for compressing text components before download is already in your Apache configuration. However, Coda’s config misses one file type which Rails seems to use for Javascript, application/x-javascript, so YSlow keeps dinging us for uncompressed JavaScript files. With that added, the configuration for compression looks like this:

AddOutputFilterByType DEFLATE text/html text/plain text/xml application/xml application/xhtml+xml text/javascript text/css application/x-javascript
BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

(Note that there should be only four lines there: one AddOutputFilterByType and then three BrowserMatch lines.)

Far-future cache expiration: Yahoo! is looking for cache expiration dates really far in the future. The reasoning here is to maximize the odds that your users are arriving at your site with a “primed” cache, i.e. one where most of your components are already loaded. By putting a way-out expiration on components, users keep those bits in their cache longer. The flip side is that the URL of a file needs to change in order to prompt the browser to load a changed version. Rails does this by automatically appending a timestamp, as a query string, to the URLs of many components. (Check your page source and see what is actually getting called when you load a CSS file.) Therefore, we can set our default expires header way out in the future.

This is a two-line addition to the Apache configuration:

ExpiresActive On
ExpiresDefault "access plus 10 years"
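
The timestamp trick that makes a ten-year expiry safe can be sketched in a few lines of plain Ruby. This is an illustration of the idea, not Rails’ implementation; timestamped_path is a name we made up, and the temp file stands in for a real stylesheet.

```ruby
require "tempfile"

# Hypothetical helper: append the file's modification time to its public
# path as a query string. Editing the file changes the mtime, which
# changes the URL, which makes the browser fetch the new version.
def timestamped_path(public_path, file_on_disk)
  "#{public_path}?#{File.mtime(file_on_disk).to_i}"
end

# A throwaway file stands in for public/stylesheets/site.css.
Tempfile.create(["site", ".css"]) do |file|
  file.write("body { margin: 0; }")
  file.flush
  puts timestamped_path("/stylesheets/site.css", file.path)
end
```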

Reducing connection count: This is the tough one. Default Rails apps load a zillion (OK, seven) Javascript files and at least one but possibly many more CSS files, and that’s before we start with page images. Most browsers will only open two or three connections at a time to a given server, so all the files are waiting in line to get loaded. You can maximize the number of concurrent connections by using an asset server; this fools the browser into opening more connections by giving it more hosts to connect to. (This is like opening more Starbucks in the same town.) That doesn’t really save you, though, and it boosts the number of DNS lookups the browser needs to run. To really cut the number of JS and CSS files, you need to bundle assets. It’s possible to bundle images, but that’s not a problem for LCI, which has only four or five images on the home page. Bundling JS and CSS is where the action is.

There are multiple ways to do this, including a nifty program called AssetPackager and the caching code built in to Rails 2.0. We went with what we saw as the simplest route, which was bundle_fu, but we have an eye on AssetPackager for future consideration. bundle_fu is a Rails plug-in which takes all the CSS and JS files for a template and concatenates them into one CSS and one JS file, sometimes even minifying the JS. It’s a quick installation, and not only does it convince YSlow to give your site a good grade on the number of connections, it also gives good grades for “Put JS at the bottom” and “Minify JS” because, hey, there’s only one file, right? This one move improved our YSlow score more than any other step.
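
At its core, bundling is just concatenation. The sketch below shows the idea in plain Ruby; it is not how bundle_fu is implemented, and the bundle helper and file names are ours, for illustration only.

```ruby
require "tmpdir"

# Hypothetical helper: concatenate source files, in order, into a single
# bundle, labeling each chunk with the name of the file it came from.
def bundle(paths, output_path)
  File.open(output_path, "w") do |out|
    paths.each do |path|
      out.puts "/* #{File.basename(path)} */"
      out.puts File.read(path)
    end
  end
  output_path
end

# Two throwaway scripts stand in for a page's JavaScript includes.
Dir.mktmpdir do |dir|
  a = File.join(dir, "a.js")
  b = File.join(dir, "b.js")
  File.write(a, "var a = 1;")
  File.write(b, "var b = 2;")
  puts File.read(bundle([a, b], File.join(dir, "bundle.js")))
end
```

Order matters: the files have to go into the bundle in the same order the page would have loaded them, or scripts that depend on each other will break.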

The score at this point: 88 points, a high B. We’re still suffering for not having minified our JS (a little, and we’ve decided not to bother, since it’s being gzipped anyway) and for not using a CDN.

Cheating: The CDN point is one of the most hotly-contested on the YSlow report card, because most sites as small as ours see very little return on the investment of putting their site on a Content Delivery Network such as Akamai. It’s possible, though, to get YSlow to turn a blind eye to your CDN-less-ness. Just add your own site’s domain to the list of CDN servers in YSlow’s preferences. Doing that jumped our score up to 98, pretty close to perfect.

Unfixable: One thing which may keep your site from ever having a good YSlow report is using a lot of outside components. Common Kitchen gets dinged because its page loads include calls to two different ad networks, not to mention multiple images from Amazon on some pages and the Google Analytics code. This makes for a lot of un-bundle-able scripts, multiple DNS requests, and a lot of components where we don’t control their ETags (or lack thereof), expiration header, or compression. Of course, we can’t control their server uptime, either, so it may be that the YSlow scores are the least of our problems!

filed under: Ruby on Rails | comments (1)

Mar
24
Building LAPACK and Ruby’s linalg on Mac OS X

Update, 7 September 2008: Before you actually do anything with this post, make sure to read the update and comments at the bottom.

Installing Ruby’s linalg linear algebra library on a Mac OS X system is problematic because linalg is built around LAPACK, the Linear Algebra PACKage, and OS X (at least the version I’m working with, 10.4.11) ships with a bastard version of LAPACK which is missing some important symbols for linalg. The way around this is to install a full version of LAPACK.

You’d think this would be easy, but LAPACK is written in FORTRAN, and the version of gcc included with Xcode doesn’t include a FORTRAN compiler by default. So before we can install LAPACK, we need a FORTRAN compiler.

There are at least two gcc-based FORTRAN compilers out there, and they both offer pre-compiled binaries for Mac OS X (Intel or PowerPC). You can build from source if that’s how you want to do it, but I want this over with, so I’m grabbing the Intel binaries and getting on with my life. LAPACK seems to have trouble dealing with the g95 compiler, so we went with gfortran. gfortran has a nice Mac-like .dmg installer, so you can just download that, click through the installation, and gfortran is ready in /usr/local/bin/ (which is hopefully in your $PATH). You can make a link so that the g77 command (the old gcc compiler for the FORTRAN77 standard) points to gfortran with this command:

sudo ln -s /usr/local/bin/gfortran /usr/local/bin/g77

Now you’ll want to start in on LAPACK. Download the tarball from http://www.netlib.org/lapack/lapack.tgz and store it in /usr/local/src/ as before. Unpack with

tar xzvf lapack.tgz

You’ll have created a directory named e.g. lapack-3.1.1. cd into this directory. What’s missing from LAPACK is the standard ./configure step; we’ll have to edit the make.inc file ourselves before running make to build the package.

Fortunately, Robert Hatcher builds LAPACK as part of his CERNLIB build, which means that the shell commands for creating a working LAPACK make.inc are available as part of that script. Here’s the relevant excerpt:

# customize makefile
sed -e 's/_LINUX/_DARWIN/' make.inc.example > make.inc
echo "" >> make.inc
echo ".SUFFIXES : .f .o" >> make.inc
echo "" >> make.inc

# go ahead and build - "all" will perform tests
make blaslib lapacklib tmglib > make.log 2>&1
if [ $? -ne 0 ] ; then
    echo "*** Error in make blaslib lapacklib tmglib ***"
    grep -i err make.log
fi

Now: this step will take a while. If you copied this all into a file and ran it as a shell script (the sane thing to do, I think), it will take a good while to run, on the order of ten or fifteen minutes; if you are keying in the commands line by line, it will pause long after the make blaslib lapacklib tmglib line. Don’t panic; this means it’s working. (If you’re paranoid and like seeing stuff stream across your screen to prove you’re compiling something, you may want to background the process and then use tail -f make.log to get the full output.)

Once it’s done, it’s time to put these files where they belong:

sudo cp blas_DARWIN.a /usr/local/lib/libblas.a
sudo cp lapack_DARWIN.a /usr/local/lib/liblapack3.a

(Note that it may be the case that there are faster BLAS libraries out there; if you’re squeezing every cycle out of your app, it may be worth looking into that, but it’s beyond the scope of this post.)

Unfortunately, we’re still not done. linalg still needs several libraries from the f2c package, which is quite hard to dig up. The best route I’ve found is to grab the package available through Fink. The trick is that Fink installs libraries in /sw/lib/ and we need them elsewhere (/usr/local/lib/ should work). Use a link to solve that:

sudo ln -s /sw/lib/libf2c.a /usr/local/lib/

Now it’s (finally) time to install linalg. Unfortunately, there’s no gem available for this that I’m aware of. (This may be because the package has been essentially “done” for four or five years, so it’s older than the widespread use of gem.) Download the tarball from the project page to /usr/local/src/ and un-tar it; you’ll get a folder named linalg-0.3.2. cd into that folder, and you should be able to use

sudo ruby install.rb

…but you can’t, actually. This builds most of the files you need, barring two; it will start the installation, but eventually stall because it’s missing two .so files, ext/linalg/linalg.so and ext/lapack/lapack.so. These are “Shared Object” files, akin to Windows DLLs, but the Makefiles in these directories define the DLLIB macro as ending with the .bundle extension, and linalg.bundle is what gets built.

So, we brute-force it by breaking the process down into “make” and “install”, and in between we create those .so files.

ruby install.rb make
cp ext/linalg/linalg.bundle ext/linalg/linalg.so
cp ext/lapack/lapack.bundle ext/lapack/lapack.so
sudo ruby install.rb install

If you don’t trust this hack, put in ruby install.rb test before the install task to verify that everything works. I’m not sure why the package tries to install an .so file its own makefiles don’t build; if someone can figure that out and patch it, I’m sure the maintainers would love to know.

If you find any obvious errors in this, or see some steps we can skip, feel free to comment and we’ll make edits. Hopefully this will come in handy for someone.

Update, 7 September 2008: Be sure to read through the comments to where James Lawrence, linalg’s maintainer, points out the new (as of yesterday) 1.0.0 release which resolves most of these problems. If you’re struggling with linalg and aren’t using 1.0.0, try that new version.

filed under: Common Running, Ruby on Rails | comments (12)

Dec
8
Restarting Mongrel clusters with Capistrano 2

There’s (still) a glitch between the mongrel_cluster gem and Capistrano 2 (we’re using mongrel_cluster 1.0.5 and Capistrano 2.1.0, for reference) where the application restart at the end of a cap deploy fails with an error like this:

Couldn't find any pid file in '/var/www/[application]/current/tmp/pids' matching 'dispatch.[0-9]*.pid'

I’m not sure what’s causing this, but the solution comes at the end of this post, under the heading “Restart Mongrel.” Due to issues with Cap 2 and sudo, though, the provided script fails for us. We’re running the updates as root (bad idea, but it gets around the cap sudo issues) so I updated the task like this:

# Restart task

set :mongrel_config, "/etc/mongrel_cluster/#{application}.yml"

namespace :deploy do

	task :restart do
		run "mongrel_rails cluster::restart -C #{mongrel_config}"
	end

end

I also commented out the “:mongrel_conf” variable from our previous configuration, which Capistrano appeared to be ignoring anyway.

filed under: Ruby on Rails, System Administration | comments (0)

Nov
15
More elegant Mongrel restarts

Having explained our hack for bringing back Mongrel after a server crash, I discovered that our hosting company has a different approach. Their method has the advantage of not requiring scripts with hard-coded paths; on the other hand, it does require you to patch Mongrel itself, which makes things interesting come upgrade time. (Then again, the hypothetical Mongrel update may incorporate this patch, or avoid the problem some other way.)

Their method patches Mongrel to test whether the processes enumerated in the PID file(s) are actually still running. If the process is dead (that is, the PID file belongs to a Mongrel instance which died in a server crash,) the file is declared “stale” and cleared, allowing Mongrel to start up properly; if the process exists, the PID file is not stale, and Mongrel aborts startup as it was originally designed.

The method is detailed here (scroll down to the heading, “Stale PID files preventing Mongrel to start up,” ungrammatical though it may be.)
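
The staleness test itself is simple enough to sketch in plain Ruby. To be clear, this is our own illustration of the idea and not the hosting company’s patch; stale_pid_file? is a hypothetical name.

```ruby
require "tempfile"

# Hypothetical helper: true when the PID recorded in pid_file no longer
# names a running process.
def stale_pid_file?(pid_file)
  pid = File.read(pid_file).to_i
  return true if pid <= 0   # empty or garbage file: nothing could be running
  Process.kill(0, pid)      # signal 0 sends nothing; it only checks existence
  false
rescue Errno::ESRCH
  true                      # no such process, so the file is stale
rescue Errno::EPERM
  false                     # the process exists but belongs to another user
end

# Write this script's own PID to a throwaway file; it is clearly not stale.
Tempfile.create("mongrel.pid") do |file|
  file.write(Process.pid.to_s)
  file.flush
  puts stale_pid_file?(file.path)
end
```

One caveat: after a reboot the operating system may hand an old PID to some unrelated process, so a check like this can occasionally mistake a stale file for a live one.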

filed under: Ruby on Rails | comments (0)

Nov
14
Bringing Mongrel back from a server crash

When a server crashes, Mongrel (or a Mongrel cluster) obviously doesn’t get a chance to shut down cleanly. This means it leaves behind the files it uses to store its process IDs. When the server restarts, the Mongrel startup script attempts to start the daemon(s), but on finding these PID files are already present, assumes (incorrectly) that Mongrel is already running, and cancels startup, saying, “PID file log/mongrel.pid already exists. Mongrel could be running already. Check your log/mongrel.log for errors.”

This is technically correct behavior–after all, what if Mongrel really is already running?–but it makes it nearly impossible to bring Mongrel back automatically after a server crash; one would have to manually delete the PID file(s) and then start the daemon(s).

If you don’t have systems administrators tending your websites 24/7, you need a better solution. We considered hacking the init script (found at /etc/rc.d/init.d/mongrel_cluster on our Fedora Core server) but found that the necessary logic made the script too complicated. Instead, we created a new startup script, filed at /etc/rc.d/init.d/mongrel_cleanup to solve the problem.

The mongrel_cleanup script is set to run at the same run-levels as mongrel_cluster. On shutdown or restart, it does nothing, but on start, it checks for the presence of the PID files and deletes them if they’re found. It therefore has to run before mongrel_cluster, which is why the priority number is 84 for startup and 16 for shutdown: mongrel_cluster is 85 and 15.

To use this script, save it in /etc/rc.d/init.d/mongrel_cleanup (or whatever the appropriate script directory is) and then put it in the startup queue with these commands:

# chkconfig --add mongrel_cleanup
# chkconfig --level 345 mongrel_cleanup on

Also, edit this script. I’ve hardwired the paths and names of our Mongrel cluster PID files; you will want to change your paths, or let me know if you come up with a more elegant method!

#!/bin/bash
#
# Parker Morse for Common Media, Inc., 9 November, 2007
#
# mongrel_cleanup      Startup script to recover from crashes.
#
# chkconfig: - 84 16
# description: A hack to clear PID files left behind by Mongrel clusters
#              after an unscheduled server crash. Checks for the presence
#              of these files and deletes them if found.
#              

RETVAL=0
PIDFILE_DIR=/path/to/app/current/log

# Gracefully exit if the controller is missing.
#which mongrel_cluster_ctl >/dev/null || exit 0

# Go no further if config directory is missing.
#[ -d "$CONF_DIR" ] || exit 0

case "$1" in
    start)
      if test -s $PIDFILE_DIR/mongrel.8000.pid
          then
          /bin/rm $PIDFILE_DIR/mongrel.8000.pid;
      fi
      if test -s $PIDFILE_DIR/mongrel.8001.pid
          then
          /bin/rm $PIDFILE_DIR/mongrel.8001.pid;
      fi
      if test -s $PIDFILE_DIR/mongrel.8002.pid
          then
          /bin/rm $PIDFILE_DIR/mongrel.8002.pid;
      fi
      if test -s $PIDFILE_DIR/mongrel.8003.pid
          then
          /bin/rm $PIDFILE_DIR/mongrel.8003.pid;
      fi
      RETVAL=$?
  ;;
    stop)
      exit 0
  ;;
    restart)
      exit 0
  ;;
    *)
      echo "Usage: mongrel_cleanup {start|stop|restart}"
      exit 1
  ;;
esac      

exit $RETVAL
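
One way to avoid hard-wiring each PID file name is to match them with a glob. Here is a sketch of that idea in Ruby rather than bash; clear_pid_files and the mongrel.*.pid pattern are our assumptions, so adjust them to whatever your cluster actually writes.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: delete every PID file in dir matching pattern and
# return the list of paths that were removed.
def clear_pid_files(dir, pattern = "mongrel.*.pid")
  leftovers = Dir.glob(File.join(dir, pattern))
  FileUtils.rm_f(leftovers)
  leftovers
end

# Demonstrate against a throwaway directory instead of a real log directory.
Dir.mktmpdir do |dir|
  %w[mongrel.8000.pid mongrel.8001.pid mongrel.log].each do |name|
    File.write(File.join(dir, name), "12345")
  end
  removed = clear_pid_files(dir)
  puts "removed #{removed.length} PID files"
  puts "left behind: #{Dir.children(dir).inspect}"
end
```

Like the init script, this should only run at boot, before the cluster starts; run it against a live cluster and you will delete the PID files of running Mongrels.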

filed under: Ruby on Rails, System Administration | comments (2)

© 2008 Common Media, Inc.