Mongrel Cluster and Apache Need Memory

I use a VPS hosted by SliceHost as my personal server. SliceHost uses Xen to host multiple instances of Linux on a single machine. The performance of this setup has been very good.

I have been running:

  • Apache 2.2 with PHP
  • MySQL 5
  • Postfix Mail Server
  • Courier IMAP Server
  • ssh for remote access of course

I recently started playing with a site built using Radiant CMS which is itself built on top of Ruby on Rails. So, I’ve added to the mix:

  • 3 Mongrel instances running under mongrel_cluster

These Mongrel instances are proxied behind Apache using mod_proxy_balancer as described here. This setup works very well and is increasingly becoming the de facto standard for deploying Rails applications. Even the Ruby on Rails sites themselves are deployed this way now. It allows you to serve all of your dynamic content through Rails and all of your static content through Apache. This gives you all of the speed and robustness that Apache has to offer (after all, it serves over 50% of the sites on the internet) for static content, without burdening Mongrel with that task.
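For reference, the balancer piece of the Apache configuration looks roughly like the sketch below. It is only a sketch: the balancer name and the ports (8000-8002) are assumptions for three Mongrel instances, and the rewrite rules are the usual trick for letting Apache serve any file that exists under public/ while proxying everything else to Mongrel.

<Proxy balancer://mongrel_cluster>
    # one BalancerMember per Mongrel instance (ports are assumed)
    BalancerMember http://127.0.0.1:8000
    BalancerMember http://127.0.0.1:8001
    BalancerMember http://127.0.0.1:8002
</Proxy>

DocumentRoot /path/to/railsapp/public

RewriteEngine On
# if the requested file exists under public/, let Apache serve it directly...
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
# ...otherwise hand the request to the Mongrel cluster
RewriteRule ^/(.*)$ balancer://mongrel_cluster%{REQUEST_URI} [P,QSA,L]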

I noticed that the site was pretty slow, though. I tracked it down to the fact that I had started using too much memory. I was running the site on a VPS with 256M of RAM, but the new Mongrel instances had pushed the server into swap. Web applications in general are happier with more RAM, and in this case that was definitely borne out. I upped the VPS to 512M of RAM and things became VERY SNAPPY! I didn't do a scientific before-and-after comparison, but page loads prior to the upgrade were taking about 5-10s; after the memory increase you can't tell whether a page is static or dynamic.

So, if you're running into performance issues with Mongrel behind an Apache mod_proxy_balancer setup, check your memory. If you are dipping into swap space, you are likely to see serious performance issues. Let me know of any other simple tweaks to get more performance out of this setup if you have them.
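A quick way to check is with the standard Linux tools (nothing specific to this setup):

# look at the swap line -- lots of "used" swap is a bad sign
free -m
# watch the si/so columns -- non-zero values mean the box is actively swapping
vmstat 5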

As an aside:
Big kudos to SliceHost on their VPS upgrade capabilities. I clicked 2 buttons on my web-based management console and about 10 minutes later I was running on a bigger VPS. You can’t ask for much better than that if you need to scale up a server!

Update:
Apparently Lighttpd and Nginx both support running PHP applications under FastCGI. You might want to try that kind of setup if you are so inclined. I'm still an Apache partisan.

Subversion Maintenance Scripts

With the recent release of Subversion 1.4, I'm going through the process of updating my repositories. I often like to use multiple repositories, because Subversion versions the entire repository with each checkin (atomic commits rule!), so unrelated projects don't really belong in a single one. When I do that, I group all of the repositories under the same parent directory. That makes it really easy to configure Apache with the SVNParentPath directive, and if you're using svn+ssh or something like that, it's just easier to remember where they are.

SVNParentPath Example

With this configuration it's very easy to bring up new repositories, because adding one doesn't require any configuration change in Apache.

<Location /svn>
    DAV svn
    SVNParentPath s:/svn
    SVNListParentPath on
    SVNAutoversioning on
 
    # authentication
    AuthType Basic
    AuthName "Subversion Access"
    AuthUserFile s:/logins.txt
    # Access is configured in the access file.
    AuthzSVNAccessFile s:/access.txt
    Require valid-user
</Location>
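The logins file is a standard htpasswd file (created with htpasswd -c s:/logins.txt someuser), and the access file uses the normal AuthzSVNAccessFile syntax. A minimal example, with placeholder user and repository names:

# s:/access.txt
[groups]
devs = alice, bob

# with SVNParentPath, sections are keyed by repository name
[myproject:/]
* = r
@devs = rw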

The one obvious downside to having multiple repositories is managing things like backups and upgrades. You have a lot more to do than just dump and load a single repository. So to help handle that task, I created some simple shell scripts.

Dump All Repositories

This script will dump all of the SVN repositories in a given directory.

#!/bin/sh
 
# Assumes each directory is an SVN repository
# and creates a dump file for each of them.
 
for i in *; do
    # skip anything that isn't a directory (e.g. dump files from a previous run)
    [ -d "$i" ] || continue
    svnadmin dump "$i" > "${i}.dump"
done
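Run it from the parent directory that holds the repositories (the SVNParentPath directory above). The script name below is just whatever you saved it as:

cd /path/to/svn-parent
sh dump-repos.sh    # produces one .dump file per repository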

Recreate/Load All Dump Files

This script will take all of the dump files, create a new repository for each one, and then load the dump file into it. This is useful for major version upgrades such as 1.4, which changed some internal structures and, for example, improved the efficiency of storing binary files.

#!/bin/sh
 
# Assumes a directory full of dump files
# Creates a new SVN repository and loads the dump file into it
 
for i in *.dump; do
    # ${i%%.dump} strips the .dump suffix, so Foo.dump -> Foo
    repos=${i%%.dump}
    svnadmin create "$repos"
    svnadmin load "$repos" < "$i"
done

Create New Repository Script

This is a handy one I keep around to help enforce best practices. It creates a repository and then automatically creates the trunk/, branches/, and tags/ directories.

#!/bin/sh
 
if [ -z "$1" ]
then
    echo "Usage: $0 <repository>";
    exit 1;
fi
 
# set up directories
cur_dir=`pwd`
# use these 2 if you're running on Cygwin under Windows
# cur_cyg_dir=`pwd`
# cur_dir=`cygpath -m $cur_cyg_dir`
svn_dir=${cur_dir}/${1}
 
svnadmin create "$1"
svn mkdir -m "Create default structure." "file:///${svn_dir}/trunk" "file:///${svn_dir}/tags" "file:///${svn_dir}/branches"
 
echo "done"

Not that these are incredibly complex shell scripts, but they can do a lot of work while you do other things (like post to your blog). I hope they’ll help someone out there.

And just a word of warning: if you're doing this on a bunch of big repositories, it can take some time!