Baseline Solutions Demo

We’ve mentioned Baseline Solutions before – it is a project we’re very proud of.

Baseline® is a suite of tools which allows a contract reviewer to integrate best practices and shared knowledge directly into a document. The tool is for anyone who reviews contracts: controllers, CFOs, CEOs, paralegals, contract administrators, contract managers and lawyers.

Baseline works by reading the uploaded contract and generating a snapshot summary of it. With a further click of the mouse, Baseline generates a markup of the document in Microsoft Word track changes mode, incorporating best practices. The best practices take the form of inserted clauses, deleted clauses and hyperlinks to a Knowledgebase that explains the clauses further.

Scott Soloway, founder and President of Baseline Solutions Corporation, has recorded a video demo if you’re interested in seeing exactly what the app does:

Simple Perl scripts for OS X users using Platypus

Maybe everyone who’s ever done anything with Perl has written an email extraction script, but a recent client request asked us to take it one step further: allow the script to function as a Mac OS X “droplet.” In other words, the script should be an icon in the Finder, and when files are dragged and dropped on the icon, it should run the script.

This was easy in the days of Mac OS 9 and MacPerl. Because there was no Unix core to the Mac OS, MacPerl was the only Perl environment, and it took care of things like drag-and-drop file access (after all, that was the only way to provide a filename as an argument to a Perl script on a Mac). Nowadays, with a standard Unix-y Perl shipping with the Mac OS, MacPerl is no longer needed, but the handy drag-and-drop functions aren’t there.

Filling the gap is Platypus, a handy little utility which makes application bundles out of Unix scripts. Notice I didn’t say “Perl” there. Platypus plays nicely with shell scripts, Python, PHP, Ruby, and Tcl just as easily as Perl. As the developer describes it, “this is done by wrapping the script in an application bundle directory structure along with an executable binary that runs the script.”

Advanced options allow you to configure how script output is shown, whether to allow dropped files as input (you can also filter which file types are accepted) and what to do at script conclusion.

Our email extraction utility originally spat out the addresses it found on STDOUT. With the script output displayed in a text window, we found that, for some reason, only one input file was read. We adjusted the script to write its output to a file and used the text window to show which files were read and the path to the output file (as well as any errors). With that adjustment, multiple input files worked.
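For readers who want to try the same arrangement, here is a minimal sketch of that adjusted flow. It is written in Python rather than Perl (Platypus wraps either), and it is not the client script: the regular expression, the function names and the output filename are all assumptions made for illustration. It relies on Platypus handing dropped files to the script as command-line arguments.

```python
import re
import sys
from pathlib import Path

# Deliberately loose pattern; an assumption, not the regex from our script.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_addresses(paths):
    """Return (sorted unique addresses, files read, error messages)."""
    found, read, errors = set(), [], []
    for path in paths:
        try:
            text = Path(path).read_text(errors="replace")
        except OSError as exc:
            errors.append(f"{path}: {exc}")
            continue
        found.update(match.lower() for match in EMAIL_RE.findall(text))
        read.append(str(path))
    return sorted(found), read, errors


def main(argv):
    # Dropped files arrive as ordinary arguments after the script name.
    addresses, read, errors = extract_addresses(argv[1:])
    for name in read:
        print(f"Read {name}")
    for err in errors:
        print(f"Error: {err}")
    if not read:
        print("No readable input files; nothing written.")
        return
    # Hypothetical choice: write the results next to the first input file.
    out_path = Path(read[0]).parent / "extracted-addresses.txt"
    out_path.write_text("\n".join(addresses) + "\n")
    print(f"Wrote {len(addresses)} addresses to {out_path}")


if __name__ == "__main__":
    main(sys.argv)
```

Everything printed ends up in the Platypus text window, so the window reports progress while the addresses themselves go to the file.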

If you’ve ever wanted to double-click your scripts rather than running them from the command line, give Platypus a look.

Our graduate work still going strong

While we’re tooting our own horns, an interesting link arrived at our browser this afternoon via a roundabout route. According to EduDemic, the college search site collegesurfing.com ran a feature during the Winter Olympics of “The Web 2.0 College Olympics,” running down “50 Social Media Innovators in Higher Education.”

Noah and Parker’s graduate alma mater, Tufts University, topped the “gold medal winners” list, and while the description leaned heavily on Twitter and Facebook for all contestants, both EduDemic and collegesurfing.com specifically cited Tufts’ “Spark” project. Parker was part of the original Spark team as part of his graduate work, hacking blog software and working on the XSLT bridges Spark used to tie together the project components.

Quick HTML/CSS demo

People often ask us what we do when a designer sends over a PSD, so I made some screen captures from the early stages of a recent build. Next time we’ll do a longer one, right through getting all the fonts and images perfect.

Design by our friend Justin Zucco, and I’ll post a link to the live site (Balmat Law) when it goes live in a week or two.

http://www.youtube.com/watch?v=mSC84CUx_Uk

Nginx configuration tweaks for client-side speed optimization

As we move more sites to an Nginx/Passenger server stack, we need to translate the server-side optimizations we use for browser-side performance from Apache configuration to Nginx. Here’s how that comes over:

Gzip compression

In Nginx, gzip support comes from a module that is compiled in by default but can be excluded at build time. If it’s present, you can activate it in site configuration with a block like this:

 gzip on;
 gzip_min_length  1100;
 gzip_buffers     4       8k;
 gzip_proxied     any;
 gzip_types       text/plain text/css application/x-javascript text/xml application/xml application/rss+xml text/javascript;

Note that Nginx always compresses text/html, so adding it to gzip_types is redundant and produces a “duplicate MIME type” warning.
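The gzip_min_length setting is there because very small responses can come out larger once compressed. The following Python sketch only illustrates that trade-off; it is not how Nginx decides (Nginx compares the setting against the response’s Content-Length header), and the function name is made up for this example.

```python
import gzip


def gzip_savings(body: bytes) -> int:
    """Bytes saved by gzip; negative when compression makes the body larger."""
    return len(body) - len(gzip.compress(body))


tiny = b"ok"
page = b"<p>hello world</p>\n" * 200

# The gzip header and trailer outweigh a two-byte body.
print("tiny response saves", gzip_savings(tiny), "bytes")
# Repetitive markup well above the threshold compresses nicely.
print("larger response saves", gzip_savings(page), "bytes")
```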

Long expiration dates

We want our assets to stay in the user’s browser cache as long as possible, with a few exceptions (e.g. advertising assets). To that end, we set expiration headers:

       location ~* \.(js|css|jpg|jpeg|gif|png)$ {
                if (-f $request_filename) {
                   expires                        max;
                   break;
                }
       }

This uses the filename extension to determine which files get “far future” expiration dates. If you’re used to naming files without extensions, this might not work so well for you.
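To see which request paths that pattern would catch, here is a small Python sketch using the same regular expression. It only mimics the extension test; the real block also checks that the file exists on disk, and Nginx matches locations against the URI without its query string. The function name is an assumption for this example.

```python
import re

# Same pattern as the location block; IGNORECASE mirrors Nginx's "~*" modifier.
STATIC_ASSET = re.compile(r"\.(js|css|jpg|jpeg|gif|png)$", re.IGNORECASE)


def gets_far_future_expiry(uri_path: str) -> bool:
    """True when the path ends in one of the listed extensions."""
    return STATIC_ASSET.search(uri_path) is not None


for path in ("/css/site.css", "/img/LOGO.PNG", "/downloads/report", "/js/app.js.map"):
    print(path, gets_far_future_expiry(path))
```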

By default, Nginx doesn’t use ETags, so we can stop worrying about that.

Surviving Django traffic spikes with Nginx and memcached

Normally I wouldn’t categorize a multi-national organization with a seven-figure annual budget as a “small” client, but we didn’t build the site for the World Marathon Majors, a sort of mass-marathon trade group made up of the Boston, London, Berlin, Chicago and New York City marathons. Instead, we inherited their Django-based site from a previous partner, and our job is to keep it running, fix what breaks, and make modest upgrades.

The site’s traffic is flat for most of the year, but race day for any of the five events produces a spike in the traffic graph resembling a shark fin in a calm sea. Last month, with both Boston and London on the horizon, we took steps to ensure the site and its new hosting could handle the load.

Because the site does not ask users to register, we recognized that we could cache complete HTML pages as they were rendered by Django and expire them after an arbitrary time span. If we kept rendered pages in the cache for ten minutes, for example, no page would be rendered more frequently than once every ten minutes, and yet news updates would never be delayed by more than ten minutes waiting for a cache refresh.

Because we were already using Nginx as the front-end of our server stack, we followed this stellar write-up by Alex Holt describing a simple system where Django writes to the cache, and Nginx reads from it. By using memcached for our cache rather than writing to disk or a database, we realized a little extra benefit: whenever Nginx has a cache hit, the page is returned significantly faster than if there’s a cache miss and Django has to render the page. This isn’t surprising when you think about the whole stack (of course a simple read will always be faster than a dynamic page build with multiple database queries), but it gave us a nice Zen server moment. By doing less work, the server was getting more done.
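The division of labor can be sketched in a few lines of Python. This is a toy model, not our configuration or the one from the write-up: a dictionary stands in for memcached, one function plays the Nginx role (read the cache, fall back to the application), and a callback plays the Django role. All names here are assumptions for illustration.

```python
import time


class PageCache:
    """In-memory stand-in for memcached: entries expire after ttl seconds."""

    def __init__(self, ttl=600, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        html, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: behave like a miss
            return None
        return html

    def set(self, key, html):
        self._store[key] = (html, self.clock() + self.ttl)


def serve(path, cache, render):
    """Front-end role: answer from the cache, else ask the application."""
    html = cache.get(path)
    if html is not None:
        return html, "hit"
    html = render(path)    # application role: build the page...
    cache.set(path, html)  # ...and store it for the next request
    return html, "miss"
```

With a ten-minute ttl, the render callback runs at most once per path every ten minutes, no matter how many requests arrive in between.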

And the whole stack survived London with flying colors. Now we just have to figure out how to unwind the site’s designed-in assumption that UK is a legitimate language code for English.  (Hint: People who speak “UK” use the Cyrillic alphabet.)

Common Running on hold

We’ve shut down the running application at commonrunning.com, and redirected all traffic to that domain to this post.

The Common Running idea suffered from some serious drawbacks. One of those was a cold start problem, where the functionality of the site for early adopters was limited until we had reached a critical mass of user input (a level the site never reached). Another was the problem of basic data; as long as we were soliciting reviews, we were obligated to maintain a comprehensive, high-quality database of current and recent running shoe models, and we were unable to meet that obligation while also performing paying client work.

We still hold out some hope for the CR technology model, perhaps with a plan in place to address the problems outlined above. For the time being, however, we won’t pretend the site is useful and will be concentrating our energy on other projects.

(Check the collection of Common Running posts for some bits of the two-year history of the site.)

26.2 miles of data

We work on an occasional consulting basis with the B.A.A., organizers of the Boston Marathon. The 114th running of that venerable race happened on Monday, April 19th. Quite a bit has changed in the operation of a major marathon since 1897, and Network World produced this report on how the B.A.A.’s other technology partners handle the information stream created by ~25,000 runners (23,176 starters and 22,678 finishers according to the B.A.A.) who, despite participating in one of the world’s oldest and simplest sports, have since the mid-1990s been accompanied by RFID chips.

If you’re curious about transponder-based timing technology, I wrote an article for New England Runner in late 2008 which, while already obsolete in terms of the current state of the art, pretty much captured things at the time. The video above was brought to our attention by Josh Merlis of Albany Running Exchange, whose event production team does some pretty nifty stuff with transponder data, PHP, and occasional peripherals like on-course digital cameras.

We’re hiring!

We’re hiring! We are currently in need of a front-end website developer.

We’re a small custom development house. About half our current work is building new custom Content Management Systems, and the other half is ongoing feature development for existing clients. We’re conveniently located right downtown in Amherst, overlooking the Common.

XHTML/CSS and recent web development experience are necessary. A design background, Ajax, JavaScript and/or advanced CSS-fu are definite pluses. We’re looking for someone excited about the modern web, and the exact details of your résumé are less important than whether you are “smart and get things done.”

Primary responsibilities will include:

  • Dynamic front-end web development using XHTML and CSS.
  • Updates, new features and maintenance for existing web applications, and new development to client specifications.
  • Thoroughly testing your own code, as well as code written by others.

Necessary skills and attributes:

  • Ability to write consistent, standards-compliant XHTML and CSS.
  • Ability (and eagerness) to learn new skills on the job.
  • Attention to detail and production schedules.

Compensation commensurate with experience, plus benefits. Please submit a résumé and cover letter to jobs@commonmediainc.com. This is an on-site position only; no telecommuters, please.

ETA: We have all the applications we need at the moment, thank you. We’ll update this page if that changes.

Rejecting mail for valid local users

Several months ago we mentioned that certain Linux distributions will, if they are running an SMTP server and accepting mail for local users, accumulate spam for “role” users who will never read their mail, e.g. mail, uucp, news, etc.

I finished by suggesting,

The best shortcut here is to bounce any email destined for the role users. This will vary depending on your MTA, so I won’t detail it here.

It’s true that it varies by MTA, but it turns out it’s really hard to find this information. MTAs aren’t set up to reject mail for specific addresses; they want a list of valid addresses, and they’ll reject everything else.

It turns out that there’s a faster and cleaner method which is MTA-independent: lock the mailboxes for those users.

This could be as simple as putting an empty file in /var/spool/mail/uucp (for example) which is owned by root with 600 permissions, but that’s going to generate a bunch of error messages when your local delivery agent tries to write to a file it doesn’t have permissions for. A more elegant solution is to symlink those paths to /dev/null:

ln -s /dev/null /var/spool/mail/uucp

Now make sure the symlink has the correct ownership…

chown -h uucp:mail /var/spool/mail/uucp

Now all that spam will be silently delivered to the bit-bucket.
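If you have more than a couple of role users, the two commands can be wrapped in a short script. Here is a sketch in Python, assuming the same /var/spool/mail layout and mail group used above. The function name and its parameters are our own invention, and changing ownership requires running it as root.

```python
import grp
import os
import pwd


def blackhole_mailboxes(users, spool_dir="/var/spool/mail", group="mail", set_owner=True):
    """Symlink each user's mailbox to /dev/null; return the paths created."""
    created = []
    for user in users:
        path = os.path.join(spool_dir, user)
        if os.path.lexists(path):
            # Leave existing mailboxes (or earlier symlinks) alone.
            continue
        os.symlink(os.devnull, path)  # ln -s /dev/null <path>
        if set_owner:
            uid = pwd.getpwnam(user).pw_uid
            gid = grp.getgrnam(group).gr_gid
            os.lchown(path, uid, gid)  # chown -h user:group <path>
        created.append(path)
    return created
```

Calling blackhole_mailboxes(["mail", "uucp", "news"]) as root would cover the role users mentioned earlier. Existing mailbox files are skipped rather than overwritten, so check those by hand.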