Installing and Configuring MolProbity

Vincent B. Chen (vbc3@duke.edu), 09 Jan 2008

This document describes how to install and configure MolProbity and its associated software. The MolProbity suite provides tools for macromolecular structure validation through either a command-line or web-browser interface.

System requirements

MolProbity requires both a Unix runtime environment and a modern Java virtual machine (i.e. one from Sun Microsystems). It is known to run on x86 Linux systems and on Mac OS X. It is possible that with some modification MolProbity could be run on Windows under the Cygwin environment; if you'd like to try this project, please contact us.

It is important that you run Sun Java, and not GNU Java. GNU Java is not a complete or fully-compliant Java implementation, and in our tests it does not run the programs MolProbity needs.  Be warned that many modern Linux distributions have GNU Java pre-installed (e.g. as /usr/bin/java). You should download and install the "real" Java from java.sun.com. Mac OS X, on the other hand, has Sun Java pre-installed.
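
A quick way to see which Java is first on your PATH is to look at what "java -version" prints. The following is only a sketch: the function name java_flavor is made up for this example, and it assumes that GNU Java identifies itself with the strings "gij" or "libgcj" in its version text, which you should confirm on your own system.

```shell
# Sketch: classify the text printed by `java -version`.
# Assumption (not from the MolProbity docs): GNU Java mentions "gij" or
# "libgcj" in its version banner; anything else is reported as "other".
java_flavor() {
    case "$1" in
        *gij*|*libgcj*) echo "gnu" ;;
        *)              echo "other" ;;
    esac
}

# `java -version` writes to stderr, so redirect it in order to capture it.
if command -v java >/dev/null 2>&1; then
    java_flavor "$(java -version 2>&1)"
else
    echo "missing"
fi
```

If this reports "gnu", install Sun Java and make sure its bin/ directory comes before /usr/bin on the path MolProbity uses.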

In the lab, we use a quad-processor (two dual-core) Intel Core box running Mac OS X 10.4 (server) as the public server; we use the included versions of Apache and PHP. We also have a number of test servers: AMD Athlon X2 machines running Fedora (mostly with the default Apache and PHP), and Intel Core boxes running Mac OS X 10.4 with the default Apache and either the default PHP or PHP 5.0.4 from entropy.ch.

Configure the web server

Although MolProbity can be used just as a collection of command-line tools, most people will also want access to the web interface, at least for the monitoring and maintenance functions. In theory, any web server that supports PHP should be fine. In practice, we've never used anything but Apache. The following tips should cover all the cases that occur in a "normal" httpd.conf file on RedHat or Mac systems. (This file often lives in /etc/httpd/.) If you've made significant changes, you'll need to consider how those might impact MolProbity.
After these changes, you'll need to (re)start the webserver. You can do this from one of the graphical "services" tools in Linux, or from System Preferences | Sharing on the Mac.

Configure PHP

You'll need a pretty recent version of PHP in order to run MolProbity correctly. Due to a long-standing bug in the printf() function (of all things), you must have version 4.3.7 or newer. We recommend the latest stable build in the 5.x line. If you have to install this package yourself in order to get a new enough version, make sure that Apache uses the correct version (see PHP documentation) and that MolProbity has the correct version on its PATH (see below).
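
To check the 4.3.7 minimum from the command line, something like the following sketch may help. The helper name version_at_least is hypothetical; it compares up to three dot-separated numeric fields and ignores any vendor suffix on the third field (for example "4-2ubuntu5" is read as 4).

```shell
# Sketch: succeed if dotted version HAVE is at least WANT.
# version_at_least is a made-up helper, not part of MolProbity or PHP.
version_at_least() {
    have=$1
    want=$2
    old_ifs=$IFS
    IFS=.
    set -- $have
    h1=${1:-0}; h2=${2:-0}; h3=${3:-0}
    set -- $want
    w1=${1:-0}; w2=${2:-0}; w3=${3:-0}
    IFS=$old_ifs
    # Drop anything after the leading digits of the third field.
    h3=${h3%%[!0-9]*}; h3=${h3:-0}
    if [ "$h1" -ne "$w1" ]; then [ "$h1" -gt "$w1" ]; return; fi
    if [ "$h2" -ne "$w2" ]; then [ "$h2" -gt "$w2" ]; return; fi
    [ "$h3" -ge "$w3" ]
}
```

With the PHP command-line binary installed, a check could look like: version_at_least "$(php -r 'echo PHP_VERSION;')" 4.3.7 && echo "PHP is new enough". Keep in mind that this tests the php found on your shell's PATH, which is not necessarily the one Apache loads.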

The following settings often need to be changed in php.ini (usually found in /etc/):

Unpack MolProbity

There are at least two possible scenarios here. For production use, we recommend creating a new user account (we call him "moler"). You then unpack the MolProbity bundle in moler's home directory, and you end up with a bin/ directory, a public_html/ directory, etc. in keeping with the usual Linux conventions. This way, Apache is already set up to access MolProbity's public_html as http://your.web.server/~moler, and the other directories (like config/) are protected from view. If you want a better URL later on, you can make a symbolic link from somewhere in /var/www/html (or wherever your root web directory is) to /home/moler/public_html. In fact, you can make http://your.web.server/ be the address for MolProbity like this:
    cd /var/www && rmdir html && ln -s /home/moler/public_html html
(Don't try actually moving MolProbity's public_html somewhere else; it won't work. And don't install the whole MolProbity package under /var/www/html, or you'll also be exposing all the private data -- config/, bin/, etc. Not a huge risk, really, but don't do it.)

For quick testing and personal use, you can install an (insecure) copy of MolProbity in your public_html (Linux) or Sites (Mac) folder. Just remember that's a public_html within a public_html, so the URL's going to look something like http://your.web.server/~your.login/molprobity/public_html.

In either scenario, it's critical that the full path to the directory where MolProbity is installed not have any spaces or other funky characters in it. Alphanumerics, underscores, dashes, and dots are OK. Everything else is off limits. Those weird characters interfere both with passing paths in URLs (they have to be encoded and decoded) and with Unix command line programs (all paths have to be enclosed in extra quote marks). If you follow the suggestions above, you should be safe.
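
If you want to test a candidate install path before unpacking, a small check along these lines should do. This is a sketch: path_is_safe is a name invented here, and it accepts only the characters listed above plus the "/" that separates directories.

```shell
# Sketch: succeed only if the path is non-empty and contains nothing but
# alphanumerics, underscores, dashes, dots, and "/" separators.
# path_is_safe is a hypothetical helper, not something MolProbity ships.
path_is_safe() {
    case "$1" in
        ""|*[!A-Za-z0-9_./-]*) return 1 ;;
        *)                     return 0 ;;
    esac
}
```

For example, path_is_safe "$PWD" || echo "choose a different directory" warns you when the current directory has a space or other problem character anywhere in its full path.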

The latest stable release of MolProbity can be downloaded as a ZIP file from the Richardson lab website at http://kinemage.biochem.duke.edu. If you like to live on the bleeding (and sometimes broken!) edge, we can arrange a Subversion account for you to have access to the development code.

Configure MolProbity

After unpacking MolProbity, you should run the setup.sh script from the directory it resides in. It will create a few log files, ensure that the permissions are set properly on MolProbity directories, and make a few symlinks.

By default, all data from user sessions is stored under public_html/data/. If you want to store data somewhere else, you can just replace the public_html/data/ directory with a symlink to your chosen location. For this to work, you must allow Apache to follow symlinks, as suggested above. Remember to check the permissions and re-run the setup.sh script. A suggestion: make sure the basic installation works before you try this variation.
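
The relocation can be sketched as a move followed by a symlink. The function name relocate_data_dir and both example paths are assumptions for illustration; the function refuses to run if data/ is already a symlink or if the new location already exists, so it should not overwrite anything.

```shell
# Sketch: move public_html/data/ elsewhere and leave a symlink behind.
# relocate_data_dir is a hypothetical helper; adjust paths to your system.
relocate_data_dir() {
    mp_root=$1      # directory where MolProbity was unpacked
    new_data=$2     # new location; must not exist yet, its parent must
    data="$mp_root/public_html/data"
    if [ -L "$data" ]; then
        echo "data/ is already a symlink; nothing done" >&2
        return 1
    fi
    if [ -e "$new_data" ]; then
        echo "$new_data already exists; nothing done" >&2
        return 1
    fi
    mv "$data" "$new_data" && ln -s "$new_data" "$data"
}
```

A call might look like relocate_data_dir /home/moler /bigdisk/molprobity_data. Afterwards, check that the web server user can still read and write the new location, then re-run setup.sh as described above.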

What configuration there is for MolProbity is stored in config/config.php, which you can edit with any text editor. All the settings are documented in the file, but most are fine the way they are. You will probably want to change MP_BIN_PATH, to make sure MolProbity can find all the necessary helper programs. (Setting moler's PATH environment variable will not accomplish anything, because MolProbity runs in a web server environment.) The configuration checking script will let you know if anything's missing.

You may want to increase MP_REDUCE_LIMIT, which will let Reduce keep working on optimizing complicated H-bonding cliques at the cost of some additional processor time. Whether you can afford it depends on how many users you're supporting and how powerful your server is.

You may also want to change the lifetime of a session before it's automatically garbage-collected, and you may want to allocate more/less disk space for each user session.

Check configuration

If you skipped configuring your web server, you will now regret it. To make sure everything's configured properly, you should direct your browser to the check_config.php script under public_html/admin/. The URL will be something like http://localhost/~moler/admin/check_config.php. It will check the PHP settings suggested above and ensure it can find all the needed helper programs. It will also check and/or display version numbers for several of the core programs.

While you're at it, look around at the other admin tools. You can monitor usage of the site, remove or debug various working sessions, and more.

If you try a file:// URL, you're just going to see the raw PHP source code. You must access this script via your web server in order for it to get executed by PHP.

If you didn't configure a web server, you might try something like
    php -f public_html/admin/check_config.php > tmp.html
and then open tmp.html in a browser.

Secure the admin pages and user data

The reason most people (drug companies) run their own copy of MolProbity is to protect their non-public structural data. Given that, you'll want to read this section carefully. We've made MolProbity as secure as we know how, but there are at least two big loopholes in an out-of-the-box installation.

First, anyone can access the admin/ directory. This allows them to delete other users' sessions and get access to files within them. While this may occasionally be useful for an administrator, it's not the sort of thing you want everyone doing. If you're really paranoid, you'll want to delete that directory altogether, or at least move it somewhere that Apache doesn't have access to. Alternatively, you could change the ownership and permissions to keep Apache out. If you want reasonable (though not airtight!) security and the convenience of the admin tools, you can password protect the directory. See the Apache documentation for a full discussion along with the important caveats, but the basic process looks like this:

Create a file called .htaccess in the admin/ directory. Put something like this in it:
    AuthType Basic
    AuthName "My MolProbity server"
    AuthUserFile /home/moler/public_html/admin/.htpasswd
    Require user some_user_name
Then run a command like this to create the password:
    htpasswd -c /home/moler/public_html/admin/.htpasswd some_user_name
You may also have to enable password protection in your httpd.conf file with something like this:
    <Directory /home/*/public_html>
        AllowOverride AuthConfig
    </Directory>
Remember to restart the web server after making these changes.

The second security problem comes from Apache. All the user data is stored in directories under public_html/data/. That directory has to be visible to the web so users can get at their files, but if Apache provides an index of the data directory (as most default configurations do), anyone can see all the sessions and browse through their contents. This is easy to fix; just remove "Indexes" from the appropriate Options line in your httpd.conf file. Again, restart the server when you're done.
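
To locate the Options lines that still turn indexing on, a search like the one below may save some scrolling. It is a sketch: find_index_options is a made-up name, and it only looks at the one file you give it, so settings pulled in from other included configuration files or from .htaccess files will not show up.

```shell
# Sketch: list uncommented Options lines that enable Indexes.
# Lines that disable it ("Options -Indexes") are skipped.
# find_index_options is a hypothetical helper name.
find_index_options() {
    grep -n -i -E '^[[:space:]]*Options[[:space:]](.*[^-])?Indexes' "$1"
}
```

Run it as find_index_options /etc/httpd/conf/httpd.conf, substituting wherever your httpd.conf actually lives. Each hit is printed with its line number; no output means no matching line was found in that file.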

Troubleshooting / FAQ

There are no entries yet. Send your questions to vbc3@duke.edu.