Installing and Configuring MolProbity
Vincent B. Chen (vbc3@duke.edu), 09 Jan 2008
This document describes how to install and configure MolProbity and its
associated software. The MolProbity suite provides tools for
macromolecular structure validation through either a command-line or
web-browser interface.
- System requirements
- Steps for installation
- Configure the web server
- Configure PHP
- Unpack MolProbity
- Configure MolProbity
- Check configuration
- Secure the admin pages and user data
- Troubleshooting / FAQ
System requirements
MolProbity requires both a Unix runtime environment and a modern Java
virtual machine (i.e. one from Sun Microsystems). It is known to run on
x86 Linux systems and on Mac OS X. It is possible that with some
modification MolProbity could be run on Windows under the Cygwin
environment; if you'd like to try this project, please contact us.
It is important that you run Sun Java, not GNU Java. GNU Java is not a
complete or fully compliant Java implementation, and in our tests it
does not run the programs MolProbity needs. Be warned that many modern
Linux distributions have GNU Java pre-installed (e.g. as /usr/bin/java).
You should download and install the "real" Java from java.sun.com.
Mac OS X, on the other hand, has Sun Java pre-installed.
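One way to tell which Java is first on your PATH is to look at the banner printed by java -version. The sketch below is not part of MolProbity: the function name classify_java is made up for illustration, and it assumes that GNU Java identifies itself with "gij" or "gcj" somewhere in that banner.

```shell
# Classify a Java version banner read from standard input.
# Assumption: GNU Java mentions "gij" or "gcj" (e.g. "libgcj") in its banner.
classify_java() {
    if grep -q -i -E 'gij|gcj'; then
        echo "GNU Java (not suitable)"
    else
        echo "not GNU Java"
    fi
}
```

Because java -version writes its banner to standard error, a typical call would be: java -version 2>&1 | classify_java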
In the lab, our public server is a four-core machine (two dual-core
Intel Core processors) running Mac OS X Server 10.4 with the included
versions of Apache and PHP. We also have a number of test servers: AMD
Athlon X2 machines running Fedora (mostly with the default Apache and
PHP), and Intel Core machines running Mac OS X 10.4 with the default
Apache and either the default PHP or PHP 5.0.4 from entropy.ch.
Configure the web server
Although MolProbity can be used just as a collection of command-line
tools, most people will also want access to the web interface, at least
for the monitoring and maintenance functions. In theory, any web
server that supports PHP should be fine. In practice, we've never used
anything but Apache. The following tips should cover all the cases that
occur in a "normal" httpd.conf file on RedHat or Mac systems. (This
file often lives in /etc/httpd/.) If you've made significant changes,
you'll need to consider how those might impact MolProbity.
- Watch out for any LimitRequestBody directives. These can limit file
uploads severely, which will make MolProbity much less useful. Most
default configurations have no such limits, but I've seen a few that
do. If there are multiple config files, use grep to make sure.
- You should have an Options FollowSymLinks directive in the proper part
of your httpd.conf file. A totally vanilla MolProbity installation
shouldn't require this, but many of the proposed configurations covered
later use symbolic links.
- You may want to add a line that says "text/kinemage kin" to your
mime.types file (often /etc/mime.types or /etc/httpd/mime.types). This
helps browsers to download kinemage files correctly rather than show
them as text.
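The grep check mentioned in the list can be wrapped in a small shell function that searches a whole tree of config files at once. This is only a sketch: the name find_directive is made up here, and /etc/httpd is just the common location noted earlier, so adjust the path for your system.

```shell
# Print every line under a config directory that mentions a directive,
# prefixed with the file name and line number.
# Usage: find_directive DIRECTIVE CONFIG_DIR
find_directive() {
    grep -r -n -i -- "$1" "$2"
}
```

For example, find_directive LimitRequestBody /etc/httpd lists any upload limits, and find_directive FollowSymLinks /etc/httpd shows where symlinks are allowed. No output (and a non-zero exit status) means the directive does not appear anywhere.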
After these changes, you'll need to (re)start the webserver. You can do
this from one of the graphical "services" tools in Linux, or from
System Preferences | Sharing on the Mac.
Configure PHP
You'll need a pretty recent version of PHP in order to run MolProbity
correctly. Due to a long-standing bug in the printf() function (of all
things), you must have version 4.3.7 or newer. We recommend the latest
stable build in the 5.x line. If you have to install this package
yourself in order to get a new enough version, make sure that Apache
uses the correct version (see PHP documentation) and that MolProbity
has the correct version on its PATH (see below).
The following settings often need to be changed in php.ini (usually
found in /etc/):
- post_max_size and upload_max_filesize both limit
the size of uploaded files, and the default values are too small for
many large PDB files. In some cases, memory_limit may also need to
be increased.
- Of course, file_uploads must be set to 1 (true) in order for that to
matter.
- allow_url_fopen must be on.
- safe_mode must be off.
- "Magic quotes" should be disabled, or else your feedback emails
will get garbled.
- display_errors should be enabled, at least until you have a working,
debugged installation. After that, if you're paranoid, you can turn it
back off, but that will make debugging future problems much harder.
- session.use_only_cookies should be disabled, or it will appear that
MolProbity does nothing when anything is clicked on.
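Before editing php.ini, it can help to see what the file currently says about the settings listed above. The function below is a rough sketch, not a MolProbity tool: the name show_php_settings is made up, the path you pass in is whatever php.ini your Apache actually loads, and magic_quotes_gpc is our guess at the directive behind "magic quotes".

```shell
# List the active (uncommented) lines for the directives discussed above.
# Usage: show_php_settings /path/to/php.ini
show_php_settings() {
    grep -E '^[[:space:]]*(post_max_size|upload_max_filesize|memory_limit|file_uploads|allow_url_fopen|safe_mode|magic_quotes_gpc|display_errors|session\.use_only_cookies)[[:space:]]*=' "$1"
}
```

A typical call would be show_php_settings /etc/php.ini. A directive that does not show up is either commented out or absent, so PHP is using its built-in default for it.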
Unpack MolProbity
There are at least two possible scenarios here. For production use, we
recommend creating a new user account (we call him "moler"). You then
unpack the MolProbity bundle in moler's home directory, and you end up
with a bin/ directory, a public_html/ directory, etc. in keeping with
the usual Linux conventions. This way, Apache is already set up to
access MolProbity's public_html as http://your.web.server/~moler,
and the other directories (like config/) are protected from view. If
you want a better URL later on, you can make a symbolic link from
somewhere in /var/www/html
(or wherever your root web directory is) to /home/moler/public_html. In
fact, you can make http://your.web.server/
be the address for MolProbity like this:
cd /var/www; rmdir html; ln -s /home/moler/public_html html
(Don't try actually moving
MolProbity's public_html somewhere else; it won't work. And don't
install the whole MolProbity package under /var/www/html, or you'll
also be exposing all the private data -- config/, bin/, etc. Not a huge
risk, really, but don't do it.)
For quick testing and personal use, you can install an (insecure) copy
of MolProbity in your public_html (Linux) or Sites (Mac) folder. Just
remember that's a public_html within a public_html, so the URL's going
to look something like http://your.web.server/~your.login/molprobity/public_html.
In either scenario, it's critical that the full path to the directory
where MolProbity is installed not have any spaces or other funky
characters in it.
Alphanumerics, underscores, dashes, and dots are OK. Everything else is
off limits. Those weird characters interfere both with passing paths in
URLs (they have to be encoded and decoded) and with Unix command line
programs (all paths have to be enclosed in extra quote marks). If you
follow the suggestions above, you should be safe.
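If you're unsure about a candidate install location, the rule above is easy to test in the shell. The following is a minimal sketch; is_safe_path is a made-up name, and it treats the slash as allowed because it separates directories.

```shell
# Succeed only if the path is non-empty and uses nothing but alphanumerics,
# underscores, dashes, dots, and the slashes that separate directories.
is_safe_path() {
    case "$1" in
        "") return 1 ;;
        *[!A-Za-z0-9_./-]*) return 1 ;;
        *) return 0 ;;
    esac
}
```

For example, is_safe_path /home/moler/public_html succeeds, while is_safe_path "/home/my user/molprobity" fails because of the space.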
The latest stable release of MolProbity can be downloaded as a ZIP file
from the Richardson lab website at http://kinemage.biochem.duke.edu.
If you like to live on the bleeding (and sometimes broken!) edge, we
can arrange a Subversion account for you to have access to the
development code.
Configure MolProbity
After unpacking MolProbity, you should run the setup.sh script from the
directory it resides in. It will create a few log files, ensure that
the permissions are set properly on MolProbity directories, and make a
few symlinks.
By default, all data from user sessions is stored under
public_html/data/. If you want to store data somewhere else, you can
just replace the public_html/data/ directory with a symlink to your
chosen location. For this to work, you must allow Apache to follow
symlinks, as suggested above. Remember to check the permissions and
re-run the setup.sh script. A suggestion: make sure the basic
installation works before you try this variation.
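The symlink variation might look something like the sketch below. The function name relocate_data and any destination you pass in are hypothetical; it moves the existing data directory rather than deleting it, and it refuses to run if the data directory is already a symlink or the destination already exists.

```shell
# Move the session data directory elsewhere and leave a symlink behind.
# Usage: relocate_data INSTALL_DIR NEW_DATA_DIR   (NEW_DATA_DIR must not exist yet)
relocate_data() {
    data="$1/public_html/data"
    if [ -L "$data" ] || [ ! -d "$data" ] || [ -e "$2" ]; then
        echo "refusing to relocate $data" >&2
        return 1
    fi
    mv "$data" "$2" && ln -s "$2" "$data"
}
```

With the moler account described earlier, a call could be relocate_data /home/moler /bigdisk/molprobity_data, where /bigdisk is only a placeholder. Afterwards, check the permissions on the new location and re-run setup.sh.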
What configuration there is for MolProbity is stored in
config/config.php, which you can edit with any text editor. All the
settings are documented in the file, but most are fine the way they
are. You will probably want to change MP_BIN_PATH, to make sure
MolProbity can find all the necessary helper programs. (Setting moler's
PATH environment variable will not accomplish anything, because
MolProbity runs in a web server environment.) The configuration
checking script will let you know if anything's missing.
You may want to increase MP_REDUCE_LIMIT, which will let Reduce keep
working on optimizing complicated H-bonding cliques at the cost of some
additional processor time. Whether you can afford it depends on how
many users you're supporting and how powerful your server is.
You may also want to change the lifetime of a session before it's
automatically garbage-collected, and you may want to allocate more/less
disk space for each user session.
Check configuration
If you skipped configuring your web server, you will now regret it. To
make sure everything's configured properly, you should direct your
browser to the check_config.php script under public_html/admin/. The
URL will be something like http://localhost/~moler/admin/check_config.php.
It will check the PHP settings suggested above and ensure it can find
all the needed helper programs. It will also check and/or display
version numbers for several of the core programs.
While you're at it, look around at the other admin tools. You can
monitor usage of the site, remove or debug various working sessions,
and more.
If you try a file:// URL, you're just going to see the raw PHP source
code. You must access this script via your web server in order for it
to get executed by PHP.
If you didn't configure a web server, you might try something like
php -f public_html/admin/check_config.php > tmp.html
and then open tmp.html in a browser.
Secure the admin pages and user data
The reason most people (drug companies) run their own copy of
MolProbity is to protect their non-public structural data. Given that,
you'll want to read this section carefully. We've made MolProbity as
secure as we know how, but there are at least two big loopholes in an
out-of-the-box installation.
First, anyone can access the admin/ directory. This allows them to
delete other users' sessions and get access to files within them. While
this may occasionally be useful for an administrator, it's not the sort
of thing you want everyone doing. If you're really paranoid, you'll
want to delete that directory altogether, or at least move it somewhere
that Apache doesn't have access to. Alternately, you could change the
ownership and permissions to keep Apache out. If you want reasonable
(though not airtight!) security and the convenience of the admin tools,
you can password protect the directory. See the Apache documentation
for a full discussion along with the important caveats, but the basic
process looks like this:
Create a file called .htaccess in the admin/ directory. Put something
like this in it:
AuthType Basic
AuthName "My MolProbity server"
AuthUserFile /home/moler/public_html/admin/.htpasswd
Require user some_user_name
Then run a command like this to create the password:
htpasswd -c /home/moler/public_html/admin/.htpasswd some_user_name
You may also have to enable password protection in your httpd.conf file
with something like this:
<Directory /home/*/public_html>
AllowOverride AuthConfig
</Directory>
Remember to restart the web server after making these changes.
The second security problem comes from Apache. All the user data is
stored in directories under public_html/data/. That directory has to be
visible to the web so users can get at their files, but providing an
index of the data directory (which most default Apache configurations
will) allows someone to see all the sessions and browse through their
contents. This is easy to fix; just remove "Indexes" from the
appropriate Options line in your httpd.conf file. Again, restart the
server when you're done.
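To find the Options lines that still turn on directory indexes, something like the following sketch may help. The name find_index_options is made up; the function skips commented-out lines and lines that already say -Indexes, and it only looks at the one file you give it.

```shell
# Print the Options lines in a config file that still enable directory
# indexes, with line numbers. Lines containing "-Indexes" are skipped.
# Usage: find_index_options /path/to/httpd.conf
find_index_options() {
    grep -n -i -E '^[[:space:]]*Options[[:space:]].*Indexes' "$1" |
        grep -v -i -- '-Indexes'
}
```

Each line it prints is a place where "Indexes" could be removed. Which one governs public_html/data/ depends on the enclosing Directory block, so check that before editing.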
Troubleshooting / FAQ
There are no entries yet. Send your questions to vbc3@duke.edu.