Let's start this tutorial by installing the MATLAB interface.
Installation
Throughout this tutorial, we'll be assuming you have some sort of
UNIX-based operating system, such as Mac OS X, Solaris or Linux. We
can't help you if you have a Windows operating system, but we'll
presume you have enough experience with Windows to figure out how to
modify these steps for your setup, if necessary. (If you are using
Windows, we highly recommend that you check out the
gnumex project website before
continuing.) Of course, you'll need to have a version of MATLAB
installed on your computer (hopefully a recent one), and you should be
relatively familiar with the MATLAB programming language (if not, it
is a fairly simple language that you could learn in a few
hours). This software package has been
tested on MATLAB versions 7.2 and 7.3. It might very well work on
earlier versions of MATLAB, but there is also a good chance that it
will not. It is unlikely that the software will run with versions
prior to MATLAB 6.5.
Install compilers. The first thing you need to do is install
a C++ compiler and a Fortran 77 compiler. You might already have these
installed, but you may not be able to use them: you must use the
precise compiler versions supported by MATLAB (yes, it's a pain). For
instance, on my Linux machine I have MATLAB version 7.3, so the people
at MathWorks tell me that I need to have the GNU Compiler Collection
(GCC) version 3.4.5. If you use the wrong version of GCC, you will
likely encounter linking errors (the mex command will tell you which
compiler versions are OK). To find out which compilers are supported
by your version of MATLAB, run the mex program provided with your
MATLAB installation, or consult this webpage.
Configure MATLAB. Once you've installed the appropriate
compilers, set up and configure MATLAB to build MEX files. This is
explained quite nicely here.
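For instance, you can check and select the compiler configuration from
the MATLAB prompt (a minimal sketch; the list of configurations shown
will differ on your machine):
% Ask mex to list the supported compiler configurations it finds on
% this machine, and pick one interactively.
mex -setup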
Install IPOPT. The MATLAB interface adds several
complications to the standard IPOPT
installation procedure. Before you continue, please familiarize
yourself with the standard procedure. What follows are the steps
followed on a typical Linux machine. First, download the IPOPT source
files and the third-party source code (BLAS, LAPACK, HSL, etc.).
MATLAB demands that you compile the code with certain flags, such as
-fPIC and -fexceptions (on Linux). The first flag tells the compiler
to generate position-independent code, and the second flag enables
exception handling. These flags should match the ones in your MEX
options file. You can figure out which flags are used on your system
by running the mex compiler with the -v flag on a simple example
source file (Hello World is your friend). See this MathWorks technical
support webpage for more information on the MEX options file.
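For example, here is one way to run this check, assuming a MATLAB 7.x
installation where the bundled example file yprime.c is present (the
path below is typical but not guaranteed):
% Copy one of MATLAB's example MEX source files into the current
% directory, then compile it in verbose mode; -v prints the compiler
% commands and flags taken from your MEX options file.
copyfile(fullfile(matlabroot,'extern','examples','mex','yprime.c'), pwd);
mex -v yprime.c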
Once you have all the necessary source code, call the IPOPT
configure script. On a Linux machine with MATLAB 7.3 installed, the
call should look something like
./configure --prefix=$HOME/ipopt/install \
CXX=g++-3.4.5 CC=gcc-3.4.5 F77=g77-3.4.5 \
ADD_CXXFLAGS="-fPIC -fexceptions" \
ADD_CFLAGS="-fPIC -fexceptions" \
ADD_FFLAGS="-fPIC -fexceptions"
We also installed the MATLAB interface to IPOPT on an Apple
computer running Mac OS X 10.3.9 and MATLAB 7.2. For this machine, we
ran the configure script with the following command:
./configure --prefix=$HOME/ipopt/install \
ADD_CFLAGS="-fno-common -fexceptions -no-cpp-precomp -fPIC" \
ADD_CXXFLAGS="-fno-common -fexceptions -no-cpp-precomp -fPIC" \
ADD_FFLAGS="-x f77-cpp-input -fPIC -fno-common" \
FLIBS="-lg2c -lfrtbegin -lSystem" \
F77=g77 CC=gcc CXX=g++
After this, follow the standard installation steps: type make, wait
a few minutes, then type make install at the UNIX command line. This
compiles all the source code into a single library and places it in
the install directory specified by the prefix variable above.
What we haven't yet done is compile the code for the MATLAB
interface. We'll do this next.
Modify the Makefile and build the MEX file. Go into the
subdirectory Ipopt/contrib/MatlabInterface/src and open the file
called Makefile with your favourite text editor. We need to change
this file a little bit so that it matches your MATLAB setup. You will
find that most of the variables, such as CXX and CXXFLAGS, have been
automatically (and hopefully, correctly) set according to the flags
specified during your initial call to the configure script. However,
you may need to modify MATLAB_HOME and MEXSUFFIX, as explained in the
comments of the Makefile. On one of our Linux machines, we set these
Makefile variables to
MATLAB_HOME = /cs/local/generic/lib/pkg/matlab-7.3/bin/matlab
MEXSUFFIX = mexglx
Once you think you've set up the Makefile properly, type make all in
the same directory as the Makefile. If you didn't get any errors, then
you're pretty much all set to go!
There's a good chance you will encounter problems with the
installation instructions we have just described. I'm afraid some
resourcefulness will be required on your part, as the installation
will be slightly different for each person. Please consult the
troubleshooting section on this webpage, and the archives of the IPOPT
mailing list. If you can't find the answer at either of these
locations, try sending an email to the IPOPT mailing list.
Finally. If the installation procedure was successful, you will end
up with a MEX file. On a Linux machine, the MEX file will be called
ipopt.mexglx. In order to use it in MATLAB, you need to tell MATLAB
where to find it. The best way to do this is to type
addpath sourcedir
at the MATLAB command prompt, where sourcedir is the location of the
MEX file you created. It is basically the full pathname that ends in
Ipopt/contrib/MatlabInterface. You can also achieve the same thing by
modifying the MATLABPATH environment variable at the UNIX command
line, using either the export command (in the Bash shell) or the
setenv command (in the C shell).
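For example (a sketch, assuming the source tree was placed under
$HOME/ipopt; adjust the path to your own setup):
% Add the directory containing the MEX file to the MATLAB search path,
% then save the path so the change persists across MATLAB sessions.
addpath(fullfile(getenv('HOME'),'ipopt','Ipopt','contrib','MatlabInterface'));
savepath;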
A note on 64-bit platforms. Starting with version 7.3,
MATLAB can handle 64-bit addressing, and the authors of MATLAB have
modified the implementation of sparse matrices to reflect this change.
However, the row and column indices in the sparse matrix are converted to
signed integers, and this could potentially cause problems when dealing
with large, sparse matrices on 64-bit platforms with MATLAB version
7.3 or greater.
Tutorial by example
Let's go through four examples that demonstrate the principal
features of the MATLAB interface to IPOPT. For additional information,
you can always type help ipopt at the MATLAB prompt. The tutorial
examples are all located in the directory
Ipopt/contrib/MatlabInterface/examples.
Example 1
First, let's look at the Hock & Schittkowski test problem #51.1 It
is an optimization problem with 5 variables, no inequality constraints
and 3 equality constraints. The MATLAB script examplehs051.m runs the
limited-memory BFGS (Broyden-Fletcher-Goldfarb-Shanno) algorithm with
the starting point [2.5 0.5 2 -1 0.5] and obtains the solution
[1 1 1 1 1]. The line in the script which executes the IPOPT solver is
x = ipopt(x0,lb,ub,lbc,ubc,@computeObjective,@computeGradient,...
@computeConstraints,@computeJacobian,'',[],'',[],...
'jac_c_constant','yes','hessian_approximation',...
'limited-memory','mu_strategy','adaptive','tol',1e-7);
The first input is the initial point. The second and third inputs
specify the lower and upper bounds on the variables. Since there are
no such bounds, we set the entries of these two vectors to -Inf and
+Inf. The fourth and fifth inputs specify the lower and upper bounds
on the 3 constraint functions. The next five inputs specify the
function handles to the required callback routines. We've written the
callback routines as subfunctions in the same M-file. Since we are
using a limited-memory approximation to the Hessian, we don't need to
know the values of the second-order partial derivatives, so we set the
Hessian callback routine to the empty string. The rest of the input
arguments set options for the solver, as detailed in the IPOPT
documentation.
If you examine the functions computeObjective and computeGradient,
you will see that computing the objective function and gradient vector
is relatively straightforward. The function computeConstraints returns
a vector of length equal to the number of constraint functions. The
callback function computeJacobian returns an M x N sparse matrix,
where M is the number of constraint functions and N is the number of
variables. It is important to always return a sparse matrix, even if
there is no computational advantage in doing so. Otherwise, MATLAB
will report an error.
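To illustrate, here is a sketch of what the Jacobian callback for this
problem might look like (the three rows correspond to the three linear
equality constraints of problem #51; see examplehs051.m for the
authoritative version):
function J = computeJacobian (x)
  % The constraints are linear, so the Jacobian is constant, but it
  % must still be returned as a sparse M x N (here 3 x 5) matrix.
  J = sparse([ 1  3  0  0  0 ;
               0  0  1  1 -2 ;
               0  1  0  0 -1 ]);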
Example 2
Let's move on to the second example, examplehs038.m. It demonstrates
the use of IPOPT on an optimization problem with 4 variables and no
constraints other than simple bound constraints. This time, we've
implemented a callback routine for evaluating the Hessian. The Hessian
callback function takes as input the current value of the variables x,
the factor in front of the objective term sigma, and the values of the
constraint multipliers lambda (which in this case is empty). If the
last input is true, then the callback routine must return a sparse
matrix that has zeros in every location where the second-order
derivative will ALWAYS be zero. The return value H must always be a
lower triangular matrix (type help tril). As explained in the IPOPT
documentation, the Hessian matrix is symmetric, so the information
contained in the upper triangular part is redundant.
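A sketch of a Hessian callback with this signature might look as
follows (objhess is a hypothetical helper that evaluates the Hessian
of the objective; with no constraints, lambda contributes nothing):
function H = computeHessian (x, sigma, lambda, returnStructureOnly)
  n = length(x);
  if returnStructureOnly
    % Declare the sparsity structure; a zero here promises that the
    % entry is ALWAYS zero, so we declare a dense lower triangle.
    H = sparse(tril(ones(n)));
  else
    % Scale the objective's Hessian by sigma and return only the
    % lower triangular part. objhess is hypothetical.
    H = sigma * sparse(tril(objhess(x)));
  end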
This example also demonstrates the use of an iterative callback
function, which can be useful for displaying the status of the
solver. This is specified by the twelfth input. The function callback
takes as input the current iteration t, the current value of the
objective f, and the current point x.
Example 3
The third, slightly more complicated, example script is
examplehs071.m, which is the same as the problem explored in the IPOPT
documentation (Hock and Schittkowski test problem #71). It is worth
taking a peek at the functions computeHessian and computeJacobian. In
the Hessian callback function, we make use of the input lambda. Since
the Hessian is dense, its structure is returned with the line
H = sparse(tril(ones(n)))
where n is the number of variables. The Jacobian is also dense, so we
can return its structure in a single line:
J = sparse(ones(m,n))
where m is the number of constraint functions.
This example also differs from the previous ones because the initial
values for the Lagrange multipliers are specified in MATLAB. We need
to input three sets of multipliers to IPOPT: the Lagrange multipliers
corresponding to the lower bounds on the optimization variables, the
multipliers corresponding to the upper bounds on the variables, and
the multipliers associated with the constraint functions. To specify
these three sets of Lagrange multipliers, we fill in three fields of
the multipliers struct as follows:
multipliers.zl = [1 1 1 1];
multipliers.zu = [1 1 1 1];
multipliers.lambda = [1 1];
Note that this optimization problem has 4 variables and 2 constraints.
In this example script, we have also chosen to output the values of
the Lagrange multipliers that compose the dual portion of the final
primal-dual solution. When I ran this script, I obtained final values
of approximately
multipliers.zl = [1 0 0 0];
multipliers.zu = [0 0 0 0];
multipliers.lambda = [-0.55 0.16];
The third output in the call to IPOPT is the number of iterations IPOPT
takes to converge to the stationary point within the specified tolerance.
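The shape of the call is then something like the following sketch,
with the same kinds of callback arguments as in Example 1 (consult
examplehs071.m for the exact argument list):
% Request all three outputs: the solution, the final multipliers
% struct, and the iteration count.
[x multipliers numiter] = ipopt(x0, lb, ub, lbc, ubc, ...
    @computeObjective, @computeGradient, ...
    @computeConstraints, @computeJacobian, @computeHessian);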
Example 4
The last example is the script examplelauritzen.m in the subdirectory
bayesnet. It is vastly more complicated than the other three. It
pertains to research on inference in probabilistic models in
artificial intelligence. This script demonstrates the problem of
inferring the most probable states of random variables given a
Bayesian network.2 In this case, the model represents the interaction
between causes (e.g. smoking) and diseases (e.g. lung cancer). We are
given a patient who is a smoker, has tested positive on an X-ray, and
has recently visited Asia. In many ways this is a silly and highly
overused example, but it suits our needs here because it demonstrates
how to treat inference as an optimization problem, and how to solve
this optimization problem using IPOPT. This code should NOT be used to
solve large inference problems because it is not particularly
efficient.
The call to IPOPT is buried in the file bopt.m. It is
[qR qS] = ipopt({qR qS},{repmat(eps,nqr,1) repmat(eps,nqs,1)},...
        { repmat(inf,nqr,1) repmat(inf,nqs,1) },...
        [ ones(1,nr) ones(1,ns) zeros(1,nc) ],...
        [ ones(1,nr) ones(1,ns) zeros(1,nc) ],...
        @computeJGObjective,@computeJGGradient,...
        @computeJGConstraints,@computeJGJacobian,...
        @computeJGHessian,{ K C f Rv Rf Sv Sf NS d },'',...
        ...
As you can see, it is rather complicated! The first input, as usual,
is the starting point. (The variables actually represent probability
estimates.) Notice that we are passing a cell array, and each entry of
the cell array is a matrix. Likewise, the bound constraints are
specified as cell arrays. This is permitted as long as the starting
point and the bound constraints have the same structure. If not,
MATLAB will report an error. (Note that the lower bounds on the
variables are set to floating-point precision; type help eps. In this
way, we ensure that we never take the logarithm of zero.)
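The effect of the eps lower bound is easy to verify at the MATLAB
prompt:
log(0)    % evaluates to -Inf, which would break the objective
log(eps)  % approximately -36.04, and finite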
This cell array syntax is useful when your program has several
different types of variables. These sets of variables are then passed
as separate input arguments to the MATLAB callback functions. For
instance, qR and qS are passed as separate arguments to the objective
callback function in the M-file computeJGObjective.m. The entries of
the cell array are also treated as separate outputs from the gradient
callback function (see the M-file computeJGGradient.m), and from the
main call to ipopt.
In this example, the constraint functions are all linear, so they have
no impact on the value of the Hessian. In fact, the Hessian is a
diagonal matrix (but not positive definite). The Jacobian can be
extremely large, but it is also very sparse; the number of entries is
a multiple of the number of variables.
This example also demonstrates the use of auxiliary data. In bopt.m,
notice that the input after computeJGHessian is a cell array. This
cell array is passed as input to every MATLAB callback routine.
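For instance, a hypothetical skeleton of the objective callback
consistent with this description would receive the two variable blocks
followed by the auxiliary data (we assume here that the auxiliary data
arrives as a single cell array; the body is a placeholder, not the
real computation in computeJGObjective.m):
function f = computeJGObjective (qR, qS, auxdata)
  % Unpack the auxiliary data cell array (names as in bopt.m; f is
  % renamed f0 to avoid clashing with the output).
  [K C f0 Rv Rf Sv Sf NS d] = deal(auxdata{:});
  % ... the objective would be computed here from the probability
  % estimates qR and qS; we return a placeholder value only.
  f = 0;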
The tutorial is over!
Notes on implementation of the MATLAB interface
We won't bore you with all the details of the implementation. We
would, however, like to briefly point out a few of them. Most of the
issues of interest surround the representation of sparse matrices.
The MATLAB interface will necessarily be slower than the standard
C++ interface to IPOPT. That's because MATLAB dynamically allocates
new memory for all the outputs passed back from a function. Thus, for
large problems each iteration of IPOPT will involve the dynamic
allocation and deallocation of large amounts of memory.
Sparse matrices. The greatest challenge was most definitely
the conversion of sparse matrices from MATLAB to IPOPT. Sparse
matrices are used to represent the Jacobian of the constraint
functions and the Hessian of the Lagrangian function. There is a very
nice document by
Gilbert, Moler and Schreiber that discusses the design and
implementation of sparse matrices in the MATLAB environment. The
problem is that IPOPT assumes a static sparse matrix structure, but in
MATLAB there is no way to ensure that the size of the matrix (the
number of non-zero elements) does not change over time; if an entry of
a sparse matrix is set to zero, then the arrays are automatically
adjusted so that no storage is expended for that entry. This may seem
like a highly inefficient way to implement sparse matrices, and indeed
it is. However, Gilbert, Moler and Schreiber emphasize efficient
matrix-level operations over efficient element-level operations.
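This zero-dropping behaviour is easy to demonstrate at the MATLAB
prompt:
A = sparse([1 0; 2 3]);
nnz(A)        % 3 stored non-zeros
A(2,1) = 0;   % assigning zero to a stored entry...
nnz(A)        % ...shrinks the storage to 2 non-zeros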
We can legitimately make the following assumption: the non-zero
entries of a sparse matrix passed back from MATLAB are a
subset of the non-zero entries in IPOPT's respective
sparse matrix.
The class SparseMatrixStructure keeps track of the structure of a
sparse MATLAB matrix. It does not store the values of the non-zero
entries. We use it for the Jacobian of the constraints and the Hessian
of the Lagrangian. Even though these two matrices are fundamentally
different (one is square and lower triangular, the other is
rectangular), we can treat their structures in the same way.
The principal functions of interest in the class
SparseMatrixStructure are getColsAndRows and copyElems. The function
getColsAndRows converts the MATLAB sparse matrix format into the
equivalent IPOPT format. The function copyElems copies the entries
from one sparse matrix to another when the two matrices have different
structures. Obviously, for the copy operation to be possible, the set
of non-zero entries of the destination matrix must be a superset of
the non-zero entries of the source matrix. For reasons of efficiency,
no checks are made to ensure that this is satisfied; it is up to the
user to ensure that a sparse matrix passed back to IPOPT never has a
non-zero entry that was not declared initially when
returnStructureOnly = true. In my implementation, I have taken great
pains to ensure: 1. that the copy function makes only one pass through
the entries of the sparse matrix, and 2. that there are no if
statements inside the for loops, which can severely degrade the speed
of the copy operation.