Date help created: 08 Jan 1994
Date last updated: 01 Nov 2001

'process' is a program that allows the processing of multi-dimensional NMR data, typically starting with fids and finishing with spectra. It includes the ability to do Fourier transforms, phasing, baseline correction, etc.
The input data does not have to be blocked, but it cannot be deflated.
To run the program type
process [<memory in Mwords>] <process script file>
The <memory in Mwords> argument is optional. By default 64 Mwords (256 Mbytes) are allocated for the main storage. In general, the more storage that is allocated, the less i/o to and from disk is required. (The routine 'scripts_processed' in the source file 'store.c' gives the exact requirement.)
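For example, a hypothetical invocation allocating 128 Mwords and using a script file named 'process.script' would be

  process 128 process.script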
The process script file must have the following format:
input <par file of input data file>
output <output (processed) data file>
par <par file of output data file>    ! this is optional
exclude <dimension>                   ! this is optional for every dim
                                      ! scripts are not run for excluded dimensions
followed by one or more lines of the form
script <dimension> <command file>
or
script_com <dimension>
  [ 'in-line' commands; see below ]
end_script
for 1-dimensional processing, or
<command> <dimensions> <one or more parameters>
for multi-dimensional processing.
Here <dimension> is the dimension to which the <command file> should be applied. Each command file (or group of 'in-line' commands) has one or more lines of the form
<command> <one or more parameters>
The dimension does not have to be given here because it is specified with 'script' or 'script_com'. Thus more than one dimension can use the same command file. Also, a given dimension does not have to be processed, or can use more than one command file (or more than one group of 'in-line' commands).
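As an illustration, a minimal sketch of a 'script_com' block (the dimension and the particular commands here are hypothetical):

  script_com 2
    complex
    sinebell 90
    zerofill 1
    fft
    reduce
  end_script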
Note that the data is assumed to be real by default. Also, points start counting from 1, not 0. And points are counted in terms of complex points, if the data is complex.
The list of commands can be obtained by typing
process help commands
It is recommended that commands be tested out on small data files (e.g. one row for 1-dimensional commands) if uncertain about use.
Some typical examples can be obtained by typing
process help scripts
The following is the complete list of 1-dimensional commands, grouped (approximately) according to function:

  complex real complexify reduce exchange conjugate magnitude magnitude2

  lower upper range shift cycle reverse zerofill scale scale2 set set2
  mask_mp mask_pm mask_ppmm mirror_zero mirror_half riri2rrii rrii2riri

  decay decay_sw gaussian gaussian_sw sinebell sinebell2 inv_cosine
  weight_file

  convolve conv_sine conv_box conv_triangle conv_gaussian conv_file

  phase phase2

  fft ifft rft irft cft icft sft isft hft
  fftn ifftn rftn irftn cftn icftn sftn isftn

  avance avance2 avance_phase

  base_const base_const2 base_poly base_poly2 base_trig base_trig2
  base_points end_base_points base_subtract base_subtract2

  lp_extend lp_forward lp_backward lp_first lp_first2 lp_last lp_last2

  maxent interlace
Syntax and more details about <command> (where <command> is any of the above commands) can be obtained by typing
process help <command>
Read the source code to determine the exact algorithm implemented.
'process' uses two types of files (in addition to par files and data files): script files and command files. Script files are similar in nature to those for other programs. Command files are similar to script files, but are specifically meant for processing 1-dimensional commands. The examples below should provide ample illustration. The script and command files are indented for clarity.
The same command file can be used for processing in more than one dimension. Not all dimensions need to have a command file. Any dimension can have more than one command file. These possibilities are all illustrated below.
Comments in files are everything in a line following the occurrence of the character '!'.
Note that the data is assumed to be real by default.
A typical process script file for a 3-dim. data file might be
! typical script file that does processing in all 3 dims.
input /usr/people/wb104/edl387/edl387_5.bin.par
output /usr/people/wb104/edl387/edl387_5.spc
script 1 file1.com    ! commands for dim. 1
script 2 file2.com    ! commands for dim. 2
script 3 file2.com    ! commands for dim. 3
where 'file1.com' might be
! file1.com ! typical command file for acquisition dimension
complex        ! data is complex
conv_sine 8    ! convolve with a half width of 8 points
               ! do this to remove water signal
sinebell 90    ! do a 90 degree sinebell weighting
zerofill 1     ! zerofill once
fft            ! (complex) Fourier transform
reduce         ! reduce to real
upper 512      ! only save the first 512 points
and 'file2.com' might be
! file2.com ! typical command file for non-acquisition dimensions
complex          ! data is complex
sinebell 90      ! do a 90 degree sinebell weighting
zerofill 1       ! zerofill once
fft              ! (complex) Fourier transform
phase 90 -180    ! phase with phase0 = 90 and phase1 = -180
reduce           ! reduce to real
This script file uses the commands in 'file1.com' for processing dimension 1, and those in 'file2.com' for dimensions 2 and 3. The commands in a given command file are processed sequentially.
All of the rows in a given dimension are processed, which eliminates the need for an explicit looping mechanism.
If baseline correction is needed in the third dimension then the original process script file could be
! typical script file that does processing in all 3 dims.
input /usr/people/wb104/edl387/edl387_5.bin.par
output /usr/people/wb104/edl387/edl387_5.spc
script 1 file1.com
script 2 file2.com
script 3 file2.com
script 3 file3.com
where 'file3.com' might be
! file3.com ! only does baseline correction
base_poly 4 2    ! use window of half-width 4
                 ! use polynomial of degree 2

! base_poly2 4 2 1 32    ! this would fit only pts 1 to 32
If baseline correction is needed in dims. two and three then the following process script file would work
! typical script file that does processing in all 3 dims.
input /usr/people/wb104/edl387/edl387_5.bin.par
output /usr/people/wb104/edl387/edl387_5.spc
script 1 file1.com
script 2 file2.com
script 3 file2.com
script 2 file3.com
script 3 file3.com
An example of another process script file is
! script file that does processing in dim. 1 only
input /usr/people/wb104/edl387/edl387_5.bin.par
output /usr/people/wb104/edl387/edl387_5.fft1
script 1 file1.com
where 'file1.com' is the same as above, and then to finish the processing another possible process script file might be
! script file that does maxent processing in dims. 2 and 3
input /usr/people/wb104/edl387/edl387_5.fft1.par
output /usr/people/wb104/edl387/edl387_5.max2
maxent2 2 3 max2.dat
Here 'max2.dat' would be the 2-dim. maximum entropy script file. More information on maxent script files can be obtained by typing
process help maxents
Alternatively, all the processing could be done at once by using
! script file that does processing in all dims.
! including maxent in dims. 2 and 3
input /usr/people/wb104/edl387/edl387_5.bin.par
output /usr/people/wb104/edl387/edl387_5.max2
script 1 file1.com
maxent2 2 3 max2.dat
Another process script file for the same data file, doing maximum entropy only in the third dimension might be
! script file that does processing in all dims.
! including maxent in dim. 3
input /usr/people/wb104/edl387/edl387_5.bin.par
output /usr/people/wb104/edl387/edl387_5.max
script 1 file1.com
script 2 file2.com
maxent 3 max.dat
Here 'max.dat' would be the 1-dim. maximum entropy script file.
Alternatively, the same processing could be accomplished by
! script file that does processing in all dims.
! including maxent in dim. 3
input /usr/people/wb104/edl387/edl387_5.bin.par
output /usr/people/wb104/edl387/edl387_5.max
script 1 file1.com
script 2 file2.com
script 3 file4.com
where 'file4.com' would be
! file4.com
! maximum entropy processing
maxent max.dat ! maximum entropy processing
complex
This says that the data is complex. There are no parameters. [ source code = complex.c ]
real
This says that the data is real (this is the default). There are no parameters. [ source code = complex.c ]
complexify
This complexifies real data by setting the imaginary part to 0. There are no parameters. [ source code = complex.c ]
reduce
This reduces complex data to real data by throwing out the imaginary part. There are no parameters. [ source code = complex.c ]
exchange
This exchanges the real and imaginary parts of complex data. There are no parameters. [ source code = complex.c ]
conjugate
This takes the (complex) conjugate of complex data. There are no parameters. [ source code = complex.c ]
magnitude
This takes the magnitude of the data. For real data, this is the absolute value of the data. For complex data this is the square root of the sum of the squares of the real and imaginary part of the data. There are no parameters. [ source code = complex.c ]
magnitude2
This takes the magnitude squared of the data. For real data, this is the square of the data. For complex data this is the sum of the squares of the real and imaginary part of the data. There are no parameters. [ source code = complex.c ]
lower <lower bound>
This shifts data so that the origin is now at <lower bound>. If the data is complex, <lower bound> should be specified in terms of complex points. Note: after use of lower, point <lower bound> becomes point 1. This can cause confusion if lower is followed by upper. If both lower and upper need to be used then it is recommended that range be used instead. As an example, 'lower 100' makes point 100 the new point 1. [ source code = arrange.c ]
upper <upper bound>
This truncates data so that its last point is <upper bound>. If the data is complex, <upper bound> should be specified in terms of complex points. As an example, if the data has 1024 points (real or complex) and only the first 512 are to be kept, then use 'upper 512'. [ source code = arrange.c ]
range <lower bound> <upper bound>
This shifts and truncates data so that the origin is now at <lower bound> and so that the last point is <upper bound>. If the data is complex, <lower bound> and <upper bound> should be specified in terms of complex points. Note: 'range 131 400' is equivalent to 'upper 400' followed by 'lower 131' or 'lower 131' followed by 'upper 270' (*not* 'upper 400'). In this example, there are 270 points left after the range. [ source code = arrange.c ]
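A sketch of the equivalence described above, for data with at least 400 points:

  range 131 400
  ! is equivalent to
  lower 131
  upper 270    ! the old point 131 is now point 1, so the old point 400 is now point 270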
shift <shift amount>
This shifts the data by the positive amount <shift amount>. This shifts the data in the opposite direction to lower. Thus a <shift amount> of 1 means that new_pt[2] = old_pt[1]. The points at the left of the new data are made 0. shift does not remove the points at the right, it makes the data longer. Use upper if it is desired to remove the points at the right. If the data is complex, <shift amount> should be specified in terms of complex points. As an example, 'shift 10' shifts the data to the right by 10 points (real or complex). [ source code = arrange.c ]
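For instance, a sketch that pads 10 zero points at the start while keeping the original length (assuming the data has 512 points):

  shift 10
  upper 512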
cycle <cycle amount>
This cycles the data by <cycle amount>. If the data is complex, <cycle amount> should be specified in terms of complex points. A <cycle amount> of 1 means that new_pt[2] = old_pt[1]. A <cycle amount> of -1 means that new_pt[1] = old_pt[2]. As an example, 'cycle 10' cycles the data so that the old point 1 becomes the new point 11. [ source code = arrange.c ]
reverse
This reverses the data. If there are n (complex/real) points then, for example, pt[1] and pt[n] are swapped. There are no parameters. [ source code = arrange.c ]
zerofill <n>
This zero fills the data so that its final size is 2 to the power <n> times the original size (this is true independently of whether the original size is itself a power of 2). <n> must be <= 4. As an example, 'zerofill 1' will double the size of the data, and would be a typical use of 'zerofill'. [ source code = arrange.c ]
scale <first point> <last point> <value>
This multiplies each point between <first point> and <last point> (inclusive) of the data by <value>. The data must be real, for complex data use scale2. As an example, 'scale 1 512 10.0' multiplies the points between 1 and 512 (inclusive) by a factor of 10. [ source code = arrange.c ]
scale2 <first point> <last point> <real value> <imaginary value>
This multiplies each point between <first point> and <last point> (inclusive) of the data by <real value> + i <imaginary value>. Note that this is complex multiplication, in particular this does not multiply the real points by <real value> and the imaginary points by <imaginary value>. The data must be complex, for real data use scale. <first point> and <last point> must be specified in terms of complex points. As an example, 'scale2 1 512 0.0 1.0' multiplies the points between 1 and 512 (inclusive) by a factor of i. As another example, 'scale2 1 1 0.5 0.0' multiplies the first (complex) point by a factor of 0.5. [ source code = arrange.c ]
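To spell out the complex multiplication in the first example: multiplying the complex point a + ib by i gives -b + ia, i.e. the real and imaginary parts are swapped and the new real part is negated.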
set <first point> <last point> <value>
This sets each point between <first point> and <last point> (inclusive) of the data to be <value>. The data must be real, for complex data use set2. As an example, 'set 1 512 1.0' sets the data between points 1 and 512 (inclusive) to be 1. [ source code = arrange.c ]
set2 <first point> <last point> <real value> <imaginary value>
This sets each point between <first point> and <last point> (inclusive) of the data to be <real value> + i <imaginary value>. The data must be complex, for real data use set. <first point> and <last point> must be specified in terms of complex points. As an example, 'set2 1 512 1.0 0.0' sets the data between points 1 and 512 (inclusive) to be 1. [ source code = arrange.c ]
mask_mp <period>
This multiplies points 1, 2, ..., period/2, and all other points equivalent to these modulo <period>, by -1. This is not allowed on complex data. The <period> must be an even number. As an example, 'mask_mp 4' multiplies points 1, 2, and all other points equivalent to these modulo 4, by -1. [ source code = arrange.c ]
mask_pm <period>
This multiplies points 1+period/2, 2+period/2, ..., period, and all other points equivalent to these modulo <period>, by -1. This is not allowed on complex data. The <period> must be an even number. As an example, 'mask_pm 4' is equivalent to mask_ppmm. [ source code = arrange.c ]
mask_ppmm
This is the ++-- mask needed to process some Bruker data. This multiplies points 3 and 4, and all other points equivalent to these modulo 4, by -1. This is not allowed on complex data. There are no parameters. [ source code = arrange.c ]
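A hypothetical sketch of where such a mask might sit in a command file (the mask must be applied while the data is still real):

  mask_ppmm    ! ++-- mask on the raw points
  complex      ! only now declare the data complex
  fft
  reduce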
mirror_zero
This mirrors the data for zero dwell. This means that the data is extended in negative time by complex conjugating and reversing the existing data, excluding the first point. The data must be complex. There are no parameters. [ source code = arrange.c ]
mirror_half
This mirrors the data for half dwell. This means that the data is extended in negative time by complex conjugating and reversing the existing data, including the first point. The data must be complex. There are no parameters. [ source code = arrange.c ]
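As a sketch of both mirror commands: if the complex points are p1 p2 p3, then mirror_zero produces p3* p2* p1 p2 p3 and mirror_half produces p3* p2* p1* p1 p2 p3, where * denotes complex conjugation.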
riri2rrii
This converts complex data for which the ordering is riri... (the Azara default) to rr...ii....
rrii2riri
This converts complex data for which the ordering is rr...ii... to riri... (the Azara default).
decay <end value>
This multiplies the data by a decaying exponential, which, if there are n (complex/real) points, is 1 at point 1 and <end value> at point n. As an example, if n = 128, 'decay 0.5' multiplies the data by a decaying exponential which is 0.5 at point 128. And if n = 64, 'decay 0.75' multiplies the data by a decaying exponential which is 0.75 at point 64. See also decay_sw, which has more intuitive usage. [ source code = weight.c ]
decay_sw <line broadening (Hz)> <spectral width (Hz)>
This multiplies the data by a decaying exponential, exp(-pi*LB*t), where LB is the entered line broadening (in Hz). This function will provide a matched filter (=> best possible S/N ratio) for lines (or multiplets) that are LB Hz wide, and will increase all linewidths by LB Hz. The spectral width (SW) is needed by the program in order to translate between point number, x, and the time value, t, used in the function, by t = x/(2*SW). If a value of SW=0.0 is given, the program uses the value for SW which was entered in the par file (or the default). As an example, 'decay_sw 10 8500' has a line broadening of 10 Hz for a spectral width of 8500 Hz. [ source code = weight.c ]
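A minimal sketch of matched filtering in an acquisition dimension (the values here are hypothetical):

  complex
  decay_sw 10 8500    ! 10 Hz line broadening, 8500 Hz spectral width
  zerofill 1
  fft
  reduce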
gaussian <one fraction> <end value>
This multiplies the data by the gaussian exp(a + b*x + c*x*x), where a, b and c are determined so that, if there are n (complex/real) points, the gaussian is 1, its maximum value, at point 1 + <one fraction>*(n-1) and <end value> at point n. As an example, if n = 128, 'gaussian 0.25 0.75' multiplies the data by a gaussian which is 1 at point 32.75 and which is 0.75 at point 128. See also gaussian_sw, which has more intuitive usage. [ source code = weight.c ]
gaussian_sw <line broadening (Hz)> <sharpening factor> <spectral width (Hz)>
This multiplies the data by the gaussian exp(a + b*t + c*t*t). If LB is the entered line broadening (in Hz) and s is the sharpening factor, then a = -ln2 / s^2, b = pi*LB and c = - (pi*LB*s)^2 / 4ln2. Note that LB must be *positive* for normal use, which is the *opposite* of the Bruker convention. The function converts a Lorentzian line (or multiplet) of width FWHH equal to LB to a Gaussian line of width FWHH equal to s * LB. The maximum value of the multiplying function is 1, which is obtained at the fraction of the acquired time t/T = 2ln2 / (LB*pi*T*s^2). The spectral width (SW) is needed by the program in order to translate between point number, x, and the time value, t, used in the function, by t = x/(2*SW). If a value of SW=0.0 is given, the program uses the value for SW which was entered in the par file (or the default). The suggested range of values for s is from 1.3 to 0.7 (possibly 0.5). The relative S/N ratio after processing (with S/N after a matched filter = 100%) is ca. 88% for s=1.3, 75% for s=1, 44% for s=0.7 and 13% for s=0.5. Note that values of s>1 can still be advantageous, as they improve the S/N and as the resulting Gaussian will have narrower 'feet' than the original Lorentzian. See Ernst, Bodenhausen and Wokaun, section 4.1.3.2, or Ernst, Adv. Mag. Reson. 2 (1966) p. 1. As an example, 'gaussian_sw 10 0.7 8500' has a line broadening of 10 Hz for a spectral width of 8500 Hz, and with a sharpening factor of 0.7. [ source code = weight.c ]
sinebell <angle>
This multiplies the data by a sine function with the given <angle> (specified in degrees) at point 1, and, if there are n (complex/real) points, with angle 180 degrees at point n+1 (*not* point n). As an example, 'sinebell 90' would be a typical use. This multiplies the data by a sine function which is 1 at point 1 and 0 at point n+1. [ source code = weight.c ]
sinebell2 <angle>
This multiplies the data by a sine function squared with the given <angle> (specified in degrees) at point 1, and, if there are n (complex/real) points, with angle 180 degrees at point n+1 (*not* point n). As an example, 'sinebell2 90' would be a typical use. This multiplies the data by the square of a sine function which is 1 at point 1 and 0 at point n+1. [ source code = weight.c ]
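In other words, assuming the angle ramps linearly with point number, the sinebell weight at point x is sin(<angle> + (180 - <angle>) * (x - 1) / n) (in degrees), and sinebell2 uses the square of this.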
inv_cosine <frequency> <spectral width>
This multiplies the data by the inverse cosine function 1 / cos(d*x) (protected against dividing by zero), where d = 0.5 * PI * <frequency> / <spectral width>. [ source code = weight.c ]
weight_file <weight file>
This multiplies the data by a function specified in the given <weight file>. If there are n (complex/real) points then there must be n values in <weight file> (in free format). As an example, 'weight_file edl387_5.wgt' would multiply the data by the values given in 'edl387_5.wgt'. [ source code = weight.c ]
convolve <half width>
This does the same thing as conv_sine and is considered to be obsolete. [ source code = convolve.c and conv.c ]
conv_sine <half width>
This convolves the data with a sine function with the given <half width>, and subtracts the result from the data. Typically this is done on the acquisition fid before anything else is done, in order to remove a water signal that occurs at zero frequency. As an example, 'conv_sine 8' would be a typical use if there were 1024 data points. [ source code = convolve.c and conv.c ]
conv_box <half width>
This convolves the data with a box function with the given <half width>, and subtracts the result from the data. (A box function is 1 inside the <half width> and 0 outside.) Typically this is done on the acquisition fid before anything else is done, in order to remove a water signal that occurs at zero frequency. As an example, 'conv_box 8' would be a typical use if there were 1024 data points. [ source code = convolve.c and conv.c ]
conv_triangle <half width>
This convolves the data with a triangle function with the given <half width>, and subtracts the result from the data. (A triangle function is 1 at the maximum and decreases linearly to 0 at distance 1+<half width> from the maximum.) Typically this is done on the acquisition fid before anything else is done, in order to remove a water signal that occurs at zero frequency. As an example, 'conv_triangle 8' would be a typical use if there were 1024 data points. [ source code = convolve.c and conv.c ]
conv_gaussian <half width> <end value>
This convolves the data with a gaussian function with the given <half width> and equal to <end value> at <half width> from the maximum, and subtracts the result from the data. (A gaussian function is of the form exp(-a*x*x).) Typically this is done on the acquisition fid before anything else is done, in order to remove a water signal that occurs at zero frequency. As an example, 'conv_gaussian 8' would be a typical use if there were 1024 data points. [ source code = convolve.c and conv.c ]
conv_file <convolve file> <half width>
This convolves the data with a function specified in the given <convolve file> and with the given <half width>, and subtracts the result from the data. Typically this is done on the acquisition fid before anything else is done, in order to remove a water signal that occurs at zero frequency. The convolution function is assumed to be 1 at the center and symmetric, thus only <half width> values need to be specified in the <convolve file>. As an example, 'conv_file edl387_5.conv 9' would be a typical use if there were 1024 data points, and in this case 'edl387_5.conv' might contain
0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1
(giving a triangular convolution function). [ source code = convolve.c and conv.c ]
phase <phase0> <phase1>
This phases (complex) data with specified parameters <phase0> and <phase1> (both specified in degrees). (The pivot is assumed to be the first point.) This command is equivalent to 'phase2 <phase0> <phase1> 1'. The data must be complex. As an example, 'phase 90 -180' would be a typical use. [ source code = phase.c ]
phase2 <phase0> <phase1> <pivot>
This phases (complex) data with specified parameters <phase0> and <phase1> (both specified in degrees), and pivot <pivot> (the first point being point 1). The data must be complex. As an example, 'phase2 9 0 35' would be phasing with <phase0> = 9, <phase1> = 0 and <pivot> = 35. [ source code = phase.c ]
fft
This does a complex Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. The data must be complex. There are no parameters. [ source code = fft.c and ft.c ]
ifft
This does an inverse complex Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. The data must be complex. There are no parameters. [ source code = fft.c and ft.c ]
rft
This does a real Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. The data must be real. There are no parameters. [ source code = fft.c and ft.c ]
irft
This does an inverse real Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. The data must be complex. There are no parameters. [ source code = fft.c and ft.c ]
cft
This does a cosine Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. The data must be real. There are no parameters. [ source code = fft.c and ft.c ]
icft
This does an inverse cosine Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. The data must be real. There are no parameters. [ source code = fft.c and ft.c ]
sft
This does a sine Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. The data must be real. There are no parameters. [ source code = fft.c and ft.c ]
isft
This does an inverse sine Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. This command is equivalent to sft. The data must be real. There are no parameters. [ source code = fft.c and ft.c ]
hft
This does a Hilbert Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. The data must be real. There are no parameters. [ source code = fft.c and ft.c ]
fftn
This does a normalised complex Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. The data must be complex. There are no parameters. [ source code = fft.c and ft.c ]
ifftn
This does a normalised inverse complex Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. The data must be complex. There are no parameters. [ source code = fft.c and ft.c ]
rftn
This does an approximately normalised real Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. The data must be real. There are no parameters. [ source code = fft.c and ft.c ]
irftn
This does an approximately normalised inverse real Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. The data must be complex. There are no parameters. [ source code = fft.c and ft.c ]
cftn
This does an approximately normalised cosine Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. The data must be real. There are no parameters. [ source code = fft.c and ft.c ]
icftn
This does an approximately normalised inverse cosine Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. The data must be real. There are no parameters. [ source code = fft.c and ft.c ]
sftn
This does a normalised sine Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. The data must be real. There are no parameters. [ source code = fft.c and ft.c ]
isftn
This does a normalised inverse sine Fourier transform. The input data size does not have to be a power of 2 but the data is zero padded if it is not. This command is equivalent to sftn. The data must be real. There are no parameters. [ source code = fft.c and ft.c ]
avance <DSPFVS> <DECIM>
This does the first part of the transform required for Bruker Avance data, with specified <DSPFVS>, <DECIM> and angle 180 degrees. This does a left shift by a number of points calculated from <DSPFVS> and <DECIM>. For the second part of the transform use avance_phase, which is required in combination with this command. This command would normally be done before convolution or weighting. For angle not 180 use avance2. As an example, 'avance 12 24' has DSPFVS=12 and DECIM=24. [ source code = avance.c and avance_param.c ]
avance2 <DSPFVS> <DECIM> <angle>
This does the first part of the transform required for Bruker Avance data, with specified <DSPFVS>, <DECIM> and <angle>. This does a left shift by a number of points calculated from <DSPFVS>, <DECIM> and <angle>. For the second part of the transform use avance_phase, which is required in combination with this command. This command would normally be done before convolution or weighting. For angle 180 use avance. As an example, 'avance2 12 24 90' has DSPFVS=12, DECIM=24 and angle=90. [ source code = avance.c and avance_param.c ]
avance_phase
This does the second part of the transform required for Bruker Avance data. This does a phasing with pivot = 1 and with ph0 and ph1 calculated from the parameters specified in avance or avance2, which must accompany this command. This command would normally be done just after the Fourier transform and before the usual phasing, and it must be done inside the same script as the first part of the transform. [ source code = avance.c and avance_param.c ]
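Putting the Avance commands together, a hypothetical acquisition-dimension command file might look like the following (the DSPFVS, DECIM and phase values are illustrative only):

  complex
  avance 12 24     ! first part: left shift computed from DSPFVS and DECIM
  conv_sine 8      ! convolution done after avance, as recommended
  sinebell 90
  zerofill 1
  fft
  avance_phase     ! second part: phasing derived from the avance parameters
  phase 90 -180    ! usual phasing
  reduce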
base_const <half width>
This fits the baseline of a spectrum (not fid) by first finding baseline points using a shifting window of half width <half width> and then fitting a constant to those baseline points. The data must be real. If base_points has been used to define the baseline then the <half width> is ignored. This command is equivalent to 'base_poly <half width> 0'. As an example, 'base_const 4' would fit a constant baseline, using a half-width of 4 points. [ source code = baseline.c and base.c ]
base_const2 <half width> <first point> <last point>
This fits the baseline of a spectrum (not fid) by first finding baseline points using a shifting window of size <half width> and then fitting a constant to those baseline points. Only the points from <first point> to <last point> (inclusive) are fitted. The data must be real. If base_points has been used to define the baseline then the <half width> is ignored. This command is equivalent to 'base_poly2 <half width> 0 <first point> <last point>'. As an example, 'base_const2 4 1 512' would fit a constant baseline between points 1 and 512 (inclusive), using a half-width of 4 points. [ source code = baseline.c and base.c ]
base_poly <half width> <degree>
This fits the baseline of a spectrum (not fid) by first finding baseline points using a shifting window of size <half width> and then fitting a polynomial of degree (order) <degree> to those baseline points. The data must be real. If base_points has been used to define the baseline then the <half width> is ignored. If <degree> = 0 this command is equivalent to 'base_const <half width>'. As an example, 'base_poly 4 2' would fit the baseline using a second-order polynomial (i.e. a parabola) and a half-width of 4 points. [ source code = baseline.c, base.c and svd.c ]
base_poly2 <half width> <degree> <first point> <last point>
This fits the baseline of a spectrum (not fid) by first finding baseline points using a shifting window of size <half width> and then fitting a polynomial of degree (order) <degree> to those baseline points. Only the points from <first point> to <last point> (inclusive) are fitted. The data must be real. If base_points has been used to define the baseline then the <half width> is ignored. If <degree> = 0 this command is equivalent to 'base_const2 <half width> <first point> <last point>'. As an example, 'base_poly2 4 2 1 512' would fit the baseline between points 1 and 512 (inclusive) using a second-order polynomial (i.e. a parabola) and a half-width of 4 points. [ source code = baseline.c, base.c and svd.c ]
base_trig <half width> <order>
This fits the baseline of a spectrum (not fid) by first finding baseline points using a shifting window of size <half width> and then fitting trig functions of order <order> to those baseline points. In fact, the number of functions used to do the fitting is 2*<order> + 1, the 2 being because cosines and sines are used, and the 1 being because a constant is also used. This fit only makes sense if the original data set contains the entire recorded spectral width. The data must be real. If base_points has been used to define the baseline then the <half width> is ignored. If <order> = 0 this command is equivalent to 'base_const <half width>'. As an example, 'base_trig 4 2' would fit the baseline using five trig functions (1 constant, 2 cosines and 2 sines) and a half-width of 4 points. [ source code = baseline.c, base.c and svd.c ]
base_trig2 <half width> <order> <first point> <last point>
This fits the baseline of a spectrum (not fid) by first finding baseline points using a shifting window of size <half width> and then fitting trig functions of order <order> to those baseline points. In fact, the number of functions used to do the fitting is 2*<order> + 1, the 2 being because cosines and sines are used, and the 1 being because a constant is also used. Only the points from <first point> to <last point> (inclusive) are fitted. This fit only makes sense if the original data set contains the entire recorded spectral width. The data must be real. If base_points has been used to define the baseline then the <half width> is ignored. If <order> = 0 this command is equivalent to 'base_const2 <half width> <first point> <last point>'. As an example, 'base_trig2 4 2 1 512' would fit the baseline between points 1 and 512 (inclusive) using five trig functions (1 constant, 2 cosines and 2 sines) and a half-width of 4 points. [ source code = baseline.c, base.c and svd.c ]
base_points <base points file>
This uses the points in the <base points file> to define the baseline points for subsequent baseline correction routines (by default the baseline is determined automatically by the program). There must be a matching end_base_points before the next use of base_points or before a resumption of the automatic baseline determination. As an example, 'base_points edl387_5.base' might be used to define the baseline points. The file 'edl387_5.base' would need to contain the desired baseline points, using a free format. If the number of points was 512 then each point would need to be in the range (1, 512). No point can be repeated. For example, 'edl387_5.base' might contain
1 5 19 25 480 490 501 505 510
This base_points command might be followed by 'base_trig 4 2', for example, and then by 'end_base_points' (so that the same baseline points are not used in the baseline correction routines in subsequent command files).
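Put together, that fragment of the command file would read

  base_points edl387_5.base
  base_trig 4 2
  end_base_points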
end_base_points
This matches the most recent base_points and must occur before the next use of base_points or before a resumption of automatic baseline determination (the default).
base_subtract <base subtract file>
This subtracts the values given in <base subtract file> from the data. The data must be real. If there are n points then there must be n values in <base subtract file> (in free format). As an example, 'base_subtract edl387_5.sub' would subtract the values given in 'edl387_5.sub' from the data.
base_subtract2 <base subtract file> <first point> <last point>
This subtracts the values given in <base subtract file> from the data, from <first point> to <last point> (inclusive). The data must be real. If n = <last point> - <first point> + 1, then there must be n values in <base subtract file> (in free format). As an example, 'base_subtract2 edl387_5.sub 1 256' would subtract the values given in 'edl387_5.sub' from the data, from points 1 to 256 inclusive.
lp_extend <number predicted> <length of sequence> <cutoff>
If <number predicted> is positive this does a linear prediction of <number predicted> points starting from the end of the existing data. If <number predicted> is negative this does a linear prediction of -<number predicted> points, inserting these at the beginning of the existing data (i.e. not overwriting what is already there). The data must be complex. <number predicted> and <length of sequence> are to be specified in terms of complex points. <cutoff> must be between 0 and 1. It gives the cutoff below which singular values are set to 0. It is recommended that <length of sequence> be around MIN(n/4 to n/3, 15 to 20), where n is the number of complex points (in the data before linear prediction). (The algorithm does not need a <length of sequence> of more than 15 to 20 and is unstable if this is too large.) It is recommended not to predict more than the number of points already in the data. A <cutoff> of around 0.0001 ought to work. The algorithm is based on the forward-backward method, see Zhu and Bax, JMR 100 (1992) 202-207. As an example, if there are originally 20 complex points in the data then 'lp_extend 20 6 0.0001' would predict forward another 20 points using a sequence of length 6 and a cutoff of 0.0001. As another example, if there are originally 20 complex points in the data then 'lp_extend -1 6 0.0001' would insert 1 initial point using a sequence of length 6 and a cutoff of 0.0001. [ source code = lp_extend.c, fblp.c, csvd.c, complex.c, poly_roots.c, and linpack directory ]
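A sketch of lp_extend in a full command file for a short indirect dimension (the sizes and parameters here are hypothetical):

  complex
  lp_extend 20 6 0.0001    ! extend 20 complex points to 40
  sinebell 90
  zerofill 1
  fft
  phase 90 0
  reduce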
lp_forward <number predicted> <number of poles>
This does a linear prediction of <number predicted> points starting from the end of the existing data. <number predicted> is to be specified in terms of complex points if the data is complex. For complex data the real and imaginary data are predicted separately. The <number of poles> should be an upper bound on the number of oscillators in a given row. This should not be too large (e.g. 5 or 10). As an example, 'lp_forward 64 10' would predict forward 64 points using 10 poles. [ source code = lp_first.c, lp.c, lin_pred.c, complex.c and poly_roots.c ]
lp_backward <number predicted> <number of poles>
This does a linear prediction of the first <number predicted> points, which is to be specified in terms of complex points if the data is complex. For complex data the real and imaginary data are predicted separately. The number of points is not changed; instead the first <number predicted> points are overwritten. The <number of poles> should be an upper bound on the number of oscillators in a given row. This should not be too large (e.g. 5 or 10). As an example, 'lp_backward 64 10' would predict the first 64 points using 10 poles. [ source code = lp_first.c, lp.c, lin_pred.c, complex.c and poly_roots.c ]
lp_first <number predicted> <length of sequence> <number of sequences> <cutoff>
This does a linear prediction of the first <number predicted> points, using <number of sequences> sequences of the data, each of length <length of sequence>, to do the prediction. All of these are to be specified in terms of complex points if the data is complex. The number of points is not changed; instead the first <number predicted> points are overwritten. <number of sequences> must be >= <length of sequence>. For complex data, each real and imaginary data point is fitted separately. For an inherently complex fitting of complex data, use lp_first2. The algorithm uses singular value decomposition (svd) and <cutoff> is the value which determines which of the singular values are significant. <cutoff> must be between 0 and 1. This algorithm is not very good, and it is highly recommended that various parameter choices be tested on 1-dimensional rows before an entire multi-dimensional data file is processed using this. The baseline correction commands might give better results. As an example, 'lp_first 1 8 10 0.001' would predict point 1 using 10 sequences of length 8, with a 0.001 cutoff. This means that points 2 through (10+8+1-1) = 18 are used to make the prediction, sequence 1 being points (2, 3, ..., 9) and sequence 10 being points (11, 12, ..., 18). [ source code = lp_first.c, lp.c and svd.c ]
lp_first2 <number predicted> <length of sequence> <number of sequences> <cutoff>
This does a linear prediction of the first <number predicted> points, using <number of sequences> sequences of the data, each of length <length of sequence>, to do the prediction. All of these are to be specified in terms of complex points if the data is complex. The number of points is not changed; instead the first <number predicted> points are overwritten. <number of sequences> must be >= <length of sequence>. For complex data, each real and imaginary data point is fitted as one (complex) point. For real data, lp_first2 is exactly the same as lp_first. The algorithm uses singular value decomposition (svd) and <cutoff> is the value which determines which of the singular values are significant. <cutoff> must be between 0 and 1. This algorithm is not very good, and it is highly recommended that various parameter choices be tested on 1-dimensional rows before an entire multi-dimensional data file is processed using this. The baseline correction commands might give better results. As an example, 'lp_first2 1 8 10 0.001' would predict point 1 using 10 sequences of length 8, with a 0.001 cutoff. This means that points 2 through (10+8+1-1) = 18 are used to make the prediction, sequence 1 being points (2, 3, ..., 9) and sequence 10 being points (11, 12, ..., 18). [ source code = lp_first.c, lp.c and svd.c ]
lp_last <number predicted> <length of sequence> <number of sequences> <cutoff>
This does a linear prediction of the last <number predicted> points, using <number of sequences> sequences of the data, each of length <length of sequence>, to do the prediction. All of these are to be specified in terms of complex points if the data is complex. The predicted points start from the first point past the current last data point. <number of sequences> must be >= <length of sequence>. For complex data, each real and imaginary data point is fitted separately. For an inherently complex fitting of complex data, use lp_last2. The algorithm uses singular value decomposition (svd) and <cutoff> is the value which determines which of the singular values are significant. <cutoff> must be between 0 and 1. This algorithm is not very good, and it is highly recommended that various parameter choices be tested on 1-dimensional rows before an entire multi-dimensional data file is processed using this. As an example, 'lp_last 5 8 10 0.001' would predict five points after the current last point, using 10 sequences of length 8, with a 0.001 cutoff. If the current number of points is 59, then points 60 through 64 are predicted, with points 43 through 59 used to make the prediction, sequence 1 being points (43, 44, ..., 50) and sequence 10 being points (52, 53, ..., 59). [ source code = lp_last.c, lp.c and svd.c ]
lp_last2 <number predicted> <length of sequence> <number of sequences> <cutoff>
This does a linear prediction of the last <number predicted> points, using <number of sequences> sequences of the data, each of length <length of sequence>, to do the prediction. All of these are to be specified in terms of complex points if the data is complex. The predicted points start from the first point past the current last data point. <number of sequences> must be >= <length of sequence>. For complex data, each real and imaginary data point is fitted as one (complex) point. For real data, lp_last2 is exactly the same as lp_last. The algorithm uses singular value decomposition (svd) and <cutoff> is the value which determines which of the singular values are significant. <cutoff> must be between 0 and 1. This algorithm is not very good, and it is highly recommended that various parameter choices be tested on 1-dimensional rows before an entire multi-dimensional data file is processed using this. As an example, 'lp_last2 5 8 10 0.001' would predict five points after the current last point, using 10 sequences of length 8, with a 0.001 cutoff. If the current number of points is 59, then points 60 through 64 are predicted, with points 43 through 59 used to make the prediction, sequence 1 being points (43, 44, ..., 50) and sequence 10 being points (52, 53, ..., 59). [ source code = lp_last.c, lp.c and svd.c ]
maxent <maxent script file>          (in command files)
maxent <dim> <maxent script file>    (in script files)
This does (1-dimensional) maximum entropy processing, using the script file <maxent script file> (in the dimension <dim>, if the command appears in a script file). The output data is real. For information about maxent script files, type 'process help maxents'. [ source code = maxent.c, opus.c and mem.c ]
maxent2 <dim1> <dim2> <maxent script file>
This does (2-dimensional) maximum entropy processing, using the script file <maxent script file> in the dimensions <dim1> and <dim2>. The output data is real. For information about maxent script files, type 'process help maxents'. [ source code = maxent.c, opus.c and mem.c ]
maxent3 <dim1> <dim2> <dim3> <maxent script file>
This does (3-dimensional) maximum entropy processing, using the script file <maxent script file> in the dimensions <dim1>, <dim2> and <dim3>. The output data is real. For information about maxent script files, type 'process help maxents'. [ source code = maxent.c, opus.c and mem.c ]
interlace <dim>
This unwraps interlaced experiments, with the relevant dimension being d2 = <dim>. It is implicit that the other dimension is d1 = 1. The ideal time dependence of a peak located at (w1, w2) is assumed to be exp(i w1 t1) exp(i w2 t2) in the odd experiments (starting the count for these at 1) and exp(i w1 t1) exp(- i w2 t2) in the even experiments. This data is replaced by exp(i w1 t1) cos(w2 t2) in the odd rows and exp(i w1 t1) sin(w2 t2) in the even rows. Other conventions (experiments) can be accommodated by complex conjugation. This would normally be the first command used when processing interlaced data.
Maximum entropy processing has proved useful for producing spectra in those dimensions that have few points. It completely replaces the usual processing in those dimensions in which it is used.
Unlike other commands, only one maximum entropy processing is allowed per use of the program 'process' (it would not make sense to do more than one).
Currently there are 1-, 2- and 3-dimensional maxent algorithms, and the input data must be complex in each of the dimensions. A 'slice' of data is a 1-, 2- or 3-dim. subset of the data file.
Extensions could be made by writing new 'opus' and 'tropus' routines (see opus.c), but this is not recommended without expert advice.
Typical uses would be 1-dimensional or 2-dimensional maximum entropy (maxent) for a 3-dimensional data file, or 2-dimensional or 3-dimensional maxent for a 4-dimensional data file.
It is highly recommended that the maxent algorithm is tested on a slice of data before it is run on the entire data file, to make sure that convergence occurs for the choice of parameters.
The 2- and especially the 3-dimensional maxent algorithms can easily lead to heavy demands on i/o to and from disk, but it is usually the case that the maxent processing itself dominates the processing time.
Maximum entropy processing requires the use of a script file. Some of the commands must appear, but most are optional (these have defaults). Some of the commands are independent of dimension, and these must appear at the top of the file before the ones that depend on dimension.
Necessary commands at the top of the script file:
There are none.
Optional commands at the top of the script file:
iter <maximum number>
This specifies the maximum number of iterations that the algorithm on one slice of data will use before giving up. Normally the algorithm should converge in fewer iterations than this, and if it does not, something is probably wrong. By default, <maximum number> = 40 (this should be adequate for 1 dimension; for 2 or 3 dimensions, 10-20 or so should work). As the number of iterations increases, the algorithm gets very slow.
positive
This indicates that the spectrum has positive signal only. Even if this is true, it is often best to leave this out; the algorithm seems to converge better with negative signal allowed. By default, the spectrum is assumed not to be positive.
noise <noise level>
This gives the noise level of the *input* data (not the spectrum). It is probably best to experiment with this on a data slice. By default, <noise level> = 100, but this is unlikely to be correct.
log <log file>
This means that information provided by the algorithm will be written to <log file> for every iteration of every data slice. By default, there is no log file.
This information includes:

grade:  This should be near 0 if all is going well, except on iteration 0. If it gets near 1, then the algorithm is not converging properly, and a smaller rate might help.

omega:  This should increase monotonically to 1 as the algorithm proceeds. An overshoot of 1 may mean that the rate should be made smaller.

status: A status of 0 means the algorithm has converged. Each digit of the status indicates that convergence has not yet happened (see mem.c for what they mean).
rate <relative step size>
This gives the relative rate at which the algorithm takes step sizes. If the algorithm is not converging then a smaller rate often helps. By default, <relative step size> = 1.
def <level multiplier>
The maxent algorithm starts out with a flat spectrum of a certain level. This will multiply that level by <level multiplier>. In general, this should not need to be used, but if convergence is a problem, then this is another parameter to play with. By default, <level multiplier> = 1.
Necessary commands for each dimension of the script file:
dim <dimension>
This starts the section for the given <dimension>. Note that dimensions are relative to the slice, not the entire file. Thus, using 2-dimensional maxent on a 4-dimensional data file will lead to use of 'dim 1' and 'dim 2' in the script file. This must appear even for 1-dimensional maxent, in which case it would say 'dim 1'.
npts <number of points>
This is the size of the spectrum (i.e. the output from the maximum entropy algorithm) in this dimension.
Optional commands for each dimension of the script file:
complex
This indicates that the data is complex. The default for 'process' is that the data is real, and so it might be necessary to explicitly include this for every dimension in the script.
ignore <point 1> ... <point N>
This lists the points to ignore in the maxent processing. The points are specified in real points, even if the data is complex (this is because only half of a complex pair might be no good). Thus 'ignore 1 2' might be used if the first (complex) data point is known to be corrupt. Use sparingly. By default, no points are ignored.
sample <point1> ... <point N>
This lists the points that have been sampled in the input data set relative to the usual linearly (in time) sampled set. The points are specified in real points, even if the data is complex (this is because only half of a complex pair might have been sampled). If 'sample' occurs for a given dimension then the number of points listed must be equal to the number of points of the input data in that dimension. Since the list of points might be lengthy, 'sample' can occur more than once for a given dimension. The points have to be in the order that they were sampled but the samples do not have to be in increasing order. As an example, if the number of input data points is 16 and the sample size is 32 then the following might be a use of 'sample':
sample 1 2 3 4 7 8 11 12          ! first 8 points
sample 15 16 21 22 27 28 31 32    ! remaining points
sample2 <npts1> <npts2> <points file>
This specifies the points for a 2D sampling scheme, which is only allowed for 2D maxent. <npts1> is the theoretical maximum number of points in the first maxent dimension, and <npts2> is the theoretical maximum number of points in the second maxent dimension. The <points file> contains a list whose length is the total number of points of the input data in the corresponding two dimensions. Each entry in the list is a point in the sample space, i.e. with a value between 1 and <npts1> * <npts2>. The points have to be in the order that they were sampled, but the samples do not have to be in increasing order, nor do they have to lie on a subgrid (otherwise you could just use sample twice). The points are specified in real points, even if the data is complex. As an example, if the number of input data points is 8x16 and is being sampled from a grid of size 32x64 then the following might be a use of 'sample2':
sample2 32 64 pointList
where 'pointList' contains 8x16 = 128 values, each of which is between 1 and 32x64 = 2048.
sample3 <npts1> <npts2> <npts3> <points file>
This specifies the points for a 3D sampling scheme, which is only allowed for 3D maxent. <npts1> is the theoretical maximum number of points in the first maxent dimension, <npts2> is the theoretical maximum number of points in the second maxent dimension, and <npts3> is the theoretical maximum number of points in the third maxent dimension. The <points file> contains a list whose length is the total number of points of the input data in the corresponding three dimensions. Each entry in the list is a point in the sample space, i.e. with a value between 1 and <npts1> * <npts2> * <npts3>. The points have to be in the order that they were sampled, but the samples do not have to be in increasing order, nor do they have to lie on a subgrid (otherwise you could just use sample three times). The points are specified in real points, even if the data is complex. As an example, if the number of input data points is 4x4x8 and the sample size is 8x8x16 then the following might be a use of 'sample3':
sample3 8 8 16 pointList
where 'pointList' contains 4x4x8 = 128 values, each of which is between 1 and 8x8x16 = 1024.
decay <end value>
This multiplies the data by a decaying exponential, whose value is 1 at point 1 and <end value> at the last point. This is not normally needed, but experimentation with <end value> both greater and less than 1 is possible. By default, no multiplication is done.
phase <phase0> <phase1>
This phases the data with specified parameters <phase0> and <phase1> (both specified in degrees). (The pivot is assumed to be the first point.) By default, no phasing is done.
A typical script file for 2-dimensional maxent might be
! typical script file
iter 20         ! maximum number of iterations
noise 50.0      ! noise level
rate 0.3        ! relative step size
log max2.log    ! log file
dim 1
complex           ! input data is complex
npts 256          ! output data size
phase 180 -360    ! phasing for this dim.
dim 2
complex           ! input data is complex
npts 128          ! output data size
phase 180 -360    ! phasing for this dim.

Azara help: process / W. Boucher / azara@bioc.cam.ac.uk