Persistent CGI is an architecture designed by Digital Creations to publish Bobo web objects as long-running processes. The pcgi component is an integral part of DC's Principia product, but is also provided to Bobo developers under
http://starship.skyport.net/crew/jbauer/persistcgi/pcgifile
There is no point in even beginning with pcgi unless your server environment is suitably configured. As a checkpoint, you may want to put the following test script in your HTTP server cgi-bin directory and test accessing it from your browser.
    #!/usr/local/bin/python
    print "Content-type: text/html"
    print
    import os, sys
    import cgi_module_publisher
    import CGIResponse
    print "looks OK"
If you cannot get the above test script to work in your environment, it may be necessary to specify either PYTHONPATH or PCGI_INSERT_PATH directives in your pcgi info file.
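The same checkpoint can be approximated from the command line. The following is only a sketch: it assumes a modern Python 3 interpreter (not the Python 1.5 used elsewhere in this document), and the helper name missing_modules is hypothetical. It reports which of the modules named in the test script cannot be imported with the current path settings.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Not found on the current sys.path / PYTHONPATH.
            missing.append(name)
    return missing


# Example call, using the module names from the checkpoint script:
#   missing_modules(["os", "sys", "cgi_module_publisher", "CGIResponse"])
```

An empty list suggests the import path is adequate. Any names returned are candidates for a PYTHONPATH or PCGI_INSERT_PATH adjustment.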
Our example will publish a simple module called pcgitime. The purpose of this module is to display the current local server time and the process start time.
For example, the initial invocation of pcgitime will display something similar to:
    time started: Mon May 11 22:40:57 1998
    current time: Mon May 11 22:40:57 1998
Subsequent requests display new values for the current time:
    time started: Mon May 11 22:40:57 1998
    current time: Mon May 11 22:41:02 1998
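The reason the start time stays fixed while the current time advances is that a long-running process initializes its state once and then answers many requests. The sketch below illustrates the idea in isolation. It assumes modern Python 3, and the class name ProcessClock and its injectable now parameter are hypothetical, introduced only so the behavior can be tested without waiting.

```python
from time import asctime, localtime, time


class ProcessClock:
    """Remember when the process started; report the current time on demand."""

    def __init__(self, now=time):
        self._now = now          # injectable clock (an assumption, for testing)
        self.started = now()     # evaluated once, when the process starts

    def report(self):
        """Format the fixed start time and the time of this request."""
        return "time started: %s\ncurrent time: %s" % (
            asctime(localtime(self.started)),
            asctime(localtime(self._now())),
        )
```

Under plain CGI, each request would start a fresh process, so both lines would always show the same time.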
Note the pcgi info file is pcgitest, but the module is pcgitime.py.
Q: How do the contents of the pcgi info file pcgitest get passed to pcgi_publisher.py?
A: All pcgi info files use a header line to launch the pcgi-wrapper program:
    #!/usr/local/bin/pcgi-wrapper
    . . .
The pcgi-wrapper program is executed and reads in the remaining lines (directives) in the info file; these are in turn passed on to the pcgi_publisher. The pcgi-wrapper program is also responsible for starting the persistent process if it is not already running.
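To make the layout of an info file concrete, here is a rough sketch of how a header line and NAME=value directives could be separated. It is not the pcgi-wrapper parser (which is a separate program with its own rules). It assumes modern Python 3, the function name parse_info_file is hypothetical, and it is deliberately more forgiving about whitespace than the real parser.

```python
def parse_info_file(text):
    """Split info-file text into (wrapper_path, directives).

    Illustrative only: the real pcgi-wrapper parser may differ.
    """
    lines = text.splitlines()
    if not lines or not lines[0].startswith("#!"):
        raise ValueError("info file must start with a #! header line")
    wrapper_path = lines[0][2:].strip()

    directives = {}
    for line in lines[1:]:
        line = line.strip()
        if not line:
            continue  # ignore blank lines (an assumption of this sketch)
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError("not a NAME=value directive: %r" % line)
        directives[name.strip()] = value.strip()
    return wrapper_path, directives
```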
The remaining steps to create a long-running pcgitest process will be described in detail below:
The pcgitime.py module is a simple Python script. For this example we'll assume pcgitime.py is copied to /www/cgi-bin, although there's no requirement to put it in the cgi-bin directory.
    # pcgitime.py - pcgi test module - JeffBauer@bigfoot.com
    from time import asctime, localtime, time

    beginTime = "<html><pre>time started: %s" % \
                asctime(localtime(time()))

    def getTime(arg=None):
        """Return the beginning process and current server time."""
        return "%s\ncurrent time: %s" % \
               (beginTime, asctime(localtime(time())))
Note particularly: The getTime() triple-quoted docstring is not optional because the publisher will only recognize functions and methods with docstrings.
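The docstring rule can be pictured with a small check like the one below. This is a sketch of the rule as stated above, not the publisher's actual code. It assumes modern Python 3, and the name is_publishable is hypothetical.

```python
def is_publishable(obj):
    """Return True for callables that carry a non-empty docstring."""
    return callable(obj) and bool(getattr(obj, "__doc__", None))


def documented(arg=None):
    """This function has a docstring, so it passes the check."""
    return "ok"


def undocumented(arg=None):
    return "ok"
```

Here is_publishable(documented) is true and is_publishable(undocumented) is false, which mirrors why getTime() needs its docstring.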
The pcgitest info file is much more site dependent than the pcgitime.py module. As an example of what your pcgi info file might look like, the example pcgitest file below demonstrates some pcgi directives and their values. Copy the pcgitest info file to your cgi-bin directory, and verify it has read/execute permissions, e.g. chmod a+rx pcgitest
    #!/usr/local/bin/pcgi-wrapper
    PCGI_NAME=pcgitime
    PCGI_MODULE_PATH=/www/cgi-bin/pcgitime.py
    PCGI_PUBLISHER=/www/cgi-bin/pcgi_publisher.py
    PCGI_EXE=/usr/local/bin/python
    PCGI_SOCKET_FILE=/home/jbauer/var/pcgitime.socket
    PCGI_PID_FILE=/home/jbauer/var/pcgitime.pid
    PCGI_ERROR_LOG=/home/jbauer/var/pcgitime.err
    PCGI_DISPLAY_ERRORS=Y
A brief description of each directive used in pcgitest:
PCGI_NAME | recommended | Module name (excluding path or extension) |
PCGI_MODULE_PATH | user specified | Fully qualified filename of the user module to be published. |
PCGI_PUBLISHER | recommended | Path of the publisher to be invoked. If not specified in the pcgi info file or by the environment, pcgi-wrapper will make its best effort to locate pcgi_publisher.py[o|c]. Even if pcgi_publisher.pyc is known to be in the PCGI_INSERT_PATH, an explicit directive will streamline processing. |
PCGI_EXE | recommended | Fully qualified filename of the pcgi executable (python), e.g. PCGI_EXE=/usr/local/bin/python |
PCGI_SOCKET_FILE | user specified | Fully qualified filename of the socket file. The socket file is the named pipe unless INET sockets are used, in which case the file lists the hostname and port number. Formerly the first parameter in the original info file structure. |
PCGI_PID_FILE | user specified | Fully qualified filename of the process id file. The process file contains the pid of the long-running process. Formerly the second parameter of the original info file structure. |
PCGI_ERROR_LOG | user specified | Fully qualified filename of the error log file. The error log may be written to by pcgi-wrapper, pcgi_publisher, or the user (Bobo) object implementation. |
PCGI_DISPLAY_ERRORS (new) | boolean | Display the pcgi-wrapper error message, which is more useful than the standard "temporarily unavailable" error message. |
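As a quick way to compare an info file against the table above, the sketch below lists which directives are not set, grouped by the table's own labels. It is not how pcgifile.py or pcgi-wrapper decide anything: the grouping is my reading of the table, the function name unset_directives is hypothetical, and the code assumes modern Python 3.

```python
# Grouping taken from the labels in the table above (an assumption of this sketch).
USER_SPECIFIED = (
    "PCGI_MODULE_PATH",
    "PCGI_SOCKET_FILE",
    "PCGI_PID_FILE",
    "PCGI_ERROR_LOG",
)
RECOMMENDED = ("PCGI_NAME", "PCGI_PUBLISHER", "PCGI_EXE")


def unset_directives(directives):
    """Return the directive names missing from `directives`, by table label."""
    return {
        "user specified": [n for n in USER_SPECIFIED if n not in directives],
        "recommended": [n for n in RECOMMENDED if n not in directives],
    }
```

The argument is a plain dictionary mapping directive names to their values.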
For a complete list of pcgi info file directives, please refer to:
Global directives: It may be desirable to specify a directive globally (e.g. PCGI_EXE or PCGI_PUBLISHER), rather than in each pcgi info file. This may be accomplished by adding it to your httpd configuration file. In Apache, the directive would be specified using SetEnv in srm.conf.
User-defined directives: In addition to the standard PCGI directives, it is possible to include any directives you care to add. All directives appearing in the pcgi info file will be treated as environment variables. Therefore it is possible to specify a new PYTHONPATH directive, for instance, which will be used by your published module.
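The effect of treating directives as environment variables can be sketched as a simple merge. The code below is an illustration only, in modern Python 3. The function name publisher_environment is hypothetical, and the choice to let info-file values override inherited ones is an assumption, since the precedence is not spelled out here.

```python
import os


def publisher_environment(directives, base_env=None):
    """Return a new environment dict with the info-file directives added."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: a directive in the info file wins over an inherited value.
    env.update(directives)
    return env
```

For example, a PYTHONPATH directive in the info file would appear in the returned dictionary alongside the variables set by the HTTP server.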
Known bugs: The parser is sensitive to some kinds of whitespace. Until this is fixed, it's best to left-justify everything and trim trailing whitespace characters.
The header line must begin with #!, followed by the absolute path of the pcgi-wrapper file.
Since the steps listed above may be tedious, I've written a short script (originally as an exercise to consolidate pcgi versions) called pcgifile.py to automate the process. Further information may be found at:
http://starship.skyport.net/crew/jbauer/persistcgi/pcgifile
To use pcgifile.py, copy it to your cgi-bin directory and access it through your browser:
http://.../cgi-bin/pcgifile.py?filename=pcgitest
The program will return an advisory response. Since it does not actually run the pcgi process, pcgifile.py can only predict what may occur. Nevertheless, it is likely to catch some of the most egregious oversights. An example of the output you may expect to see:
    Python 1.5 (#12, Apr 20 1998, 16:40:28) [GCC 2.7.2.3]
    Apache/1.2b10 PHP/FI-2.0b11
    PCGI info file: pcgitest
    PCGI wrapper: /starship/jbauer/pcgi/pcgi-wrapper
    advisory recommendation: specify PCGI_EXE
    advisory recommendation: specify PCGI_NAME
    looks OK

    pcgitest
    #!/starship/jbauer/pcgi/pcgi-wrapper
    PCGI_MODULE_PATH=/starship/jbauer/pcgi/pcgitime.py
    PCGI_SOCKET_FILE=/starship/jbauer/pcgi/pcgitime.soc
    PCGI_PID_FILE=/starship/jbauer/pcgi/pcgitime.pid

    Likely publisher resource values:
    Executable: /usr/local/bin/python
    PID file: /starship/jbauer/pcgi/pcgitime.pid
    Socket file: /starship/jbauer/pcgi/pcgitime.soc
    Publisher: pcgi_publisher.pyc (path not specified)
    Module: /starship/jbauer/pcgi/pcgitime.py

    Resulting environment will probably appear to the publisher as:
    DOCUMENT_ROOT /www/htdocs/starship
    FILEPATH_INFO /crew/jbauer/persistcgi/pcgifile/pcgifile.pyc
    . . .
Note that the header line in pcgifile.py expects to see python located in /usr/local/bin, so change it if necessary for your site.
Hint: If you are reporting a pcgi error to the Bobo mailing list, the output from pcgifile.py will measurably speed up the diagnosis of your problem.
The first hit will take a while to respond, as pcgi-wrapper tries to launch and connect to the publisher. If fortune favors your efforts, you should see something similar to:
    time started: Mon May 11 22:40:57 1998
    current time: Mon May 11 22:40:57 1998
Subsequent reload hits will display new values for the current time:
    time started: Mon May 11 22:40:57 1998
    current time: Mon May 11 22:41:02 1998
To shut down the process, obtain the process id from the pid file (e.g. pcgitime.pid), kill the running process, and remove the PID and socket files (PCGI_PID_FILE, PCGI_SOCKET_FILE). A new script in the Util/ directory, killpcgi.py, has been provided to terminate a pcgi process.
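The manual shutdown steps can be sketched as follows. This is not the killpcgi.py script, whose contents are not shown here. It is a minimal illustration in modern Python 3 that assumes the pid file holds nothing but the process id as decimal text. The function name stop_pcgi and its kill parameter are hypothetical, the latter added so the steps can be exercised without signalling a real process.

```python
import os
import signal


def stop_pcgi(pid_file, socket_file, kill=os.kill):
    """Terminate the long-running process and remove its pid and socket files."""
    # Assumption: the pid file contains only the pid, as decimal text.
    with open(pid_file) as f:
        pid = int(f.read().strip())

    try:
        kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        pass  # the process is already gone; still clean up the files

    for path in (pid_file, socket_file):
        try:
            os.remove(path)
        except FileNotFoundError:
            pass  # nothing to remove
    return pid
```

The two arguments would be the values of PCGI_PID_FILE and PCGI_SOCKET_FILE from the info file.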
Here are some basic troubleshooting techniques to discover what caused pcgi to fail.
First, consider using pcgifile.py (described above) if you have neglected it earlier. Its output will be useful to anyone reviewing your problem. Furthermore, if your problem is not caught by the sanity checker, your efforts may lead to improvements in the pcgifile.py script.
(NEW) Enable the debugging output directive: PCGI_DISPLAY_ERRORS=Y
Examine the output of PCGI_ERROR_LOG. The pcgi-wrapper and pcgi_publisher programs have been enhanced to provide better error reporting, and most errors should have at least a minimal amount of useful information. Don't forget to supply this critical piece of information with a bug report! (Also submit your environment info: operating system, http server, version of Python, Bobo, etc.) Note that the PCGI_ERROR_LOG file is created only during an abnormal termination, and then only if the parser has reached the stage where it has read in the PCGI_ERROR_LOG directive.
If you suspect your pcgi info file is not being parsed correctly, try using parseinfo to check it:
    ./Test/parseinfo .../cgi-bin/pcgitest
The parseinfo program provides more accurate information than pcgifile.py, since it actually runs the same code as pcgi-wrapper.
Don't forget to examine the error log of your httpd server! Several reported failures of pcgi could have been easily solved by examining the server error log contents.
Finally, if all else fails, try to determine where the problem occurs (pcgi-wrapper or pcgi_publisher.py) and pinpoint the specific lines where the error manifests.
Note: In case it isn't 100% obvious, you can't add print/printf statements to debug the code. ;-)