General PERL questions
THE PERL SCRIPTS ARE NO LONGER BEING SUPPORTED - PLEASE GO TO TETRABB.COM FOR THE NEWEST VERSION OF THE WEBBBS FORUM
(1) What exactly is a CGI script, anyway?
CGI (“Common Gateway Interface”) is the standard by which a Web server runs external programs; the term is also used for the programs themselves, which run on the server to provide interactivity to Web sites. CGIs can be written in any programming language, but most are written in either Perl (a non-compiled, or “scripting” language) or C++ (a compiled, or “programming” language).
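To make the idea concrete, here is a minimal sketch of a CGI program written as a shell script. It is a hypothetical illustration only (it is not one of the WebScripts programs, and the function name "hello_cgi" is made up for this example). All it shows is the basic shape of CGI output: a header, a blank line, then the page.

```shell
#!/bin/sh
# Minimal CGI sketch (hypothetical example, not one of the WebScripts programs).
# The server runs this program and relays whatever it prints to the browser.
hello_cgi() {
    # Header first: tells the browser what kind of content follows.
    printf 'Content-Type: text/html\n'
    # A blank line ends the headers. Leaving it out is a classic cause
    # of "malformed header" server errors.
    printf '\n'
    # Everything after the blank line is the page itself.
    printf '<html><body><p>Hello from a CGI script.</p></body></html>\n'
}

hello_cgi
```

Whether a file like this runs at all depends on where your server allows scripts and on the file's permissions, both of which are covered further down.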
Client-side applications are a popular alternative to CGI scripts. These utilize JavaScript, Java applets, or other programs which actually run on the visitor’s own computer. On the plus side, such applications can greatly reduce the amount of processing your Web server has to do, and thus, can improve the performance of your Web site across the board. On the minus side, though, many visitors can’t or won’t accept or run such applications.
(2) I tried to run one of the scripts, but got a server error. What now?
The generic “500” server error (“Internal Server Error,” often recorded in the server’s error log as “malformed header from script”) is without a doubt the most common and least useful error message known to CGI programmers. The usual cause of the error is a file permission problem, but there are quite a few other possible causes. Try the following:
* First, check the relevant files and directories to ensure that file permissions are all set correctly. (If you’re not sure how to set file permissions, see below.) Also check to be sure that all variables are properly assigned in your configuration and that all files actually exist where they are supposed to.
* Second, make sure you have uploaded the files in ASCII (text) mode. If you upload scripts or other text files from a PC or a Macintosh to a UNIX system in binary mode, you’ll likely experience problems, as those systems use different line-ending conventions than UNIX does. Set your FTP client to ASCII (text) mode before uploading them.
* Third, check the first line of the script you’re executing (it should look something like “#!/usr/bin/perl”) and make sure that it points to the proper location of Perl on your system.
* Fourth, make sure you’re using the correct URL to access your scripts. This one sounds obvious, but you never know….
If you still can’t get things to work, try running the script from the UNIX command prompt instead of through your Web browser. Doing so will usually provide you with a much more helpful error message. (In the event that the script runs successfully from the command prompt, but not from your Web browser, you’re probably back to a permissions problem. Specifically, you’ll probably find that only you, and not the world at large, have permission to execute the script.)
If you don’t have Telnet access to your server, an alternative is to check your server’s error logs. Sometimes they won’t say anything more helpful than “malformed header from script,” but other times, they might include error messages as helpful as those you would have gotten from the Telnet command prompt!
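Several of the checks above can be run in one pass from the command prompt. The following is a rough sketch, not a definitive diagnostic: the function name "check_cgi" is made up for this example, and it only looks at three things (the "#!" line, stray carriage returns left by a binary-mode upload, and the world-execute bit). It prints nothing if it finds no problem.

```shell
# Hypothetical helper: print a line for each likely problem found in a script.
check_cgi() {
    script=$1
    [ -f "$script" ] || { echo "missing file: $script"; return 1; }

    # 1. The first line should be "#!" followed by the path to the interpreter.
    first=$(head -n 1 "$script" | tr -d '\r')
    case $first in
        '#!'*)
            interp=${first#??}       # drop the leading "#!"
            interp=${interp# }       # drop one optional space
            interp=${interp%% *}     # drop any arguments after the path
            [ -x "$interp" ] || echo "interpreter not found: $interp"
            ;;
        *)
            echo "no #! line at the top of the file"
            ;;
    esac

    # 2. Carriage returns usually mean the file was uploaded in binary mode.
    if grep -q "$(printf '\r')" "$script"; then
        echo "CR line endings found (re-upload in ASCII mode)"
    fi

    # 3. The tenth character of the "ls -l" mode string is world-execute.
    perms=$(ls -l "$script" | cut -c1-10)
    case $perms in
        ?????????x) : ;;
        *) echo "not world-executable: $perms" ;;
    esac
}
```

Usage would be something like "check_cgi ./script.cgi" from the directory that holds the script.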
If you got a “501” server error (“Cannot Post” or “Method Not Allowed,” which some servers report as a “405” error instead), it probably means that your server doesn’t allow you to run CGI scripts anywhere except from within your CGI-BIN directory. If this is the case, simply move the script where it belongs. It may also mean that you can’t use the “.pl” file extension to which most of my scripts default. If this is the case, simply rename the files with a “.cgi” extension, instead.
(Please note that e-mailing me or posting a message on the WebScripts Forum simply stating that you’re getting a server error will not get you a response any more helpful than the above. To be able to get a specific fix, you have to detail your specific problem.)
(3) How do I set file permissions? (What is chmod?)
The UNIX “chmod” command allows you to change the security settings (file permissions) for particular files, giving users and groups permission to read, write and/or execute those files. The documentation for the script in question should tell you what permissions need to be set for which files or directories.
The chmod command can be used in either of two different ways, depending upon your personal preferences. In the first version of the command, you specify the user group, the setting to be adjusted, and of course the file name. In the second version of the command, you use a numeric code to specify the user groups and settings. The first version of the command is probably easier for most users, since you’re only changing the settings you specify. (In the second version, you’re resetting everything.)
To use the first version of the chmod command, you simply specify the group or groups whose settings you want to change (“u” for user [the file owner], “g” for group [a UNIX setting largely irrelevant here], “o” for other [the world at large] or “a” for all), and the setting you want changed (“r” for read access, “w” for write access, and “x” for execute access). To set a particular file world-writable, for example, you could type “chmod o+w <filename>”; to set it executable by everyone, you could use “chmod a+x <filename>.” (The plus sign indicates that you’re adding a particular capability. To remove a certain type of access, use a minus sign.)
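As a quick illustration of the first (symbolic) form, the sketch below works on throwaway files in a temporary directory, so it is safe to try. The file names are made up for the example.

```shell
# Work in a scratch directory so nothing real is touched.
demo=$(mktemp -d)
touch "$demo/data.txt" "$demo/script.cgi" "$demo/private.txt"

# Start all three files from a known state: rw-r--r--
chmod 644 "$demo/data.txt" "$demo/script.cgi" "$demo/private.txt"

chmod o+w  "$demo/data.txt"      # add write access for "other" (the world)
chmod a+x  "$demo/script.cgi"    # add execute access for everyone
chmod go-r "$demo/private.txt"   # remove read access from group and other

# Show the results; the first column is the permission string.
ls -l "$demo"
```

Only the settings named in each command change; everything else about the file's permissions is left alone.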
To use the second version of the chmod command, you specify with a three-digit numeric code all the permissions settings. The first digit refers to the file’s owner, the second to its group, and the third to the world at large. Each digit of the code is a number (0 thru 7) which specifies the permissions for that particular user or group. 0 indicates no access; 1 indicates execute access only; 2 indicates write access only; 3 indicates both write and execute access; 4 indicates read access only; 5 indicates read and execute access; 6 indicates read and write access; 7 indicates full (read, write and execute) access. For example, “chmod 765 <filename>” will set the file so that its owner has full access, its group has read and write access, and the world at large has read and execute access.
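The arithmetic behind each digit is simply read = 4, write = 2, execute = 1, added together. Here is a small sketch of that; "perm_digit" is a made-up helper for illustration, not a standard UNIX command.

```shell
# Hypothetical helper: turn a set of letters such as "rw" into its digit.
perm_digit() {
    digit=0
    case $1 in *r*) digit=$((digit + 4)) ;; esac   # read
    case $1 in *w*) digit=$((digit + 2)) ;; esac   # write
    case $1 in *x*) digit=$((digit + 1)) ;; esac   # execute
    echo "$digit"
}

# Owner: full access; group: read and write; world: read and execute.
mode="$(perm_digit rwx)$(perm_digit rw)$(perm_digit rx)"
echo "mode is $mode"

# Apply it to a scratch file and look at the result.
scratch=$(mktemp)
chmod "$mode" "$scratch"
ls -l "$scratch"
```

Because the numeric form sets all three digits at once, any permission you leave out of the number is removed.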
It’s often handy when debugging a script to set all relevant permissions to “777” just to be sure that file permissions aren’t causing any trouble. However, it’s not usually a good idea to leave things that way, as it presents a potential security risk. Once you’re sure a script is functioning, you’ll probably want to “back down” the permissions as far as possible.
(For more detailed information about the “chmod” command, consult the relevant UNIX “man” [manual] page. At the UNIX command prompt, type “man chmod” to see it.)
(4) What is cron, and how do I use it?
The UNIX “cron” facility allows you to set certain programs to run automatically at certain times; jobs are scheduled with the “crontab” command. For example, you could set WebLog to run automatically in the early hours of every morning to process the previous day’s access log information. This facility may or may not be available to users on your system. You should contact your system administrators (or consult the relevant UNIX “man” pages, starting with “man crontab”) to find out if you can use it, and if so, how to do so.
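For a sense of what a scheduled job looks like, here is a sketch of a single crontab entry. The script path is purely hypothetical (substitute wherever the program actually lives on your system), and the sketch only prints the entry rather than installing it.

```shell
# A crontab entry has five time fields followed by the command to run:
#   minute  hour  day-of-month  month  day-of-week  command
# This one means "every day at 03:30". The path is a made-up example.
entry='30 3 * * * /usr/home/username/cgi-bin/weblog.pl'

# Print the entry so it can be reviewed before it is installed.
printf '%s\n' "$entry"

# On systems that allow it, "crontab -e" opens your crontab in an editor
# so you can paste the line in. Some systems also accept input on stdin,
# for example (left commented out on purpose, since it changes your crontab):
#   ( crontab -l 2>/dev/null; printf '%s\n' "$entry" ) | crontab -
```

An asterisk in a time field means "every," so the three asterisks here stand for every day of the month, every month, and every day of the week.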
(5) A lot of configuration variables are supposed to be defined as the “full path” to a file or directory. Is that the same thing as a URL?
No; they are two very different things, as illustrated in the “example” configuration values provided in the scripts.
A URL (e.g., “http://www.foo.com/test.html”) is a “roadmap” to a specific file as seen from the outside (i.e., by someone connecting to it via the Web). It begins with a note regarding the type of connection to be made (“http://”), continues with a domain name (“www.foo.com”), and concludes with the route to the file starting from the domain’s “root directory” on the server.
A path (e.g., “/usr/username/www/test.html”) is a “roadmap” to that same file as seen from the inside (i.e., by the host server). It begins at the server’s root directory (“/”), and concludes with the route from there to the file. Unlike a URL, it does not feature any note regarding the type of connection; also, as it begins at the server’s root directory rather than a specific domain’s root directory, it is usually a bit longer than the corresponding part of a URL.
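The relationship between the two can be sketched in a few lines of shell. This assumes the simplest possible setup, in which everything after the domain name maps directly onto one directory on the server (the domain's "root directory"); real servers may add aliases or other rules. The function name "url_to_path" is made up for the example.

```shell
# Hypothetical helper: url_to_path URL DOMAIN_ROOT_DIRECTORY
url_to_path() {
    url=$1
    docroot=$2
    rest=${url#*://}                 # drop the connection type ("http://")
    case $rest in
        */*) route=/${rest#*/} ;;    # drop the domain name, keep the route
        *)   route=/ ;;              # no route given: the domain's root
    esac
    # Join the domain's root directory and the route.
    printf '%s%s\n' "${docroot%/}" "$route"
}

url_to_path "http://www.foo.com/test.html" "/usr/username/www"
# prints /usr/username/www/test.html
```

The second argument is the piece you cannot work out from the URL alone, which is why the scripts ask for paths and URLs as separate configuration values.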
(6) OK, well, how do I find the path to a file?
The UNIX “pwd” command will give you that information. Make sure you’re in the correct directory, then type “pwd” at the command prompt. The response should look something like “/usr/www/users/username/directory”; the specifics, of course, can and will vary.
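If you want the full path to a particular file rather than just the directory, you can combine "cd" and "pwd." The helper below is a sketch for illustration; "full_path" is a made-up name, not a standard command.

```shell
# Hypothetical helper: print the full path of a file given a relative name.
full_path() {
    # Change into the file's directory (inside a subshell, so your own
    # current directory is not affected) and ask pwd where that is.
    dir=$(cd "$(dirname "$1")" && pwd) || return 1
    printf '%s/%s\n' "$dir" "$(basename "$1")"
}

# Example: run from the directory that holds the file,
#   full_path ./config.pl
# would print that directory's path followed by /config.pl.
```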
(7) In all your example pathnames, “/usr/” appears only once. In my paths, it appears twice. Is that a problem?
Some ISPs set up what they call “virtual servers.” On many of these systems, file paths are a bit odd, looking something like “/usr/home/username/usr/local/etc/httpd/htdocs/,” which appears to be a partial user-specific path followed by an independently complete path.
Inevitably, on these systems, if you assign your configuration variables with full pathnames, the script will work from the command line, but not from the browser. If you assign them without the “extra” user-specific information, they will no longer work from the command line, but they will work when called from a browser. It’s an either/or situation.
The virtual server’s home directory (/usr/home/username/) is necessary for telnet access, but becomes the root directory (/) when accessed via the Web. Therefore, while the full path (/usr/home/username/usr/local/etc/httpd/htdocs/…) must be used via telnet, only the latter part is used for Web-based references. Scripts can be set to work via telnet, or from the Web, but not both at the same time.
On some of these systems, it is possible to test Web-based applications from telnet by using the command “virtual” to prefix the execute statement: “virtual ./script.cgi.” If that command is not available to you, you’ll simply need to make sure the configuration is appropriate for the manner in which you’re trying to run the script.
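To see which part of the path is the “extra” user-specific portion, you can strip the virtual server's home directory off the front. This sketch uses the example path from above; the variable names are made up for the illustration, and your own home directory will differ.

```shell
# The virtual server's home directory, as seen from telnet (example value).
virtual_home="/usr/home/username"

# The full path as reported by pwd in a telnet session (example value).
telnet_path="/usr/home/username/usr/local/etc/httpd/htdocs/"

# Remove the home directory from the front to get the Web-side path.
web_path=${telnet_path#"$virtual_home"}

echo "from telnet:  $telnet_path"
echo "from the Web: $web_path"
```

The second line of output is the form to use in configuration variables when the script is called from a browser.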
(8) Wait a minute, here. What is UNIX, anyway, and where do I find the command prompt you keep mentioning?
UNIX is an operating system. Though it is of course not the only OS around, it is the most popular platform (environment) for Internet use and for Web-based applications. It comes in a variety of “flavors,” including HP-UX, SunOS, OSF/1, Linux and BSD. The scripts in the WebScripts collection were all created and tested on a UNIX system (specifically, an Apache server running FreeBSD).
The command prompt (or system prompt) refers to the Korn, C or other “shell” which serves as an interface between you and the OS. This prompt is usually reached by connecting via a Telnet program to your server. From it, you can move and copy files, set file permissions, debug and run programs, etc.
(9) How do I run a program from Telnet?
The easiest way to run a program from Telnet is to “cd” to the directory in which the program resides, then type “./filename” (where filename, of course, is the name of the program). A “run” or “execute” statement isn’t required, as it’s understood by the system to be there. Note that the initial “./” is necessary, as it tells the system that the file is to be found in the current directory.
You can also run a program from anywhere, by typing the full path to the file (directory and file name together).
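Both ways of running a program look like this in practice. The sketch creates a tiny throwaway program in a temporary directory first (the name "hello.sh" is made up), so there is something harmless to run.

```shell
# Create a small test program in a scratch directory.
work=$(mktemp -d)
printf '#!/bin/sh\necho "hello from the script"\n' > "$work/hello.sh"
chmod u+x "$work/hello.sh"       # the owner needs execute permission

# Method 1: cd to the directory, then run it with a leading "./".
from_cwd=$(cd "$work" && ./hello.sh)

# Method 2: run it from anywhere by giving the full path to the file.
from_anywhere=$("$work/hello.sh")

echo "via ./hello.sh:  $from_cwd"
echo "via full path:   $from_anywhere"
```

If either method reports "Permission denied," the execute bit is not set for you; if it reports "not found" even though the file exists, check the "#!" line at the top of the script.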
(10) I don’t have Telnet access….
Then you probably have a problem.
Some of the scripts in the WebScripts collection (for example, WebLog) must be run from the command prompt or via cron. Without Telnet access, you probably won’t be able to use them at all.
Others require Telnet access only for installation purposes. You might still be able to use these. Some FTP clients, for example, will allow you to set file permissions and handle some of the other functions for which you’d normally use a Telnet program, so even if you only have FTP access, you might still be OK. It really depends upon exactly what your server and software will allow you to do. As usual, if you have questions, check with your system administrators.
(11) What’s a CGI-BIN directory?
The CGI-BIN directory is a special directory within which CGI scripts must be located on some servers. On these servers, scripts located anywhere else will not run, but will simply display as text files when called by a browser.
Other servers, however, are set to allow scripts to be located anywhere, so long as they are designated with the appropriate file extension (“.cgi”), while still other servers will treat any file as a script and attempt to execute it just as long as its world-executable bit is set.
Again, check with your system administrators to find out where you can locate CGI scripts and what, if any, special file extensions you need to use. Changing file extensions or file names will not break any of the scripts in the WebScripts archive, so long as you make sure to adjust any configuration variables as necessary to point to the correct (changed) file names.
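If your server insists on the “.cgi” extension, renaming a directory full of “.pl” files by hand gets tedious. The following is a sketch of one way to do it in a single step; "rename_to_cgi" is a made-up helper name, and it does not touch your configuration variables, which you would still need to update yourself.

```shell
# Hypothetical helper: rename every *.pl file in a directory to *.cgi.
rename_to_cgi() {
    for f in "$1"/*.pl; do
        [ -e "$f" ] || continue        # nothing matched; skip the pattern itself
        mv "$f" "${f%.pl}.cgi"         # swap the extension, keep the rest
    done
}

# Example: rename_to_cgi /path/to/your/cgi-bin
```

Files with other extensions are left alone, and permissions are unaffected by the rename.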
(12) I’m trying to set up one of your scripts on a (Windows NT / Windows 95 / other non-UNIX) system. How do I….
No offense intended, but if your specific problem isn’t already addressed in this list, I probably can’t answer it. My scripts are all designed and tested on a UNIX system; that is the only server environment with which I am familiar and the only one to which I have access. As much as I might like to be able to assist you in debugging whatever problems you’re having with my scripts on other systems, I’m unable to do so. There are many people using the scripts on non-standard servers, though, so the problem you’re facing, whatever it might be, has likely already been solved. Your best bet is to post a message on one of the WebScripts support forums, requesting assistance from someone familiar with your server type.
(13) Can I run the scripts through CGIwrap?
I’ve heard reports that some users are running various WebScripts scripts through CGIwrap without any difficulty. However, as I’ve never used it myself, I really can’t help much. If you’re having problems getting things to work, your best bet again is to post a message on the WebScripts Forum.
(14) I’m using your scripts on a very busy site, and the server load is getting a bit excessive. What can I do about it?
While Perl has many advantages as a language for CGI scripting, it also has disadvantages. Chief among those, of course, is the fact that it’s not a precompiled language. Each time a script is called, it has to be loaded and compiled by the server. On busy systems, that can eventually generate a significant load.
There really isn’t any “one size fits all” solution. One possibility is to rewrite the scripts in a precompiled language such as C++. (I’ve heard rumors of a few folk trying that, but I’ve no idea how well they’ve fared.) Another, of course, if you happen to have resources to burn, is to get your own server.
A more practical possibility which seems to have worked for at least a couple of users on busy sites is the installation of mod_perl. I’m not really familiar with it, myself, but it’s an add-on module for the Apache server, distributed separately from Apache itself. (Similar modules probably exist for other servers, as well.) Essentially, what this does is keep a compiled version of the script always in memory; each time the script is called, instead of reloading and recompiling, the server just calls the version it already has ready. Some changes are necessary in the scripts (to allow them to function properly with “use strict” in effect), but apparently, this can result in a major improvement in speed and a significant drop in server load.
And, of course, I’m always working to improve the efficiency of my scripts. If you run into problems, check the WebScripts site to see if a newer, more efficient version is available (or at least in progress). And feel free to ask on the forums for helpful tips from others about improving the performance of the specific script with which you’re working.