The Apache web server is free and available on a wide variety of UNIX operating systems (as well as Windows). Thus, if you want to create a mirror of this site, you can use a free web server, and the CCP14 will help you with it.
It is presently the most popular web server on the internet and comes with convenient, enhanced functionality not easily available on other web servers. This includes a powerful redirection meta-language as well as automatic fixing of common spelling errors in requested URLs. Perl can also be compiled inside the server to provide extra performance for Perl based scripts.
There is a large base of on-line, web and newsgroup based help, combined with technical books (including an Apache for Dummies book), meaning problems can generally be solved. Linux on a PC is probably a better system to use than an SGI O2 as it has a wide user base, but an SGI O2 was chosen as this type of system is supported at the Daresbury Laboratory.
At the time of writing, Apache 1.3.0 is the latest version (1.3.1 has just been announced), but it seems to be buggy on an SGI O2 running IRIX 6.3. While the server compiled very easily using the ./configure command, a variety of errors occurred on attempting to run it. There are a number of complaints about this version of Apache, and 1.2.6 looks like a better tested version to be using. For CCP14 based applications, Apache 1.2.6 is quite adequate.
Be wary of using the 1.3.0 documentation as a guide when configuring
1.2.6 on the fly. The old 1.2.6 documentation is not easy to find
on the website. It is at
http://www.apache.org/docs-1.2/ or
http://www.hensa.ac.uk/mirrors/apache/docs-1.2/
The main Apache website is at http://www.apache.org.
On entering the website, it is possible to go to a closer mirror site, of which there are many. You can find one automatically using the Apache "find a closer mirror site" CGI script at http://www.apache.org/dyn/closer.cgi.
The Apache distribution directory is at http://www.apache.org/dist/. Download the apache_1.2.6.tar.gz (or .Z) file. (It is possible to get binaries).
"Mod_Throttle User specific throttling for Apache 1.2.x Cut to the chase I'll admit it - I've been web master for a number of adult sites, some professional and some personal. One thing you can be certain of with sites with adult content is that you're going to have traffic. The first site I set up on a personal basis sat behind a 28.8kbps modem link, what a mistake. As soon as the site was discovered, it was saturated. I upgraded to 128kbps frame relay link and swamped it within a week or so as well. After a bit, I had to upgrade to 384kbps, but no joy. At that point I had to grit my teeth and go all the way to T1. One thing about T1 links is that they are expensive, and I started selling web hosting to support my habit. Unfortunately, on the first night at T1 speeds, I got a rather irate phone call from my up stream provider. It seemed I was maxing out the link and was all my my lonesome swamping the Cisco 2500 that I was attached to. ...stuff deleted... In the end I wrote mod_throttle. Mod_throttle started as a series of hacks on mod_limit, but there's very little code left from it at this point. Mod_throttle's main function is to set average Bps limits for individual customers. It also supports a greatly enhanced version of mod_limit's status display, so you can actually see who's pushing the envelope. Mod_throttle works by inserting increasingly long delays in service when a user is over their limit. How well does it work? For me it works marvelously. It's reduced the memory/cpu loads on my server by two-thirds and keeps the users exactly in their limits with as little pain as possible."
The stages for compiling and installing the Apache 1.2.6 web server are: (Note: You will need root access to install the executable. You can get around this by having /usr/local owned by the user who administers this area, but Apache can only be started listening on the default port 80 as "superuser/root".)
To extract the apache distribution file type either
This creates an apache_1.2.6 subdirectory. Go into this directory and then go into the src subdirectory (cd apache_1.2.6/src).
Copy the mod_speling.c module that you downloaded into the src directory.
Configuring the Configuration file
Module rewrite_module mod_rewrite.o (The mod_rewrite.o file will be created during compilation of Apache.)
Module speling_module mod_speling.o
Run the Configure script in the src directory (type ./Configure); this sets up the Makefile ready for compiling.
Run make to compile the apache server and create the httpd executable.
The following may have to be done in "superuser/root" mode.
Assuming everything went nicely, run make install to install to:
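The build stages above can be sketched as a pair of shell functions. This is only an illustration under stated assumptions: the names enable_module and build_apache are made up here, the gzip version of the tarball and the downloaded mod_speling.c are assumed to sit in the current directory, and the Module lines are simply appended to Configuration when no uncommented copy exists (editing the file by hand is just as valid).

```shell
# Hypothetical helper: add a Module line to the Configuration file unless an
# identical, uncommented line is already present.
enable_module() {
    config_file=$1
    module_line=$2
    if ! grep -Fxq -- "$module_line" "$config_file"; then
        printf '%s\n' "$module_line" >> "$config_file"
    fi
}

# Hypothetical wrapper around the stages described above. The body runs in a
# subshell so that the cd does not change the caller's working directory.
build_apache() (
    tarball=${1:-apache_1.2.6.tar.gz}
    # Extract the distribution; this creates the apache_1.2.6 subdirectory.
    gzip -dc "$tarball" | tar xf - || exit 1
    # Copy the separately downloaded spelling module into the source tree.
    cp mod_speling.c apache_1.2.6/src/ || exit 1
    cd apache_1.2.6/src || exit 1
    enable_module Configuration 'Module rewrite_module mod_rewrite.o'
    enable_module Configuration 'Module speling_module mod_speling.o'
    # Generate the Makefile, then compile the httpd executable.
    ./Configure && make
)
```

Running build_apache as an ordinary user is enough for the compile; the install stage is deliberately left out of the sketch because it may need "superuser/root" mode.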
The CCP14 root web page area starts from a directory called:
/web_disc/ccp14/web_area
One thing to note is that this directory is not the root area for http://www.ccp14.ac.uk. Because the CCP14 server uses virtual domains, it is best to have a fall-back directory for very old web browsers that do not recognise the virtual domain name protocol. This is the equivalent of going to the sv1.ccp14.ac.uk web server (the real name of the O2 computer), where you are encouraged to upgrade to a relatively modern web browser but can also browse from this area into the CCP14 web area using your old browser.
Another decision is how to set up the virtual domains, whose domain names have to be entered into the DNS. The network administrator would normally implement this. The virtual domains have been divided in such a way that people who only want to mirror the roughly crystallographic part of the web server can do so.
The main configuration stages for apache 1.2.6 are:
Generally most of the Apache options are safe but several options have to be set before the server will run correctly:
Document Root, in the CCP14 case:
<Directory /usr/people/ccp14/web_area>
Setting up server-status
Enable the following, which allows you to look at problems with the web server in real time via a web interface at http://www.ccp14.ac.uk/server-status. In theory, however, this only works from within the computers/domains defined.
<Location /server-status>
SetHandler server-status
order deny,allow
deny from all
allow from .dl.ac.uk
allow from .ccp14.ac.uk
</Location>
This might also be a good time to tighten up the CGI security. The following shows how this was done, including the setup for the RIB system, which gives an example of how to restrict access by domain and/or password. (Note: /cgi-bin is aliased to these real directories in httpd.conf and srm.conf.)
<Directory /usr/local/etc/httpd/cgi-bin>
AllowOverride None
Options +ExecCGI -Indexes
#means people cannot browse directories
#but can run the CGI scripts
</Directory>

<Directory /usr/local/rib>
AllowOverride None
Options +ExecCGI -Indexes
#means people cannot browse directories
#but can run the CGI scripts
order allow,deny
allow from all
</Directory>

<Directory /usr/local/rib/cgi-bin/admin>
AllowOverride All
#Means you can define a .htaccess file
#that only people who supply a username and password
#can do things in the admin area
Options +ExecCGI +Indexes
#means directories can be browsed
#and can also run the CGI scripts
order deny,allow
deny from all
allow from .dl.ac.uk
allow from .ccp14.ac.uk
#In theory, only people from these
#domains can work in the RIB admin area
</Directory>

Then in the /usr/local/rib/cgi-bin/admin directory have a .htaccess file with the following contents. (Thanks to the RIB people at http://www.nhse.org for help with this .htaccess file.)
AuthUserFile /usr/local/etc/httpd/doobry-twang/doobry/.htpasswd
#not real directory
AuthGroupFile /dev/null
AuthName doobryribusername
#not real username - use something you can remember (6 to 8 characters is good)
AuthType Basic
<Limit GET>
require user doobryribusername
#not real username - use something you can remember (6 to 8 characters is good)
</Limit>
<Limit POST>
require user doobryribusername
#not real username - use something you can remember (6 to 8 characters is good)
</Limit>

Then in the /usr/local/etc/httpd/doobry-twang/doobry directory (give it a name you know of) have a .htpasswd file with the following contents. (Again, thanks to the RIB people at http://www.nhse.org for help with this file.)
doobryribusername:kdinsldkldfk

Leave the encrypted password blank if you just want to enter a username and no password (not recommended). Otherwise, generate your encrypted password with the UNIX crypt routine (the htpasswd program in the Apache support directory does this) and insert it into the .htpasswd file.
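As a rough sketch of producing such an entry by hand, the functions below call Perl's built-in crypt() and print a line in the username:encrypted-password layout shown above. The function names are made up for this illustration, Perl is assumed to be installed, and the two-character salt is something you pick yourself. Note that a password given on the command line is visible to other users in the process list, so this is for illustration only.

```shell
# Hypothetical helper: print one .htpasswd entry from a username and an
# already encrypted password.
format_htpasswd_line() {
    printf '%s:%s\n' "$1" "$2"
}

# Hypothetical helper: encrypt a password with the system crypt() routine via
# Perl. The second argument is a two-character salt of your choosing.
crypt_password() {
    perl -e 'print crypt($ARGV[0], $ARGV[1])' "$1" "$2"
}

# Usage (not run here; the directory is the placeholder used in the text):
#   format_htpasswd_line doobryribusername "$(crypt_password 'secret' 'ab')" \
#       >> /usr/local/etc/httpd/doobry-twang/doobry/.htpasswd
```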
Then, providing you have access permissions (and the username and password), you can go into http://www.ccp14.ac.uk/rib/cgi-bin/admin/RIB.pl to administer the RIB system.
#MOD SPELLING MODULE
CheckSpelling On
# Redirect people who access admin areas somewhere else.
RewriteEngine on
RewriteRule ^/bin/(.*) http://www.ccp14.ac.uk/bad-link.html [R]
RewriteRule ^/dev/(.*) http://www.ccp14.ac.uk/bad-link.html [R]
RewriteRule ^/etc/(.*) http://www.ccp14.ac.uk/bad-link.html [R]
RewriteRule ^/mirrorbin/(.*) http://www.ccp14.ac.uk/bad-link.html [R]
RewriteRule ^/lib/(.*) http://www.ccp14.ac.uk/bad-link.html [L,R]
<VirtualHost 193.62.124.194>
ServerAdmin l.cranswick@dl.ac.uk
DocumentRoot /web_disc/ccp14/web_area/web_live
ServerName www.ccp14.ac.uk
Alias /rib/ /usr/local/rib/
ScriptAlias /cgi-bin/ /usr/local/etc/httpd/cgi-bin/ccp14/
ScriptAlias /rib/cgi-bin/ /usr/local/rib/cgi-bin/
RewriteEngine on
RewriteRule ^//(.*) /$1 [R]
#the www.dl.ac.uk/CCP/CCP14 redirect seems to add an extra slash -
#so quick kludge to fix this up
RewriteRule ^/CCP/CCP14/(.*) http://www.ccp14.ac.uk/$1 [R]
#In case people still have the www.dl.ac.uk/CCP/CCP14 mindset
#The following is to redirect information mirrored under www.dl.ac.uk/CCP/CCP14
#Into their own virtual domains.
RewriteRule ^/ccp/ccp14/ftp-mirror/programming/(.*) http://programming.ccp14.ac.uk/ftp-mirror/programming/$1 [R]
RewriteRule ^//ccp/ccp14/ftp-mirror/programming/(.*) http://programming.ccp14.ac.uk/ftp-mirror/programming/$1 [R]
RewriteRule ^/ccp/web-mirrors/ill-hewat/(.*) http://icsd.ccp14.ac.uk/$1 [R]
RewriteRule ^//ccp/web-mirrors/ill-hewat/(.*) http://icsd.ccp14.ac.uk/$1 [R]
RewriteRule ^/ccp/web-mirrors/programming/gnu/(.*) http://gnu.ccp14.ac.uk/$1 [R]
RewriteRule ^//ccp/web-mirrors/programming/gnu/(.*) http://gnu.ccp14.ac.uk/$1 [R]
RewriteRule ^/ccp/web-mirrors/alife/santafe/(.*) http://www.santafe.edu/$1 [R]
RewriteRule ^//ccp/web-mirrors/alife/santafe/(.*) http://www.santafe.edu/$1 [R]
RewriteRule ^/ccp/ccp14/ftp-mirror/alife/santafe/(.*) ftp://ftp.santafe.edu/$1 [R]
RewriteRule ^//ccp/ccp14/ftp-mirror/alife/santafe/(.*) ftp://ftp.santafe.edu/$1 [R]
RewriteRule ^/ccp/web-mirrors/alife/(.*) http://alife.ccp14.ac.uk/$1 [R]
RewriteRule ^//ccp/web-mirrors/alife/(.*) http://alife.ccp14.ac.uk/$1 [R]
RewriteRule ^/ccp/ccp14/ftp-mirror/programming/(.*) http://programming.ccp14.ac.uk/ftp-mirror/programming/$1 [R]
RewriteRule ^//ccp/ccp14/ftp-mirror/programming/(.*) http://programming.ccp14.ac.uk/ftp-mirror/programming/$1 [R]
RewriteRule ^/ccp/ccp14/ftp-mirror/alife/(.*) http://alife.ccp14.ac.uk/ftp-mirror/alife/$1 [R]
RewriteRule ^//ccp/ccp14/ftp-mirror/alife/(.*) http://alife.ccp14.ac.uk/ftp-mirror/alife/$1 [R]
RewriteRule ^/ccp/web-mirrors/programming/nhse/(.*) http://www.nhse.org/$1 [R]
RewriteRule ^//ccp/web-mirrors/programming/nhse/(.*) http://www.nhse.org/$1 [R]
RewriteRule ^/ccp/web-mirrors/programming/netlib/(.*) http://netlib.ccp14.ac.uk/$1 [R]
RewriteRule ^//ccp/web-mirrors/programming/netlib/(.*) http://netlib.ccp14.ac.uk/$1 [R]
RewriteRule ^/ccp/web-mirrors/programming/(.*) http://programming.ccp14.ac.uk/$1 [R]
RewriteRule ^//ccp/web-mirrors/programming/(.*) http://programming.ccp14.ac.uk/$1 [R]
RewriteRule ^/ccp/ccp14/ftp-mirror/programming(.*) http://programming.ccp14.ac.uk/ftp-mirror/programming$1 [L]
RewriteRule ^//ccp/ccp14/ftp-mirror/programming(.*) http://programming.ccp14.ac.uk/ftp-mirror/programming$1 [L,R]
ErrorLog logs/www.ccp14.ac.uk-error_log
TransferLog logs/www.ccp14.ac.uk-access_log
</VirtualHost>
<VirtualHost 193.62.124.194>
ServerAdmin l.cranswick@dl.ac.uk
DocumentRoot /web_disc/ccp14/web_area/netlib
ServerName netlib.ccp14.ac.uk
ScriptAlias /cgi-bin/ /usr/local/etc/httpd/cgi-bin/netlib/
ErrorLog logs/netlib.ccp14.ac.uk-error_log
TransferLog logs/netlib.ccp14.ac.uk-access_log
</VirtualHost>

<VirtualHost 193.62.124.194>
ServerAdmin l.cranswick@dl.ac.uk
DocumentRoot /web_disc/ccp14/web_area/gnu
ServerName gnu.ccp14.ac.uk
ScriptAlias /cgi-bin/ /usr/local/etc/httpd/cgi-bin/gnu/
ErrorLog logs/gnu.ccp14.ac.uk-error_log
TransferLog logs/gnu.ccp14.ac.uk-access_log
</VirtualHost>

<VirtualHost 193.62.124.194>
ServerAdmin l.cranswick@dl.ac.uk
DocumentRoot /web_disc/ccp14/web_area/alife
ServerName alife.ccp14.ac.uk
ScriptAlias /cgi-bin/ /usr/local/etc/httpd/cgi-bin/alife/
ErrorLog logs/alife.ccp14.ac.uk-error_log
TransferLog logs/alife.ccp14.ac.uk-access_log
</VirtualHost>
<VirtualHost 193.62.124.194>
ServerAdmin l.cranswick@dl.ac.uk
DocumentRoot /web_disc/ccp14/web_area/programming
ServerName programming.ccp14.ac.uk
RewriteEngine on
RewriteRule ^/cgi-bin/netlibget.pl/(.*) http://www.netlib.org/cgi-bin/netlibget.pl/$1 [R]
RewriteRule ^/netlib/(.*) http://netlib.ccp14.ac.uk/$1 [R,L]
ScriptAlias /cgi-bin/ /usr/local/etc/httpd/cgi-bin/programming/
ErrorLog logs/programming.ccp14.ac.uk-error_log
TransferLog logs/programming.ccp14.ac.uk-access_log
</VirtualHost>
<VirtualHost 193.62.124.194>
ServerAdmin l.cranswick@dl.ac.uk
DocumentRoot /web_disc/ccp14/web_area/icsd
ServerName icsd.ccp14.ac.uk
#rewrite settings are not inherited by virtual hosts,
#so the engine has to be switched on here for the rule to apply
RewriteEngine on
RewriteRule ^/dif/(.*) /$1 [R,L]
ScriptAlias /cgi-bin/ /usr/local/etc/httpd/cgi-bin/icsd/
ErrorLog logs/icsd.ccp14.ac.uk-error_log
TransferLog logs/icsd.ccp14.ac.uk-access_log
</VirtualHost>
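Several of the virtual hosts above (netlib, gnu, alife) differ only in their name. As a rough sketch, such repetitive blocks could be generated by a small shell function rather than typed by hand. The name print_vhost is made up for this illustration, and the output would still have to be pasted into httpd.conf and checked; hosts with their own rewrite rules need those added by hand.

```shell
# Hypothetical generator: print a plain virtual host block for one CCP14
# sub-domain, following the pattern of the netlib, gnu and alife entries.
print_vhost() {
    name=$1
    cat <<EOF
<VirtualHost 193.62.124.194>
ServerAdmin l.cranswick@dl.ac.uk
DocumentRoot /web_disc/ccp14/web_area/$name
ServerName $name.ccp14.ac.uk
ScriptAlias /cgi-bin/ /usr/local/etc/httpd/cgi-bin/$name/
ErrorLog logs/$name.ccp14.ac.uk-error_log
TransferLog logs/$name.ccp14.ac.uk-access_log
</VirtualHost>
EOF
}

# Usage (not run here):
#   for name in netlib gnu alife; do print_vhost "$name"; echo; done
```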
The mime.types file sets the file extensions that the Apache web server will properly recognise. Normally the file is fine as supplied, but a few entries seem to be missing and have to be added manually.
text/html html htm
application/x-httpd-cgi pl
application/x-compress Z
application/x-gzip gz
application/octet-stream bin dms lha lzh exe class rd sd raw
text/plain txt text
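Adding the missing entries can be sketched with a small shell function that only appends a line when the type has no extensions mapped yet. The name ensure_mime_type is made up for this illustration, and the path to mime.types is an assumption that depends on where the server was installed.

```shell
# Hypothetical helper: make sure a MIME type has at least one extension
# mapped in the given mime.types file, appending an entry if it does not.
ensure_mime_type() {
    mime_file=$1
    mime_type=$2
    extensions=$3
    # A matching line must start with the exact type and list an extension.
    if awk -v t="$mime_type" '$1 == t && NF > 1 { found = 1 } END { exit !found }' "$mime_file"; then
        echo "present: $mime_type"
    else
        printf '%s %s\n' "$mime_type" "$extensions" >> "$mime_file"
        echo "added: $mime_type"
    fi
}

# Usage (not run here; the conf path is assumed):
#   ensure_mime_type /usr/local/etc/httpd/conf/mime.types application/x-gzip gz
```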
#1) plain text
ErrorDocument 500 http://www.ccp14.ac.uk/bad-link.html
# 2) local redirects
ErrorDocument 404 http://www.ccp14.ac.uk/bad-link.html
# 3) external redirects
ErrorDocument 402 http://www.ccp14.ac.uk/bad-link.html
As http://www.ccp14.ac.uk/ was originally at http://www.dl.ac.uk/CCP/CCP14/, it is important that everything redirects. All decent web-servers have a redirect option and it is just a matter of requesting the web-administrator of that machine to set the redirects. http://www.dl.ac.uk/CCP/CCP14/ is a Netscape based server and has no problem doing this.
The same goes for an old address on the gserv1.dl.ac.uk machine, so that it also redirects correctly: http://gserv1.dl.ac.uk/CCP/CCP14/
To Stop:
kill -TERM `cat /usr/local/etc/httpd/logs/httpd.pid`
After it has stopped, to start:
/usr/local/etc/httpd/httpd
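The stop and start commands above can be wrapped in two small shell functions, sketched below. The function names and the HTTPD_ROOT variable are made up for this illustration; the paths are the ones used in the text. Starting the server on port 80 still has to be done as "superuser/root".

```shell
# Assumed install location, taken from the paths shown above.
HTTPD_ROOT=${HTTPD_ROOT:-/usr/local/etc/httpd}

# Hypothetical helper: send TERM to the parent httpd process recorded in the
# pid file, complaining if there is no pid file to read.
stop_apache() {
    pid_file=$HTTPD_ROOT/logs/httpd.pid
    if [ ! -f "$pid_file" ]; then
        echo "no pid file at $pid_file; is the server running?" >&2
        return 1
    fi
    kill -TERM "$(cat "$pid_file")"
}

# Hypothetical helper: start the server from its install directory.
start_apache() {
    "$HTTPD_ROOT/httpd"
}
```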
For information on this, refer to: Installing Apache 1.3.3 - Automatic startup of the server
http://www.ccp14.ac.uk/server-status
Another method is to telnet to the server and check whether it is running that way.
telnet www.ccp14.ac.uk 80 [RETURN] (80 stands for port 80)
HEAD / HTTP/1.0 (type [RETURN] twice)

You should get output akin to:
HTTP/1.1 200 OK
Date: Thu, 06 Aug 1998 16:42:15 GMT
Server: Apache/1.2.6
Last-Modified: Fri, 10 Jul 1998 14:03:06 GMT
ETag: "41d6d8-12be-35a61f1a"
Content-Length: 4798
Accept-Ranges: bytes
Connection: close
Content-Type: text/html

Connection closed by foreign host.
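The same check can be scripted. The sketch below only parses a response like the one above, pulling out the status code and the Server header; the function names are made up for this illustration. Sending the request non-interactively is left as a comment because it assumes a netcat-style nc command, which may not be installed.

```shell
# Hypothetical helper: print the numeric status code from an HTTP response
# read on standard input (the second field of the first line).
http_status() {
    awk 'NR == 1 { print $2; exit }'
}

# Hypothetical helper: print the value of the Server header, with the
# carriage returns that HTTP puts at the end of each line removed.
http_server() {
    tr -d '\r' | awk -F': ' 'tolower($1) == "server" { print $2; exit }'
}

# Usage (not run here; assumes nc is available):
#   printf 'HEAD / HTTP/1.0\r\n\r\n' | nc www.ccp14.ac.uk 80 | http_status
```

A status of 200 together with a Server line naming Apache suggests the server is up and answering requests.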