[CCP14 Home: (Frames | No Frames)]
CCP14 Mirrors: [UK] | [CA] | [US] | [AU]

(This Webpage Page in No Frames Mode)

Collaborative Computational Project Number 14

for Single Crystal and Powder Diffraction

CCP14

Generating Monthly Web Stats using http-analyze

The CCP14 Homepage is at http://www.ccp14.ac.uk

[Back to CCP14 Web/Config Main Page]

[Why use httpanalyze] | [Where to get httpanalyze] | [Installing httpanalyze]
[Setting up the httpanalyze Configuration File] | [Setting up a httpanalyze Script File] | [Examining the webstats]


The following was written for installation on an SGI O2 running IRIX but should work on most UNIX systems with very little change.

Why Use httpanalyze

Http-analyze is free for academic use but is worth paying for: it is very high-quality software, and payment enables the author to keep churning out updates. CCP14 has a registered commercial copy, and some of the utilities used in the following scripts are only available to registered users (though alternative freeware options are given).

Http-analyze is fast and gives very nice graphs and good stats to keep people happy and amused. Nuff Said!


Where to get httpanalyze

Http-analyze is obtainable as binaries via:

Help files.


Compiling and Installing httpanalyze


Setting up the httpanalyze Configuration File

Two config files have been created. One for all logs, and another "filtered" config file that primarily shows academic usage of the site. Will elaborate another time on the nuances.
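The "filtered" view depends on a log from which commercial domains and unresolved (numeric) addresses have been removed; the script further down this page does that with grep -v. A minimal sketch of the same idea in POSIX sh is given here. The function name filter_academic is hypothetical, and anchoring the patterns to the host field at the start of each line is an assumption added for this sketch, not something the original script does.

```sh
#!/bin/sh
# Sketch only: drop requests from commercial domains and from hosts that
# were never resolved (the host field still ends in a digit).
# Assumes Common Log Format lines whose ident and user fields are "- -".
# filter_academic is a hypothetical name, not part of http-analyze.
filter_academic() {
    grep -E -v -i \
        -e '^[^ ]*\.(com|net|co\.uk|co\.jp|com\.au|com\.br) - - \[' \
        -e '^[^ ]*[0-9] - - \[' \
        "$@"
}
```

It could be used as: filter_academic october2002_resolved.log > october2002_resolved.log_educational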


Setting up a httpanalyze Script File

Originally, this was run manually once per month, but setting it up in a script can save hassle. The months are still changed manually and checked, as automated scripts can occasionally go berserk. The script takes a while to run because of the resolving of IP addresses: it may take around 14 hours to resolve the IP addresses to names.
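Resolution time grows with the number of distinct addresses in the log, since each one needs its own reverse DNS lookup. A rough way to gauge the size of the job before starting it is to count those addresses. This is only a sketch: the function name count_unresolved_hosts is hypothetical, and it assumes the client host is the first field of each log line and that unresolved hosts are dotted-quad IPv4 addresses.

```sh
#!/bin/sh
# Sketch only: count the distinct numeric (unresolved) hosts in a log.
# Assumes the host is the first whitespace-separated field.
# count_unresolved_hosts is a hypothetical helper name.
count_unresolved_hosts() {
    awk '$1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { print $1 }' "$@" |
        sort -u | wc -l | tr -d ' '
}
```

It could be run on the unresolved monthly log, e.g. count_unresolved_hosts october2002_unresolved.log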

To rotate the logs and call the scripts just after midnight, an "at" script like the following can do the job:

Apache monthly rotate weblogs script

#!/sbin/csh

# set DATE=`date +Date_%y_%m_%d_`
# DATE=`date '+Date_%y_%m_%d_Time_%H_%M__'`
# DATE=`date '+Date_%y_august'`
# echo ${DATE}

#must restart apache after moving logs or bad things happen
# Refer: http://httpd.apache.org/docs/logs.html#rotation

mkdir /usr/local/apache/logs/2002_october
mv /usr/local/apache/logs/*_log /usr/local/apache/logs/2002_october

# /usr/local/apache/bin/apachectl configtest
/usr/local/apache/bin/apachectl stop  > /web_disc/ccp14/cron_scripts/logswapweb.log
sleep 4
/usr/local/apache/bin/apachectl start >> /web_disc/ccp14/cron_scripts/logswapweb.log
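The scripts on this page hard-code the month in three forms: the directory name (2002_october), the string searched for in the access log (Oct/2002), and the suffix of the per-month log files (october2002). If these were derived instead of edited by hand, it could look something like the POSIX sh sketch here. The function and variable names (month_tags, prev_month, LOG_DIR_NAME, GREP_KEY, FILE_TAG) are hypothetical, and the result should still be checked by eye, for the reasons given above.

```sh
#!/bin/sh
# Sketch only: derive the month names used by the scripts.
# month_tags YEAR MM sets three hypothetical variables:
#   LOG_DIR_NAME  e.g. 2002_october  (directory the logs are moved to)
#   GREP_KEY      e.g. Oct/2002      (month string inside the access log)
#   FILE_TAG      e.g. october2002   (suffix of the per-month log files)
month_tags() {
    case $2 in
        01) long=january   short=Jan ;;
        02) long=february  short=Feb ;;
        03) long=march     short=Mar ;;
        04) long=april     short=Apr ;;
        05) long=may       short=May ;;
        06) long=june      short=Jun ;;
        07) long=july      short=Jul ;;
        08) long=august    short=Aug ;;
        09) long=september short=Sep ;;
        10) long=october   short=Oct ;;
        11) long=november  short=Nov ;;
        12) long=december  short=Dec ;;
        *)  echo "month_tags: bad month '$2'" >&2; return 1 ;;
    esac
    LOG_DIR_NAME="${1}_${long}"
    GREP_KEY="${short}/${1}"
    FILE_TAG="${long}${1}"
}

# prev_month YEAR MM prints "YYYY MM" for the month before the given one.
# A run just after midnight on the 1st handles the month that just ended.
prev_month() {
    y=$1
    m=${2#0}                      # strip a leading zero so 08 is not octal
    if [ "$m" -eq 1 ]; then
        y=$((y - 1)); m=12
    else
        m=$((m - 1))
    fi
    printf '%d %02d\n' "$y" "$m"
}
```

For example, month_tags 2002 10 sets LOG_DIR_NAME to 2002_october, GREP_KEY to Oct/2002 and FILE_TAG to october2002.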


Weblogs script running http-analyze

Note: to compile ipresolve on Linux, use:

cc ipresolve.c -o ipresolve -I/usr/include/gdbm/ -L/usr/lib -lgdbm

Otherwise, link errors like the following can occur:

/tmp/ccgRRQfw.o: In function `storeIP':
/tmp/ccgRRQfw.o(.text+0x35): undefined reference to `dbm_store'
/tmp/ccgRRQfw.o: In function `deleteIP':
/tmp/ccgRRQfw.o(.text+0x74): undefined reference to `dbm_delete'
/tmp/ccgRRQfw.o: In function `lookupIP':
/tmp/ccgRRQfw.o(.text+0xb2): undefined reference to `dbm_fetch'
/tmp/ccgRRQfw.o: In function `openDB':
/tmp/ccgRRQfw.o(.text+0xd7): undefined reference to `dbm_open'
/tmp/ccgRRQfw.o: In function `closeDB':
/tmp/ccgRRQfw.o(.text+0xf1): undefined reference to `dbm_close'


#!/sbin/csh

# set DATE=`date +Date_%y_%m_%d_`
# DATE=`date '+Date_%y_%m_%d_Time_%H_%M__'`
# DATE=`date '+Date_%y_august'`
# echo ${DATE}

# mkdir /usr/local/apache/logs/2002_october

grep 'Oct/2002' /usr/local/apache/logs/2002_october/access_log >>  \
   /web_disc/ccp14/web-logs/october2002_unresolved.log

/usr/local/bin/ipresolve  \
   /web_disc/ccp14/web-logs/october2002_unresolved.log > \
   /web_disc/ccp14/web-logs/october2002_resolved.log

# ipresolve -h   for help
# ipresolve -v   verbose - gives feedback that something is happening.

#/usr/local/apache/bin/logresolve <  \
#   /web_disc/ccp14/web-logs/october2002_unresolved.log > \
#   /web_disc/ccp14/web-logs/october2002_resolved.log
#

#/usr/local/bin/http-analyze -I Nov/99 -E Dec/99 -m3f -g -S www.ccp14.ac.uk  \
#   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
#   -o /web_disc/ccp14/web_area/stats \
#   /web_disc/ccp14/web-logs/october2002_resolved.log

/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats \
   /web_disc/ccp14/web-logs/october2002_resolved.log


# Filtered page: excludes commercial (non-educational) domains and unresolved IP addresses.
#  http-analyze options
# -vvvm3f  is verbose
# -m3f    is not verbose

grep -v -i -e ".com - - \[" -e ".net - - \["  -e ".co.uk - - \[" \
           -e ".co.jp - - \[" -e ".com.au - - \[" -e ".com.br - - \[" \
           -e "0 - - \["   -e "1 - - \["   -e "2 - - \[" -e "3 - - \[" -e "4 - - \["  \
           -e "5 - - \["   -e "6 - - \["   -e "7 - - \[" -e "8 - - \[" -e "9 - - \["  \
     /web_disc/ccp14/web-logs/october2002_resolved.log > \
     /web_disc/ccp14/web-logs/filtered/october2002_resolved.log_educational

#/usr/local/bin/http-analyze -I Nov/99 -E Dec/99 -m3f -g -S www.ccp14.ac.uk  \
#   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file.filtered \
#   -o /web_disc/ccp14/web_area/stats/filtered   \
#  /web_disc/ccp14/web-logs/filtered/october2002_resolved.log_educational

/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file.filtered \
   -o /web_disc/ccp14/web_area/stats/filtered   \
  /web_disc/ccp14/web-logs/filtered/october2002_resolved.log_educational


grep '/crystals/' /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/crystals_web_october2002.log

grep -v '/llnlrupp/' /web_disc/ccp14/web-logs/custom_logs_dir/crystals_web_october2002.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/crystals_web_october2002a.log


grep '/web-mirrors/zefsa/' /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/zefsa_web_october2002.log

grep '/ftp-mirror/programming/tcltk/' /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/tcltk_web_october2002.log

grep '/lmgp' /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/lmgp_web_october2002.log

grep '/platon-spek/' /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/spek_web_october2002.log

grep -E  'people/lachlan/' /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/lachlan_web_october2002.log

grep -E  '/tutorial/' /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/tutorials_web_october2002.log

grep -E  '/solution/' /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/solutions_web_october2002.log

grep -E  '/crys-r-shirley/' /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/crysfire_download_october2002.log

grep -E  '/gsas/' /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/gsas_download_october2002.log

grep -E  '/fullprof/' /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/fullprof_download_october2002.log

grep -E  '/ccp14admin/' /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/ccp14admin_october2002.log

grep -E  .ac.uk\ \-\ \- /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/acuk_october2002.log

grep -E  '/wulffman/' /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/wulffman_october2002.log

grep -E  '/briantoby/' /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/briantoby_october2002.log

grep -E  'favicon.ico' /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/favicon_october2002.log

grep -E  '/ccp/web-mirrors/dbws/downloads/' \
    /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/dbws_october2002.log

grep -E  '/objcryst/' /web_disc/ccp14/web-logs/october2002_resolved.log > \
   /web_disc/ccp14/web-logs/custom_logs_dir/objcryst_october2002.log


/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/crystals   \
  /web_disc/ccp14/web-logs/custom_logs_dir/crystals_web_october2002a.log

/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/zefsa   \
  /web_disc/ccp14/web-logs/custom_logs_dir/zefsa_web_october2002.log

/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/tcltk   \
  /web_disc/ccp14/web-logs/custom_logs_dir/tcltk_web_october2002.log

/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/lmgp   \
  /web_disc/ccp14/web-logs/custom_logs_dir/lmgp_web_october2002.log

/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/spek   \
  /web_disc/ccp14/web-logs/custom_logs_dir/spek_web_october2002.log

/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/statslachlan   \
  /web_disc/ccp14/web-logs/custom_logs_dir/lachlan_web_october2002.log


/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/statstutorials   \
  /web_disc/ccp14/web-logs/custom_logs_dir/tutorials_web_october2002.log

/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/statssolutions   \
  /web_disc/ccp14/web-logs/custom_logs_dir/solutions_web_october2002.log


/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/crysfire   \
  /web_disc/ccp14/web-logs/custom_logs_dir/crysfire_download_october2002.log

/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/gsas   \
  /web_disc/ccp14/web-logs/custom_logs_dir/gsas_download_october2002.log


/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/fullprof   \
  /web_disc/ccp14/web-logs/custom_logs_dir/fullprof_download_october2002.log

/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/ccp14admin   \
  /web_disc/ccp14/web-logs/custom_logs_dir/ccp14admin_october2002.log

/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file.acuk \
   -o /web_disc/ccp14/web_area/stats/acuk   \
  /web_disc/ccp14/web-logs/custom_logs_dir/acuk_october2002.log

/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/toby   \
  /web_disc/ccp14/web-logs/custom_logs_dir/briantoby_october2002.log

/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/wulffman   \
  /web_disc/ccp14/web-logs/custom_logs_dir/wulffman_october2002.log



/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/favicon   \
  /web_disc/ccp14/web-logs/custom_logs_dir/favicon_october2002.log

/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/dbws   \
  /web_disc/ccp14/web-logs/custom_logs_dir/dbws_october2002.log

/usr/local/bin/http-analyze -m3f -g -S www.ccp14.ac.uk  \
   -c /web_disc/ccp14/http-analyse/http-analyze2.3/config.file \
   -o /web_disc/ccp14/web_area/stats/objcryst   \
  /web_disc/ccp14/web-logs/custom_logs_dir/objcryst_october2002.log
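The script above repeats the same two steps for every section of the site: grep the section's requests out of the resolved log, then run http-analyze on the result. One possible way to shorten it is a table-driven loop, sketched here in POSIX sh and not in the csh of the original. The function names, the SECTIONS table, the HTTP_ANALYZE variable and the uniform "<name>_<tag>.log" file naming are all hypothetical; the http-analyze options are the ones already used above. Note that grep -F matches fixed strings, whereas the original patterns are regular expressions.

```sh
#!/bin/sh
# Sketch only: table-driven version of the per-section steps.
# Each SECTIONS line is "<stats-dir-name> <fixed-string pattern>".
# Only a few sections are listed; the table is hypothetical.
SECTIONS='zefsa /web-mirrors/zefsa/
tcltk /ftp-mirror/programming/tcltk/
gsas /gsas/
fullprof /fullprof/'

HTTP_ANALYZE=${HTTP_ANALYZE:-/usr/local/bin/http-analyze}

# split_sections RESOLVED_LOG OUT_DIR TAG
# Writes OUT_DIR/<name>_<TAG>.log for every section in the table.
split_sections() {
    resolved=$1; out_dir=$2; tag=$3
    printf '%s\n' "$SECTIONS" | while read -r name pattern; do
        # grep exits 1 when a section had no hits; that is not an error here.
        grep -F -- "$pattern" "$resolved" > "$out_dir/${name}_${tag}.log" || true
    done
}

# analyze_sections OUT_DIR TAG STATS_ROOT CONFIG_FILE
# Runs http-analyze once per section, with the options used on this page.
analyze_sections() {
    out_dir=$1; tag=$2; stats_root=$3; config=$4
    printf '%s\n' "$SECTIONS" | while read -r name pattern; do
        "$HTTP_ANALYZE" -m3f -g -S www.ccp14.ac.uk \
            -c "$config" -o "$stats_root/$name" \
            "$out_dir/${name}_${tag}.log"
    done
}
```

Adding a section then means adding one line to the table instead of two more command blocks.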


Examining the webstats

The webstats are available for viewing. How they are interpreted is up to the beholder.


Disorder Errors in Redhat Linux

Date: Mon, 06 Jan 2003 10:30:08 +0100
From: Stefan Stapelberg [stefan@RENT-A-GURU.DE]
Organization: RENT-A-GURU Heidelberg
To: l.m.d.cranswick@dl.ac.uk
Subject: Re: http-analyze Bug Report

Hello Lachlan,

> "Disorder detected" errors.
> 
> With http - I am moving from an SGI IRIX server to a Redhat Linux server.
> (this is for www.ccp14.ac.uk)
> 
> When I try to run the latest (and earlier) versions of http-analyze on
> the apache 2.0x weblog files - I get "Disorder detected" errors.
> 
> But using the exact same log file on the old SGI machine - no problems.

disorder is most often caused by multi-threaded web servers, which log
requests with older time-stamps after requests with newer timestamps.
If this appears on the same logfile which makes no problems on IRIX,
then the C library functions (probably mktime) of Linux are buggy.
This is nothing new for me. Linux is just not comparable to IRIX
regarding quality of the software IMHO.

If you repeat the -v option several times, debugging output will
become more verbose and http-analyze shows the "ticks" created from
the timestamp of a logfile entry (warning: output data can become
very huge for big logfiles).

Since the disorder appears only at certain day wraps, I wouldn't
care if it happens only at a few entries. It can be a problem if
it appears every minute or so.

Furthermore, results are not necessarily wrong by a disorder.
The entries with older dates will be counted in the usual manner.
They only don't get counted correctly in the "Average Hits by
hour" and the "Top XX hours/minutes/seconds of the period"
tables.

> Is there something wonky with Redhat Linux that there is a workaround for?
> Command I am using and error output is below.  I have also tried renewing all
> the config files as well (the scripts all point to http-analyze2.3/config.file)
> But this should be for the latest version of http-analyze.

If you really care to have correct tables of the above values,
you could use the ha-sort utility to chronologically sort the
data stream before analyzing it. The dis-advantage of ha-sort
is that it needs to work on the raw, uncompressed logfiles
(the next version of http-analyze will directly process
compressed logfiles without having to run gzip).

With Netscape web servers on IRIX I get sometimes such a disorder
caused by the multi-threading server when there is heavy access.
I personally just throw away the warning message, since this
happens only with very few logfile entries.

Hope this helps. Please let me know if you want to try ha-sort.

Best wishes,
Stefan

-- 
Stefan Stapelberg         RENT-A-GURU® INET-TV®/INET-RADIO® NETSTORE®
RAG3-RIPE                 Neuer Weg 16 · D-69118 Heidelberg · Germany
http://www.netstore.de/   Phone: +49.6221.803802 · Fax: +49.6221.803899
Todays spezial offer:     Microsoft spel chekar vor sail, worgs grate!!
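
The idea of sorting a log chronologically before analysis, as suggested in the reply above, can be illustrated without ha-sort itself. What follows is a sketch in POSIX sh and awk; it is not ha-sort and makes no claim about how ha-sort works. The function name sort_log_by_time is hypothetical. It assumes Common Log Format lines with the timestamp in the fourth field, and that all entries share the same timezone offset, which it ignores.

```sh
#!/bin/sh
# Sketch only: sort Common Log Format lines by their timestamp.
# sort_log_by_time is a hypothetical name; this is not ha-sort.
sort_log_by_time() {
    awk '
    BEGIN {
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) month[names[i]] = sprintf("%02d", i)
    }
    {
        # $4 looks like "[06/Jan/2003:10:30:08"; drop the "[" and split it.
        split(substr($4, 2), t, "[/:]")
        # Key: YYYYMMDDhhmmss plus the input line number, so that entries
        # with identical timestamps keep their original order.
        printf "%s%s%s%s%s%s%010d\t%s\n", \
            t[3], month[t[2]], t[1], t[4], t[5], t[6], NR, $0
    }' "$@" | LC_ALL=C sort | cut -f2-
}
```

The sorted stream could then be written to a new file and passed to http-analyze in place of the original log.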



If you have any queries or comments, please feel free to contact the CCP14