Apache Analyser
Getting useful statistics from your Apache logs
If you're running Apache or a similar web server that uses the Common Log Format, a shell script can pull quite a few quick statistics out of the logs. A standard server configuration writes an access_log and an error_log for each site. Even ISPs make these raw data files available to customers, and if you run your own server you should definitely keep and archive them.
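As a rough illustration of how a shell script sees such a log entry, the sketch below splits one made-up entry into whitespace-separated fields with awk. The sample line (its address, URLs and user agent) and the log_field helper are assumptions for illustration only. The entry is in the combined log format, which appends the referrer and user agent to the Common Log Format.

```shell
#!/bin/sh
# Hypothetical sample entry in Apache's combined log format (not from a real log).
sample='192.0.2.10 - - [10/Oct/2023:13:55:36 +0200] "GET /index.html HTTP/1.1" 200 2326 "http://www.example.com/start.html" "Mozilla/5.0"'

# log_field N LINE: print the Nth whitespace-separated field of LINE.
# awk splits on whitespace, so the quoted request spans fields 6 to 8.
log_field() {
    printf '%s\n' "$2" | awk -v n="$1" '{print $n}'
}

echo "client:   $(log_field 1 "$sample")"
echo "date:     $(log_field 4 "$sample")"    # still carries the leading [
echo "path:     $(log_field 7 "$sample")"
echo "status:   $(log_field 9 "$sample")"
echo "bytes:    $(log_field 10 "$sample")"
echo "referrer: $(log_field 11 "$sample")"   # only present in the combined format
```

Fields 4, 7, 10 and 11 are the ones a log analyzer typically needs for the date range, the requested page, the bytes sent and the referrer.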
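The following shell script summarizes an access log: the date range it covers, total hits and pageviews, bytes transferred, the ten most popular pages and the ten most common referrers. It calls two helper scripts, nicenumber and scriptbc, which it expects to find in $HOME/bin. The referrer report needs the combined log format, because the plain Common Log Format does not record referrers.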
#!/bin/sh
# webaccess - analyze an Apache-format access_log file, extracting
# useful and interesting statistics.
# Needs the helper scripts nicenumber and scriptbc at the paths below.
# Field numbers assume the combined log format ($11 is the referrer).

bytes_in_gb=1073741824   # 1024^3
bytes_in_mb=1048576      # 1024^2
scriptbc="$HOME/bin/scriptbc"
nicenumber="$HOME/bin/nicenumber"
host="intuitive.com"     # your own domain, used to filter out self-referrals

if [ $# -eq 0 ] || [ ! -f "$1" ] ; then
    echo "Usage: $(basename "$0") logfile" >&2
    exit 1
fi

# The timestamp is field 4 and starts with a "[" that we strip off.
firstdate="$(head -n 1 "$1" | awk '{print $4}' | sed 's/\[//')"
lastdate="$(tail -n 1 "$1" | awk '{print $4}' | sed 's/\[//')"

echo "Results of analyzing log file $1"
echo ""
echo " Start date: $(echo "$firstdate" | sed 's/:/ at /')"
echo " End date: $(echo "$lastdate" | sed 's/:/ at /')"

hits="$(wc -l < "$1" | sed 's/[^[:digit:]]//g')"
echo " Hits: $("$nicenumber" "$hits") (total accesses)"

# Pageviews: skip lines that mention text or image files (dots escaped
# so that ".gif" no longer matches any character followed by "gif").
pages="$(grep -ivE '\.(txt|gif|jpg|png)' "$1" | wc -l | sed 's/[^[:digit:]]//g')"
echo " Pageviews: $("$nicenumber" "$pages") (hits minus graphics)"

# Field 10 is the response size; print the sum as a plain integer.
totalbytes="$(awk '{sum+=$10} END {printf "%.0f\n", sum}' "$1")"
printf ' Transferred: %s bytes ' "$("$nicenumber" "$totalbytes")"
if [ "$totalbytes" -gt "$bytes_in_gb" ] ; then
    echo "($("$scriptbc" "$totalbytes" / "$bytes_in_gb") GB)"
elif [ "$totalbytes" -gt "$bytes_in_mb" ] ; then
    echo "($("$scriptbc" "$totalbytes" / "$bytes_in_mb") MB)"
else
    echo ""
fi

# now let's scrape the log file for some useful data:
echo ""
echo "The ten most popular pages were:"
awk '{print $7}' "$1" | grep -ivE '\.(gif|jpg|png)' | \
    sed 's/\/$//g' | sort | \
    uniq -c | sort -rn | head -n 10

echo ""
echo "The ten most common referrer URLs were:"
# Drop empty referrers (logged as "-", quotes included) and our own host.
awk '{print $11}' "$1" | \
    grep -vE "(^\"-\"\$|/www\.$host|/$host)" | \
    sort | uniq -c | sort -rn | head -n 10
echo ""
exit 0
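The script cannot run without its two helpers, and they are not shown on this page. The sketch below offers minimal stand-ins written as shell functions. They are assumptions, not the original nicenumber and scriptbc: this nicenumber only handles non-negative integers, and this scriptbc evaluates its arguments with awk to two decimal places.

```shell
#!/bin/sh
# Hypothetical stand-ins for the two helper scripts; NOT the originals.

# nicenumber N: insert thousands separators into a non-negative integer.
# The sed loop repeatedly puts a comma before the last three digits of the
# leading run of digits until no more substitutions are possible.
nicenumber() {
    printf '%s\n' "$1" | \
        sed -e ':a' -e 's/^\([0-9][0-9]*\)\([0-9]\{3\}\)/\1,\2/' -e 'ta'
}

# scriptbc EXPR...: evaluate an arithmetic expression to two decimal places.
# The arguments are pasted into an awk program, so pass trusted numbers only.
scriptbc() {
    awk "BEGIN { printf \"%.2f\\n\", $* }"
}

nicenumber 1234567                  # prints 1,234,567
scriptbc 3221225472 / 1073741824    # prints 3.00
```

To try the analyzer with these stand-ins, one option is to paste the two functions near the top of the script and set nicenumber="nicenumber" and scriptbc="scriptbc" in place of the $HOME/bin paths. The shell resolves the command name after expanding the variable, so the functions are found.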