Build better blogs with Linux

Opinion and Analysis

Your web server is storing a rich lode of data on who has visited you, from where, which pages they visited and so much more. You can find out what are the most popular pages on your site, the Google search terms being used to find you, and on and on.

One of the best-known log file analysis tools is analog, which natively parses log files from the Apache web server as well as a variety of other common log file formats.

Begin with an overview of your web site statistics with a command like this:
analog -A /etc/httpd/logs/access_log > /tmp/analog.html && firefox /tmp/analog.html

This causes analog to parse the Apache log at /etc/httpd/logs/access_log – the path may be different on your system. It generates an HTML report, which is then opened in the Firefox web browser.

The output gives you only overview numbers, but they are interesting in their own right. You will see how many requests succeeded and how many failed, how many distinct hosts visited you, and how much data you transferred, among other items.
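For a sense of where overview numbers like these come from, here is a rough sketch that derives similar counts straight from the log with awk. It assumes the default Apache log format, where the status code is the ninth whitespace-separated field and the response size the tenth. The sample log lines are made up for illustration, and treating 2xx and 3xx responses as "successful" is a simplification; analog applies its own rules, so its figures may not match exactly.

```shell
#!/bin/sh
# Hypothetical sample log in the default Apache format; on a real server,
# point LOG at your own access log instead (e.g. /etc/httpd/logs/access_log).
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
10.0.0.1 - - [10/Feb/2012:10:00:01 +1100] "GET /index.html HTTP/1.1" 200 5120
10.0.0.2 - - [10/Feb/2012:10:00:05 +1100] "GET /about.html HTTP/1.1" 200 2048
10.0.0.1 - - [10/Feb/2012:10:01:12 +1100] "GET /blog/ HTTP/1.1" 200 8192
10.0.0.3 - - [10/Feb/2012:10:02:30 +1100] "GET /missing HTTP/1.1" 404 512
10.0.0.1 - - [10/Feb/2012:10:03:44 +1100] "GET /index.html HTTP/1.1" 200 5120
EOF

# One pass over the log: classify by status code, sum bytes, collect hosts.
summary=$(awk '
    $9 ~ /^[23]/      { ok++ }           # 2xx/3xx counted as successful (assumption)
    $9 ~ /^[45]/      { failed++ }       # 4xx/5xx counted as failed
    $10 ~ /^[0-9]+$/  { bytes += $10 }   # size is "-" when no body was sent
                      { hosts[$1] = 1 }  # remember each distinct address
    END {
        n = 0
        for (h in hosts) n++
        printf "%d %d %d %d", ok, failed, n, bytes
    }
' "$LOG")

# Split the four numbers back into shell variables.
set -- $summary
ok=$1
failed=$2
hosts=$3
bytes=$4

echo "Successful requests: $ok"
echo "Failed requests:     $failed"
echo "Distinct hosts:      $hosts"
echo "Bytes transferred:   $bytes"

rm -f "$LOG"
```

A single awk pass is used here so that the log is read only once, which matters when access logs grow large.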

Other reports analog can give you include search queries – namely the terms entered into search engines which led visitors to you – as well as the Request Report, which identifies the most popular files and pages on your site. There’s a lot of punch in analog; check out the online documentation for inspiration and guidance.

Actually, for some things you don’t need a package like analog at all if you’re just after some quick data. The regular assortment of Linux text tools – cut, sort, grep, uniq, sed and awk – is available for your pleasure.

You can look up all the computers that have visited you with a simple command like this one:
cut -d " " -f1 /etc/httpd/logs/access_log

(This works provided you are using the default Apache log file format, which stamps the front of each line with an IP address. If you are using a customised log format you will need to modify the command appropriately.)

That command uses a basic Linux staple, cut, which essentially cuts fields out of text files. It doesn’t change the text file itself; it just displays the extracted fields. The parameters above say to retrieve the first field (-f1) and that the space character delimits fields (-d " ").
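On its own, cut prints one address per request, so repeat visitors appear many times. As a minimal sketch, piping the output through sort and uniq turns that list into a count of distinct visitors and a ranking of the busiest ones. The sample log below is invented for illustration and assumes the default Apache format; on a real server you would point the LOG variable (a name chosen here, not anything Apache defines) at your own access log.

```shell
#!/bin/sh
# Hypothetical sample log in the default Apache format; on a real server,
# point LOG at your own access log instead (e.g. /etc/httpd/logs/access_log).
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
10.0.0.1 - - [10/Feb/2012:10:00:01 +1100] "GET /index.html HTTP/1.1" 200 5120
10.0.0.2 - - [10/Feb/2012:10:00:05 +1100] "GET /about.html HTTP/1.1" 200 2048
10.0.0.1 - - [10/Feb/2012:10:01:12 +1100] "GET /blog/ HTTP/1.1" 200 8192
10.0.0.3 - - [10/Feb/2012:10:02:30 +1100] "GET /missing HTTP/1.1" 404 512
10.0.0.1 - - [10/Feb/2012:10:03:44 +1100] "GET /index.html HTTP/1.1" 200 5120
EOF

# Count distinct visiting addresses: sort -u drops the duplicates.
# tr strips the padding some versions of wc put around the number.
distinct_hosts=$(cut -d " " -f1 "$LOG" | sort -u | wc -l | tr -d ' ')

# Rank addresses by request count. uniq -c only merges adjacent lines,
# so the input is sorted first; sort -rn then puts the busiest address on top.
ranking=$(cut -d " " -f1 "$LOG" | sort | uniq -c | sort -rn)

# The first line of the ranking is "<count> <address>"; keep the address.
top_host=$(echo "$ranking" | head -n 1 | awk '{print $2}')

echo "Distinct hosts: $distinct_hosts"
echo "Requests per host:"
echo "$ranking"
echo "Busiest host: $top_host"

rm -f "$LOG"
```

On a busy site you would typically add head -n 20 or similar to the end of the ranking pipeline so that only the top visitors are shown.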

So there are some real simple – but remarkably powerful and effective – Linux tips which will give you greater web site performance as well as deliver enormous amounts of data on your visitors and popular pages.

What other sorts of things in Linux are of interest to you? What would you like to hear about?
