What’s in Those Web Server Logs?

Log files are no fun to look at. They are ugly, contain too much information, and lead to massive headaches. Fortunately, these beasts are useful for more than just debugging; they can be used to generate wonderful reports that make sense. Quite a few programs exist to analyze your web server logs, and we would like to bring a few of them into the spotlight.
Before exploring what different analysis software offers, we should first decide what type of data we want to see. Most of the software mentioned supports more than just web server logs, but for now we’ll focus on web server output. Log analysis programs can show everything from a list of IP addresses that connected to your web server to a pie chart showing which files were accessed most. The majority of popular web log analysis tools try to make sense of every piece of data in the logs, and few succeed in making the data readable. Some log file analysis packages cannot distinguish pertinent information from the raw contents of the log file.

Displaying statistics in an aesthetically pleasing manner is a very important virtue. Every once in a while, GUI designers create a new paradigm, setting a standard that everyone attempts to emulate. Arguably, Apple has done this with its OS X desktop environment. Equally arguably, awstats is the log file analysis program that everyone yearns for. But before jumping into the shining example, let’s take a look at all of the contenders.

Webalizer is a popular log analysis tool. Many people prefer it because it is written in C and runs quite fast. The graphics, however, are not optimal. Sure, there are readable charts, which come from the gd graphics library, but they just aren’t as pretty as they could be. The reports produced by webalizer are sufficient for getting a quick glimpse of a few important data points, namely “what pages are accessed” and “how many hits are we getting.” As we will see later (with awstats), there is a wealth of information to be extracted from web logs, and when done properly the information doesn’t have to be overwhelming. Webalizer is nice, but because of its mediocre graphics and limited statistics, we’d rate it 3 out of 5.

Analog, favored by an elite group of die-hard fans, is another worthy contender.
Analog attempts to present everything, but is an example of how to include too much information for normal human consumption. By default, everything is displayed on the same webpage. The only thing that saves Analog is the navigation bar at the top of each section, which lets viewers click on a specific report and jump to that section of the page, allowing for pseudo-ease of navigation. Some of the interesting reports that Analog presents include how many hits come from each country (TLD, actually), the search engine queries that brought visitors to your website, what browsers and operating systems visitors were using, and just about everything else that’s derivable from web server logs. The graphics are a slight improvement over webalizer’s gd-based graphics, but the pie and bar charts still leave a lot to be desired. Because Analog includes lots of useful information, and the navigation isn’t completely unusable, we feel it deserves an apprehensive 4 out of 5.

Summary is a non-free (30 day trial offered) log analysis tool. This package includes all information possible, listing it in a text webpage for users to click on. When you follow a link, for example “Bandwidth Peak,” you are brought to a fairly decent webpage that lists bandwidth usage by time. A small bar graph accompanies each entry, but the graphics in all of Summary are quite minimal. Minimal doesn’t mean bad; quite the contrary, Summary is really decent looking. The overall GUI is cumbersome, however. It takes a great amount of time to browse to each report you wish to see. The cost of Summary is not prohibitive, and the reports are decent, but not awe-inspiring. 4 out of 5.

The sheer audacity of WebTrends’ web log analyzer earns it an honorable mention here. Their website makes bold claims of increasing ROI, and even asserts “This is Complete Web Analysis.” WebTrends is certainly not for companies with small wallets.
The amazing thing is that their software really looks great. The demos available online reflect how great GUI design should look. We were taken aback by the apparent usability of this software, and it seems they even included all information available from web server logs. WebTrends has been around for more than a decade, and plays nicely with IIS. Also a 4 out of 5, but based solely on the impressive website demo of this product.

The grail of log analysis, awstats, is by far the best looking of all free web log analysis tools. Awstats is also the only Perl-based application in the list. The graphics are superb, and the information is presented in an excellent manner. Awstats is truly a remarkable accomplishment. At a glance, users can view all the reports available and navigate seamlessly between them. Many users will be amazed at the amount of detail this program can extract from the log files. Little browser icons and flags for various countries add to this already pleasing GUI. Awstats includes all of the features listed above for other programs, and in a readable format. We have to give it 5 out of 5.

There are countless other log analysis programs, but we’ve tried to include the most popular ones here. In a future article we will explore the other types of log files most of these programs can work on, including mail and FTP.

Compatibility isn’t a great concern. The Apache web server produces logs in a standardized format, known as the NCSA combined log format. IIS’s W3C-conformant format is also supported by most of the analysis programs listed here.
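
To give a feel for what these tools chew through, here is a minimal Python sketch that reads lines in the NCSA combined log format and answers the two quick questions mentioned earlier: which pages are accessed, and how many hits they get. It is not how any of the reviewed programs actually works. The sample log lines, the function names (`parse_line`, `summarize`), and the regular expression are our own illustrative assumptions, and the pattern ignores corner cases such as escaped quotes inside the user-agent string.

```python
import re
from collections import Counter

# One line of NCSA combined log format looks roughly like:
# host ident authuser [date] "request" status bytes "referer" "user-agent"
COMBINED_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED_RE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # The request field is usually "METHOD /path PROTOCOL".
    parts = entry["request"].split()
    entry["path"] = parts[1] if len(parts) >= 2 else None
    # A size of "-" means no body was sent (for example a 304 response).
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


def summarize(lines):
    """Count hits per requested path and add up the bytes sent."""
    hits = Counter()
    total_bytes = 0
    for line in lines:
        entry = parse_line(line)
        if entry is None or entry["path"] is None:
            continue  # skip lines that are not in the expected format
        hits[entry["path"]] += 1
        total_bytes += entry["size"]
    return hits, total_bytes


# Made-up sample data, for illustration only.
SAMPLE = [
    '192.0.2.1 - - [10/Oct/2004:13:55:36 -0700] "GET /index.html HTTP/1.0" '
    '200 2326 "http://example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"',
    '192.0.2.7 - - [10/Oct/2004:13:56:02 -0700] "GET /about.html HTTP/1.0" '
    '200 1024 "-" "Mozilla/4.08 [en] (Win98; I ;Nav)"',
    '192.0.2.1 - - [10/Oct/2004:13:57:10 -0700] "GET /index.html HTTP/1.0" '
    '304 - "-" "Mozilla/4.08 [en] (Win98; I ;Nav)"',
    'this line is not a log entry',
]

if __name__ == "__main__":
    hits, total_bytes = summarize(SAMPLE)
    for path, count in hits.most_common():
        print(f"{count:5d}  {path}")
    print(f"total bytes sent: {total_bytes}")
```

Everything the analysis packages report (countries, browsers, search queries) is derived from these same fields; the real programs simply go much further with the host, referer, and user-agent columns than this sketch does.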