How to run SQL queries against Apache log files on Linux

One of the distinguishing features of Linux is that, under normal circumstances, you should be able to know what is happening and has happened on your system by analyzing one or more system logs. Indeed, system logs are the first resource a system administrator tends to look to while troubleshooting system or application issues. In this article, we will focus on the Apache access log files generated by Apache HTTP web server. We will explore an alternative way of analyzing Apache access logs using asql, an open-source tool that allows one to run SQL queries against the logs in order to view the same information in a more friendly format.

Background on Apache Logs

There are two kinds of Apache logs:

  • Access log: Found at /var/log/apache2/access.log (for Debian) or /var/log/httpd/access_log (for Red Hat). Contains records of every request served by an Apache web server.
  • Error log: Found at /var/log/apache2/error.log (for Debian) or /var/log/httpd/error_log (for Red Hat). Contains records of all error conditions reported by an Apache web server. Error conditions include, but are not limited to, 403 (Forbidden, usually returned after a valid request missing access credentials or insufficient read permissions), and 404 (Not found, returned when the requested resource does not exist).

Although the verbosity of Apache access log file can be customized through Apache's configuration files, we will assume the default format in this article, which is as follows:

Remote IP - Request date - Request type - Response code - Requested resource - Remote browser (may also include operating system)

So a typical Apache log entry looks like:

192.168.0.101 - - [22/Aug/2014:12:03:36 -0300] "GET /icons/unknown.gif HTTP/1.1" 200 519 "http://192.168.0.10/test/projects/read_json/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:30.0) Gecko/20100101 Firefox/30.0"

But what about Apache error log? Since error log entries dealing with particular requests have corresponding entries in the access log (which you can customize), you can use the access log file to obtain more information about error conditions (refer to example 5 for more details).

That being said, please note that access log is a system-wide log file. To find the log files of virtual hosts, you may also need to check their corresponding configuration files (e.g., within /etc/apache2/sites-available/[virtual host name] on Debian).

Installing asql on Linux

asql is written in Perl, and requires two Perl modules: a DBI driver for SQLite and GNU readline.

Install asql on Debian, Ubuntu or their derivatives

asql and its dependencies will automatically be installed with aptitude on Debian-based distributions.

# aptitude install asql

Install asql on Fedora, CentOS or RHEL

On CentOS or RHEL, you will need to enable EPEL repository first, and then run the commands below. On Fedora, proceed to the following commands directly.

# sudo yum install perl-DBD-SQLite perl-Term-ReadLine-Gnu
# wget http://www.steve.org.uk/Software/asql/asql-1.7.tar.gz
# tar xvfvz asql-1.7.tar.gz
# cd asql
# make install

How Does asql Work?

As you can guess from the dependencies listed above, asql converts unstructured plain-text Apache log files into a structured SQLite database, which can be queried using standard SQL commands. This database can be populated with the contents of current and past log files - including compressed rotated logs such as access.log.X.gz. or access_log.old.

First, launch asql from the command line with the following command

# asql

You will be entering asql's built-in shell interface.

Let's type help to list the available commands in the asql shell:

We will begin by loading all the access logs in asql, which can be done with:

asql> load </path/to/apache-access-logs>

In case of Debian, the following command will do:

asql> load /var/log/apache2/access.*

In case of CentOS/RHEL, use this command instead:

asql> load /var/log/httpd/access_log*

When asql finishes loading access logs, we can start querying the database. Note that the database created after loading is "temporary," meaning that if you exit the asql shell, the database will be lost. If you want to preserve the database, you have to save it to a file first. We will see how to do that later (refer to examples 3 and 4).

The database contains a table named logs. The available fields in the logs table can be displayed using the show command:

The .asql hidden file, which is stored in each user's home directory, records the history of the commands that were typed by the user in an asql shell. Thus, you can browse through it using the arrow keys, and repeat previous commands by just pressing ENTER when you find the right one.

SQL Query Examples with asql

Here a few examples of running SQL queries against Apache log files with asql.

Example 1: Listing the request sources / dates and HTTP status codes returned during the month of October 2014.

SELECT source, date, status FROM logs WHERE date >= '2014-10-01T00:00:00' ORDER BY source;

Example 2: Displaying the total size (in bytes) of requests served per client in descending order.

SELECT source,SUM(size) AS Number FROM logs GROUP BY source ORDER BY Number DESC;

Example 3: Saving the database to [filename] in the current working directory.

save [filename]

This allows us to avoid the need for waiting while the log parsing is performed with the load command as shown earlier.

Example 4: Restoring the database in a new asql session after exiting the current one.

restore [filename]

Example 5: Returning error conditions logged in the access file. In this example, we will display all the requests that returned a 403 (access forbidden) HTTP code.

SELECT source,date,status,request FROM logs WHERE status='403' ORDER BY date

This goes to show that although asql only analyzes access logs, we can use the status field of a request to display requests with error conditions.

Summary

We have seen how asql can help us analyze Apache logs and present the results in a user friendly output format. Although you could obtain similar results by using command line utilities such as cat in conjunction with grep, uniq, sort, and wc (to name a few), in comparison asql represents a Swiss army knife due to the fact that it allows us to use standard SQL syntax to filter the logs according to our needs.

Feel free to leave your questions or comments below. Hope it helps.

Subscribe to Xmodulo

Do you want to receive Linux FAQs, detailed tutorials and tips published at Xmodulo? Enter your email address below, and we will deliver our Linux posts straight to your email box, for free. Delivery powered by Google Feedburner.


Support Xmodulo

Did you find this tutorial helpful? Then please be generous and support Xmodulo!

The following two tabs change content below.
Gabriel Cánepa is a GNU/Linux sysadmin and web developer from Villa Mercedes, San Luis, Argentina. He works for a worldwide leading consumer product company and takes great pleasure in using FOSS tools to increase productivity in all areas of his daily work. When he's not typing commands or writing code or articles, he enjoys telling bedtime stories with his wife to his two little daughters and playing with them, the great pleasure of his life.

9 thoughts on “How to run SQL queries against Apache log files on Linux

  1. @John,
    Thanks for your comment!
    Feel free to share this article through your social network profiles.
    All the best.
    Gabriel

  2. Thank you so much!
    It was my problem about a year ago when I didn't know "asql". I've solved my problem with a few bash commands + Python script.
    "asql" is really light and useful.

  3. cool script!

    this is a very clever and useful way to use SQLite. Many times i have stared at log files, wondering how i would gain insight from them in a flexible, ad hoc way. this is it!

  4. @Morteza and @Sean,
    I appreaciate you taking the time to comment on this post. I'm glad you found asql useful. All the best! Gabriel

  5. Why don't you use the 'logger' to log your apps (including apache) in a rsyslog and then query it through SQL queries.
    Those are common tools...

  6. @Trizak,
    Thanks for your comment. Would you mind explaining how you would execute sql queries to rsyslog out of the box?
    Look forward to hearing from you.
    Best regards.

  7. Hi Gabriel... Thanks for this post. I am new to asql, and your article did give me a good start. While I am able to run the asql queries quite easliy now, there is one thing I am still struggling with. How can I save the output of the asql query directly to a csv file? Can I instruct the asql query to output the results directly into a csv file? pls help and provide a few examples. Thanks!!!

Leave a comment

Your email address will not be published. Required fields are marked *