Log Aggregation with ELK stack and Spring Boot

Introduction

To be able to search our logs based on a key/value pattern, we need to prepare our application to log information in a structured way and send it to our log aggregation tool.

In this article I am going to show you how to send structured logs to Elasticsearch using Logstash as a data pipeline tool, and how to visualize and filter log information using Kibana.

According to a definition from the Wikipedia website:
Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents.
According to the Elastic website, Elasticsearch is the heart of the Elastic Stack, which centrally stores your data for lightning fast search. The use of Elasticsearch, Kibana, Beats and Logstash as a search platform is commonly known as the ELK stack.

Next, we are going to start Elasticsearch, Kibana and Logstash using Docker so we can better understand and visualize what this is all about.

Running the ELK stack using Docker


Before running the containers for each component of the stack, let's first take a few minutes to better understand what we are doing and what the role of each service in the ELK stack is.

As we mentioned before, Elasticsearch is the heart of the system, where all the information is stored using Apache Lucene as the text indexing and search engine. But what about Logstash? Logstash is responsible for ingesting data from different sources and applying any required transformations before sending it to the "stash", which in our case is Elasticsearch itself. Kibana is the UI tool used to search the indexed information. It is the part of the system that we, as users of the platform, see and interact with when looking for logged information from our application.

So, let's run our ELK stack containers now to see all this in action.
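The compose file itself isn't listed in the text, so the following is a minimal sketch of what a docker-compose.yml for the three services could look like; the image versions, heap settings and volume paths are assumptions you should adapt to your own setup (the logs volume is discussed later in the article).

version: "3.8"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.9
    environment:
      - discovery.type=single-node          # single-node cluster for local development
      - ES_JAVA_OPTS=-Xms512m -Xmx512m      # keep the heap small on a developer machine
    ports:
      - "9200:9200"
  kibana:
    image: docker.elastic.co/kibana/kibana:7.17.9
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch
  logstash:
    image: docker.elastic.co/logstash/logstash:7.17.9
    volumes:
      # Pipeline definition created later in this article
      - ./elk/pipeline:/usr/share/logstash/pipeline/
      # Log file written by the Spring Boot application (use your own absolute path)
      - /path/to/your/project/logs:/usr/share/logstash/logs/
    depends_on:
      - elasticsearch

With the file in place, start everything with: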
docker-compose up -d
If everything went well, you should be able to run the docker-compose ps command and find all your containers in an Up state. You can always run docker-compose logs to check the output of all services and see whether any errors happened during their startup.

Reaching the Elasticsearch API

With all services up and running, it's time to check that the basic functionality of each service is working. Let's first verify that we can reach the Elasticsearch service at the following URL:
http://localhost:9200
You should see a response from the Elasticsearch API showing the famous tagline "You Know, for Search", demonstrating that the service is up and running.
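The response looks roughly like the JSON below; the node name, cluster id and version values will differ on your machine, and several fields are omitted here:

{
  "name" : "your-node-name",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "some-generated-uuid",
  "version" : {
    "number" : "7.17.9"
  },
  "tagline" : "You Know, for Search"
}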
Next, go to the following URL to access the Kibana home page:
http://localhost:5601/app/home/
The last thing we need to check for our system to work is Logstash. We need to know whether the service was able to create the logstash index that we need in order to access our logs from Kibana. We can check all Elasticsearch indices by visiting the following URL:
http://localhost:9200/_cat/indices
If everything was configured correctly, we should see a listing that includes our logstash index.
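The exact listing isn't reproduced here; appending ?v to the URL prints the column headers, and the line to look for is the one whose index name starts with logstash-, along these lines (names, document counts and sizes will differ):

yellow open logstash-2023.01.01 aBcDeFgHiJkLmNoPqRsTuV 1 1 5 0 21.4kb 21.4kb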

Creating the Spring Boot application


Go to the Spring Initializr website and add the initial dependencies to the project; the only starter the rest of this article relies on is Spring Web:
https://start.spring.io/
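The dependency list from the original screenshot isn't reproduced here; for reference, the dependencies block of the generated build.gradle only needs something like the sketch below (the logstash-logback-encoder dependency is added later in the article):

dependencies {
    // Spring MVC and the embedded Tomcat server; this starter also brings in
    // Logback as the default logging setup used in the next section
    implementation 'org.springframework.boot:spring-boot-starter-web'
    testImplementation 'org.springframework.boot:spring-boot-starter-test'
}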

Using Logback


According to the project website, Logback is intended as a successor to the log4j project, and since we are using the Spring Boot Starter Web module, Logback is already configured as the default logging framework. We just need to define the logging level and the log file name in the application.properties file.
logging.level.org.springframework.web=INFO
logging.file.name=logs/application.log

Creating a simple REST Controller


Let's now create a simple REST controller to log all customer calls.
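The original listing isn't included in the text, so here is a minimal sketch of such a controller. The endpoint path matches the /api/customers/{id} URL used later in the article, while the package name and response body are assumptions; logging is added in the next section.

package com.example.customers;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/customers")
public class CustomerController {

    // Returns a simple representation of the requested customer.
    @GetMapping("/{customerId}")
    public String getCustomer(@PathVariable Long customerId) {
        return "Customer " + customerId;
    }
}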

Adding structured logging


If we want to be able to search our logs based on key-value properties, we need to add some kind of structure to our logs. The logstash-logback-encoder library, which we will add below, provides a factory class called StructuredArguments for creating structured log arguments. It works pretty much like a map: we just call its keyValue method, passing the key and the value we want to log in a structured way.

Add the following code to the CustomerController to get a Logger instance:
private final Logger logger = LoggerFactory.getLogger(this.getClass());

To be able to use the methods of the StructuredArguments class, we need to add the following dependency to our build.gradle file:
implementation 'net.logstash.logback:logstash-logback-encoder:7.3'

With that added to our build file, we have everything we need to start using the methods of the class. I am going to use the keyValue method here, but the class also provides kv, a shorter alias that simply delegates to keyValue.

The final version of the CustomerController class should look like this:
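Since the original listing isn't reproduced in the text, the class below is a sketch combining the steps above; the package name and response body are assumptions:

package com.example.customers;

import net.logstash.logback.argument.StructuredArguments;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/customers")
public class CustomerController {

    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    @GetMapping("/{customerId}")
    public String getCustomer(@PathVariable Long customerId) {
        // keyValue renders as customerId=<value> in the plain log file,
        // which is what we will later search for in Kibana.
        logger.info("Customer request received: {}", StructuredArguments.keyValue("customerId", customerId));
        return "Customer " + customerId;
    }
}

With the application running, a request to /api/customers/1 should append a line containing customerId=1 to logs/application.log.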

Adding the Logstash pipeline file to the project


Next, we need to add the Logstash file with the configuration of the three stages of the pipeline: input, filter and output. This file represents our data pipeline, describing where the data comes from and where we send it after the transformations applied by the filters.

Create an elk folder in the root of your project containing a pipeline folder.
Inside the pipeline folder, add the following logstash.conf file:
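The original file isn't reproduced in the text, so the configuration below is a sketch: the kv filter (which splits key=value pairs such as customerId=1 out of the message into their own fields), the stdout output and the index name are assumptions you can adapt.

input {
  file {
    # The application log file mounted into the container by docker-compose
    path => "/usr/share/logstash/logs/application.log"
    start_position => "beginning"
  }
}

filter {
  kv {
    # Extract key=value pairs from the log line into separate fields
    source => "message"
  }
}

output {
  elasticsearch {
    # Replace with the IP address of your own machine (see below)
    hosts => ["http://192.168.0.10:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
  # Also print processed events to the container output so they can be
  # inspected with docker-compose logs logstash
  stdout {
    codec => rubydebug
  }
}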


In the output section, change the IP address in the "hosts" property of the elasticsearch block to point to your own machine. You can find your network interfaces by running the following command on macOS or Linux (on Windows, use ipconfig):
ifconfig
Before creating an index pattern in the Kibana UI, there is one more thing we need to do: change the permissions of the application.log file. If we don't, Logstash won't be able to read the file and our logstash index won't be created. To do that, go inside the logs directory and run the following command in your terminal.
chmod 777 ./application.log
Also notice that in the logstash volumes section of the docker-compose file, I am mounting the project's logs directory inside the logs directory of the logstash container, using the absolute path to my folder. You need to change that path to match yours.
- /Users/cjafet/Downloads/elastic-stack/logs:/usr/share/logstash/logs/

Creating the index pattern in the Kibana UI


We can now open the Kibana Index Patterns page, found under Stack Management in the Kibana UI.
You should see a page similar to the one below. Go ahead and click the "Create index pattern" button:

In the text field of the index creation page, type logstash-* to create an index pattern that matches all of our Logstash indices. You should see a page similar to this one:
On the next page, select @timestamp as the primary time field and click the Create index pattern button to create the index pattern.
With the Spring Boot application running, go to the browser and access the following endpoint to generate our first log.
http://localhost:8080/api/customers/1
After a few minutes we should be able to see activity in the logstash container logs (for example via docker-compose logs logstash), similar to the logs below:
To filter the logs using Kibana, go to the Discover page at the following URL, where we can see all logs for the selected date range:
http://localhost:5601/app/discover#/
Here we can see that we got requests from two different customers. Since we structured our log messages using the StructuredArguments factory class, we can now take advantage of that to search logs based on the attributes available in our log messages. In the following image, we are searching for all logs that contain the key-value pair customerId=2.
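For illustration, if the kv filter from the pipeline sketch above extracted the pairs into their own fields, a KQL query in the Discover search bar could be as simple as:

customerId : 2

The exact field name depends on how your pipeline parses the log line; if the pairs are left inside the raw message, searching the message field for the customerId=2 text works as a fallback.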

Conclusion


There is much more we can do with the ELK stack. If you want to send data from multiple servers, you might want to take a look at Beats. According to the Elastic website, "Beats is a free and open platform for single-purpose data shippers. They send data from hundreds or thousands of machines and systems to Logstash or Elasticsearch".
