Log Aggregation with ELK stack and Spring Boot

Introduction

In order to be able to search our logs based on a key/value pattern, we need to prepare our application to log and send information in a structured way to our log aggregation tool.

In this article I am going to show you how to send structured logs to Elasticsearch using Logstash as a data pipeline tool, and how to visualize and filter log information using Kibana.

According to a definition from the Wikipedia website:
Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents.
According to the Elasticsearch platform website, Elasticsearch is the heart of the Elastic Stack, which centrally stores your data for lightning-fast search. The use of Elasticsearch, Kibana, Beats and Logstash as a search platform is commonly known as the ELK stack.

Next, we are going to start up Elasticsearch, Kibana and Logstash using Docker so we can better understand and visualize how all of this fits together.

Running the ELK stack using Docker


Before running the containers for each component of the stack, let's take a few minutes to understand what we are doing and what role each service plays in the ELK stack.

As we mentioned before, Elasticsearch is the heart of the system, where all the information is stored, using Apache Lucene as the text indexing and search engine. But what about Logstash? Logstash is responsible for ingesting data from different sources and applying any required transformations before sending it to the "stash", which in our case is Elasticsearch itself. Kibana is the UI tool used to search the indexed information. It is the part of the system that we, as users of the platform, see and interact with when looking for information logged by our application.

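The docker-compose.yml file itself is not reproduced in this copy of the post, but a minimal sketch that matches the ports and volume mounts used in the rest of the article could look like the one below. The image versions, the single-node setting and the pipeline mount path are assumptions you should adapt to your own setup; the logs path is the author's absolute path mentioned later in the article.

version: "3"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.9
    environment:
      # single-node mode is enough for a local demo
      - discovery.type=single-node
    ports:
      - "9200:9200"
  kibana:
    image: docker.elastic.co/kibana/kibana:7.17.9
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch
  logstash:
    image: docker.elastic.co/logstash/logstash:7.17.9
    volumes:
      # pipeline configuration (see the elk/pipeline folder created later)
      - ./elk/pipeline/:/usr/share/logstash/pipeline/
      # the application log file produced by our Spring Boot app
      - /Users/cjafet/Downloads/elastic-stack/logs:/usr/share/logstash/logs/
    depends_on:
      - elasticsearch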
So, let's run our ELK stack containers now to see all of this in action.
docker-compose up -d
If everything went well, running the docker-compose ps command should show all your containers in an Up state. You can always run docker-compose logs to check the output of all services and see whether any errors happened during their startup.

Reaching the Elasticsearch API

With all services up and running, it's time to check that the basic functionality of each service is working fine. Let's first check if we can reach the Elasticsearch service at the following URL:
http://localhost:9200
You should be able to see the following response from the Elasticsearch API showing the famous tagline "You Know, for Search", demonstrating that the service is up and running fine:
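The exact values depend on your cluster name and image version, but the response has roughly this shape (trimmed here for brevity):

{
  "name" : "...",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "...",
  "version" : {
    "number" : "7.17.9",
    ...
  },
  "tagline" : "You Know, for Search"
}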
Next, go to the following URL to access the Kibana home page:
http://localhost:5601/app/home/
The last thing we need to check is Logstash. We need to know whether the service was able to create the logstash index that we need in order to access our logs from Kibana. We can list all Elasticsearch indices by visiting the following URL:
http://localhost:9200/_cat/indices
If everything was configured correctly, we should see output similar to the following, listing our logstash index.
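The values below are purely illustrative (the index name carries the date it was created on, and the document count and sizes will differ), but the line for our index should look roughly like this:

yellow open logstash-2023.03.18 ... 1 1 2 0 12.5kb 12.5kb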

Creating the Spring Boot application


Go to the Spring Initializr website and add the following initial dependencies to the project:
https://start.spring.io/
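The screenshot with the chosen dependencies is not reproduced here; at a minimum the project needs the Spring Web starter, which the rest of the article relies on. In the generated build.gradle (the article uses Gradle, as the build.gradle file mentioned later shows), the dependencies block would look something like this:

dependencies {
    // Spring MVC + embedded Tomcat; also pulls in Logback via spring-boot-starter-logging
    implementation 'org.springframework.boot:spring-boot-starter-web'
    testImplementation 'org.springframework.boot:spring-boot-starter-test'
}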

Using Logback


According to the project website, Logback is intended as a successor to the log4j project. Since we are using the Spring Boot Starter Web module, we already have Logback configured for logging; we just need to define the logging level and the log file name in the application.properties file.
logging.level.org.springframework.web=INFO
logging.file.name=logs/application.log

Creating a simple REST Controller


Let's now create a simple REST controller so we can log all customer calls.
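The controller code is embedded as a snippet in the original post; a minimal sketch of it, assuming a single endpoint that returns the requested customer id (the response body is just a placeholder), could be:

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/customers")
public class CustomerController {

    // Returns a placeholder payload for the requested customer;
    // logging is added in the next section.
    @GetMapping("/{id}")
    public ResponseEntity<String> getCustomer(@PathVariable Long id) {
        return ResponseEntity.ok("Customer " + id);
    }
}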

Adding structured logging


If we want to be able to search our logs based on key-value properties, we need to add some kind of structure to our log messages. The logstash-logback-encoder library provides a factory class called StructuredArguments for creating structured log arguments. It works pretty much like a map: we just call the keyValue method of the class, passing the key and value we want to log in a structured way.

Add the following code to the CustomerController to get a Logger instance:
private final Logger logger = LoggerFactory.getLogger(this.getClass());

To be able to use the methods of the StructuredArguments class, we need to add the following dependency to our build.gradle file:
implementation 'net.logstash.logback:logstash-logback-encoder:7.3'

With that dependency in place, we have everything we need to start using the methods of the class. I am going to use the keyValue method here, but the class also provides kv, a shorter alias that simply delegates to keyValue.

The final version of the CustomerController class should look like this:
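The final class isn't reproduced in this copy of the post; putting the logger and the structured argument together, a sketch of it (still assuming the placeholder response body from before) would be:

import net.logstash.logback.argument.StructuredArguments;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/customers")
public class CustomerController {

    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    @GetMapping("/{id}")
    public ResponseEntity<String> getCustomer(@PathVariable Long id) {
        // keyValue renders as customerId=<id> in the plain-text log line,
        // which is what we will later filter on in Kibana
        logger.info("Customer request received: {}", StructuredArguments.keyValue("customerId", id));
        return ResponseEntity.ok("Customer " + id);
    }
}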

Adding Logstash pipeline file to the project


Next, we need to add the Logstash configuration file with the three stages of the pipeline: input, filter and output. This file represents our data pipeline, describing where the data comes from and where we send it after applying the filter transformations.

Create an elk folder in the root of your project with the following structure:
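The folder layout is shown as a screenshot in the original post; a layout consistent with the rest of the article, with a pipeline folder holding the Logstash configuration, would be:

elk/
└── pipeline/
    └── logstash.conf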
Inside the pipeline folder, add the following logstash.conf file to it:
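The configuration file itself is also embedded in the original post; a sketch that matches the volume mounts and the logstash-* index pattern used in this article (the IP address is a placeholder, and the empty filter stage is an assumption, since the article does not detail any filters) would be:

input {
  file {
    # the application.log file mounted into the container via docker-compose
    path => "/usr/share/logstash/logs/application.log"
    start_position => "beginning"
  }
}

filter {
  # add grok/mutate filters here if you need to parse the log line further
}

output {
  elasticsearch {
    # change this to the IP address of your own machine
    hosts => ["192.168.0.10:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
}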


In the output section, change the IP address in the "hosts" property of the elasticsearch block to point to your own machine. You can list your network interfaces by running the following command on macOS (on Windows, the equivalent is ipconfig):
ifconfig
Before creating an index pattern in the Kibana UI, there is one more thing we need to do: change the permissions of the application.log file. If we don't, Logstash won't be able to read the file and our logstash index won't be created either. To do that, go into the logs directory and run the following command in your terminal.
chmod 777 ./application.log
Also notice that in the logstash volumes section of the docker-compose file, I am mounting the project's logs directory into the logs directory of the logstash container using the absolute path to my folder. You need to change that path to match yours.
- /Users/cjafet/Downloads/elastic-stack/logs:/usr/share/logstash/logs/

Creating the index pattern in Kibana UI


We can now open the Index Patterns page in the Kibana UI (under Stack Management). Go ahead and click the "Create index pattern" button:

In the text field of the index pattern creation page, type logstash-* to create an index pattern that can match multiple data sources.
On the next page, select @timestamp as the primary time field and click the Create index pattern button to create the index pattern.
With the Spring Boot application running, go to the browser and access the following endpoint to generate our first log.
http://localhost:8080/api/customers/1
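You can also hit the endpoint from the command line; requesting a second customer as well gives us the two customers we will filter on later:

curl http://localhost:8080/api/customers/1
curl http://localhost:8080/api/customers/2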
After a few minutes, we should be able to see the new log entries being processed in the logstash container logs.
To filter the logs using Kibana, go to the following URL, where you can see all logs from the selected date range:
http://localhost:5601/app/discover#/
Here we can see that we got requests from two different customers. Since we have structured our log messages using the StructuredArguments factory class, we can now take advantage of that to search logs based on the attributes available in them. For example, we can search for all logs that contain the key-value pair customerId=2 in the log message.
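In the Kibana search bar this can be written as a simple KQL query against the message field (the field name assumes the default Logstash file input, which stores the raw log line in message):

message : "customerId=2"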

Conclusion


There is much more we can do with the ELK stack. If you want to be able to send data from multiple servers, you might want to take a look at Beats. According to the Elastic website, "Beats is a free and open platform for single-purpose data shippers. They send data from hundreds or thousands of machines and systems to Logstash or Elasticsearch".
