Investigating with ELK 101

Investigate VPN logs through ELK.

Elastic Stack Overview

The Elastic Stack is a collection of open-source components linked together to help users take data from any source, in any format, and search, analyze, and visualize it in real time.

Elasticsearch

Elasticsearch is a full-text search and analytics engine used to store JSON-formatted documents. It analyzes the stored data, performs correlations on it, and supports a RESTful API for interaction.
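As a quick illustration of that API, the request below is a minimal sketch; it assumes a local Elasticsearch instance listening on the default port 9200 and uses the vpn_connections index referenced later in this room:

    # Search the vpn_connections index over the Elasticsearch REST API
    curl -X GET "localhost:9200/vpn_connections/_search?q=UserName:Suleman&pretty"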

Logstash

Logstash is a data processing engine used to take data from different sources, apply filters to it or normalize it, and then send it to a destination such as Kibana or a listening port. A Logstash configuration file is divided into three parts, as shown below.

The input part is where the user defines the source from which the data is ingested. Logstash supports many input plugins, as shown in the reference documentation: https://www.elastic.co/guide/en/logstash/8.1/input-plugins.html

The filter part is where the user specifies filter options to normalize the logs ingested above. Logstash supports many filter plugins, as shown in the reference documentation: https://www.elastic.co/guide/en/logstash/8.1/filter-plugins.html

The output part is where the user defines the destination for the filtered data. It can be a listening port, the Kibana interface, the Elasticsearch database, a file, etc. Logstash supports many output plugins, as shown in the reference documentation: https://www.elastic.co/guide/en/logstash/8.1/output-plugins.html
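Putting the three parts together, a minimal pipeline could look like the sketch below. This is an illustrative example rather than the configuration used in this room: it assumes a Beats agent shipping logs on port 5044, a local Elasticsearch instance, and a Source_ip field to enrich.

    input {
      # Receive events shipped by Beats agents on port 5044
      beats {
        port => 5044
      }
    }
    filter {
      # Enrich each event with geolocation data derived from the source IP
      geoip {
        source => "Source_ip"
      }
    }
    output {
      # Store the normalized events in a local Elasticsearch index
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "vpn_connections"
      }
    }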

Beats

Beats are host-based agents, known as data shippers, that are used to ship/transfer data from the endpoints to Elasticsearch. Each Beat is a single-purpose agent that sends specific data to Elasticsearch.
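Each Beat is configured through a YAML file. The snippet below is a minimal Filebeat sketch; the log path is a hypothetical example, not a value from this room:

    # filebeat.yml - ship log files directly to Elasticsearch
    filebeat.inputs:
      - type: filestream
        id: vpn-logs
        paths:
          - /var/log/vpn/*.log
    output.elasticsearch:
      hosts: ["localhost:9200"]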

Kibana

Kibana is a web-based data visualization tool that works with Elasticsearch to analyze, investigate, and visualize the data stream in real time. It allows users to create multiple visualizations and dashboards for better visibility.

How they work together:

  • Beats is a set of different data-shipping agents used to collect data from multiple endpoints; for example, Winlogbeat collects Windows event logs, while Packetbeat collects network traffic flows.

  • Logstash collects data from Beats, ports, files, etc., parses/normalizes it into field-value pairs, and stores it in Elasticsearch.

  • Elasticsearch acts as a database used to search and analyze the data.

  • Kibana is responsible for displaying and visualizing the data stored in Elasticsearch. The data stored in Elasticsearch can easily be shaped into different visualizations, time charts, infographics, etc., using Kibana.

Kibana Overview

The Discover tab within Kibana contains the logs being ingested manually or in real time, the time chart, the normalized fields, etc. This tab is mostly used to search/investigate the logs using the search bar and filter options.

Some key information available in the Discover interface is listed below:

  1. Logs (document): Each log here is also known as a single document containing information about the event. It shows the fields and values found in that document.

  2. Fields pane: The left panel of the interface shows the list of fields parsed from the logs. We can click on any field to add it to the filter or remove it from the search.

  3. Index Pattern: Lets the user select an index pattern from the available list.

  4. Search bar: A place where the user adds search queries / applies filters to narrow down the results.

  5. Time Filter: We can narrow down results based on the time duration. This tab has many options to select from to filter/limit the logs.

  6. Time Interval: This chart shows the event counts over time.

  7. Top Bar: This bar contains various options, such as saving the current search, opening saved searches, and sharing the search.

Each important element found in the Discover tab is briefly explained below:

Time Filter

The time filter allows applying a time-based filter to the logs, with many options to choose from.

Quick Select

The Quick Select tab provides multiple time ranges to select from. The Refresh every option at the end allows setting an interval at which the logs are refreshed continuously. If it is set to 5 seconds, the logs will refresh automatically every 5 seconds.

Timeline

The timeline pane provides an overview of the number of events that occurred over time. Select a bar to show only the logs from that specific period. The count at the top left displays the number of documents/events found in the selected time range. This bar is also helpful in identifying spikes in the logs.

Index Pattern

Kibana, by default, requires an index pattern to access the data stored in, or being ingested into, Elasticsearch. The index pattern tells Kibana which Elasticsearch data to explore. Each index pattern corresponds to certain defined properties of the fields, and a single index pattern can point to multiple indices.
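For example, a hypothetical index pattern vpn_connections* would match the vpn_connections index as well as any other index whose name begins with that prefix.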

Each log source has a different log structure; therefore, when logs are ingested into Elasticsearch, they are first normalized into corresponding fields and values by creating a dedicated index pattern for the data source.

Left Panel - Fields

The left panel of the Kibana interface shows the list of normalized fields found in the available documents/logs. Click on any field, and it will show the top 5 values and their percentage of occurrence.

These values can be used to apply filters: clicking the + button adds a filter to show only the logs containing that value, while the - button filters the results to show only those that do not contain it.

Add Filter Option

The Add filter option under the search bar allows applying a filter on the fields.

Create Table

By default, the documents are shown in raw form. Click on any document and select the important fields to create a table showing only those fields. This method reduces the noise and makes the output more presentable and meaningful.

Save the table format once it is created. It will then show the same fields every time a user logs into the dashboard.

Select the index vpn_connections and filter from 31st December 2021 to 2nd Feb 2022. How many hits are returned?

Which IP address has the max number of connections?

Which user is responsible for max traffic?

Create a table with the fields IP, UserName, Source_Country and save.

Apply Filter on UserName Emanda; which SourceIP has max hits?

On 11th Jan, which IP caused the spike observed in the time chart?

How many connections were observed from IP 238.163.231.224, excluding the New York state?

KQL Overview

KQL (Kibana Query Language) is a search query language used to query the logs/documents ingested into Elasticsearch. Kibana also supports the Lucene query syntax, which can be used by disabling the KQL option in the search bar.

With KQL, logs can be searched in two different ways:

  • Free text search

  • Field-based search

Free text search allows users to search for logs based on text only, meaning a simple search of the term security will return all the documents that contain this term, irrespective of the field.

One of the fields in the index, Source_Country, contains the list of countries from which the VPN connections originated.

Searching for the text United States in the search bar returns all the logs that contain this term, regardless of the field it appears in. This search returned 2304 hits. Note that specifying only the term United will not work, because KQL looks for the whole term/word in the documents.

Wildcard

KQL allows the wild card * to match parts of the term/word.

Using the wildcard with the term United returns all the results containing the term United followed by any other term. If the term United Nations had been logged, it would also have been returned as a result of this wildcard:
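Search Query: United*

Explanation: Returns all documents containing a term that begins with United, such as United States or United Nations.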

Logical Operators (AND | OR | NOT)

KQL allows utilizing the logical operators in the search query.

1- OR Operator

To show logs containing either United States or England, use the OR operator:
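Search Query: "United States" OR "England"

Explanation: Returns all documents that contain either the term United States or the term England.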

2- AND Operator

To return logs containing both the terms United States and Virginia, use the AND operator:
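Search Query: "United States" AND "Virginia"

Explanation: Returns only the documents that contain both terms.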

3- NOT Operator

Use the NOT operator to remove a particular term from the search results.
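For example, to show United States logs while excluding those that mention Virginia (Virginia is an illustrative choice here, continuing the previous example):

Search Query: "United States" AND NOT "Virginia"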

In a field-based search, provide the field name and the value being looked for in the logs. This search uses the syntax FIELD : VALUE, with a colon (:) as the separator between the field and the value.

Search Query: Source_ip : 238.163.231.224 AND UserName : Suleman

Explanation: This tells Kibana to display all the documents in which the field Source_ip contains the value 238.163.231.224 and the field UserName contains the value Suleman.

Reference to explore the other options of KQL: https://www.elastic.co/guide/en/kibana/7.17/kuery-query.html

Create a search query to filter the logs where Source_Country is the United States and the user is James or Albert. How many results were returned?
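One way to write this query (a sketch; it assumes the UserName values match these names exactly):

Search Query: Source_Country : "United States" AND (UserName : "James" OR UserName : "Albert")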

As User Johny Brown was terminated on 1st January 2022, create a search query to determine how many times a VPN connection was observed after his termination.
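A starting point for this one (a sketch; combine it with the time filter set to on or after 1st January 2022):

Search Query: UserName : "Johny Brown"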

Creating Visualizations

The Visualization tab allows visualizing the data in different forms: tables, pie charts, bar charts, etc.

Create Visualization

One of the ways to get to the Visualization tab is to click on any field in the Discover tab and then click on the Visualize button.

Correlation Option

Often, creating correlations between multiple fields is required. Dragging the required field into the middle pane creates a correlation in the Visualization tab.

A table can also be created to show the values of the selected fields as columns.

The most important step in creating these visualizations is saving them. Click on the Save option on the right side and fill in the descriptive values. These visualizations can be added to existing dashboard(s) or to a new one.

Steps to take after creating Visualizations:

  • Create a visualization and click on the Save button at the top right corner.

  • Add the title and description to the visualization.

  • Add the visualization to any existing dashboard or to a new dashboard.

  • Click Save and add to the library when it's done.

Failed Connection Attempts

Utilize the knowledge gained above to create a table to display the user and the IP address involved in failed attempts.

Which user was observed with the greatest number of failed attempts?

Simon

How many wrong VPN connection attempts were observed in January?

274

Creating Dashboards

A user can create multiple dashboards tailored to specific needs, providing better visibility into the collected logs.

Creating Custom Dashboard

The steps to create a dashboard are:

  • Go to the Dashboard tab and click on Create dashboard.

  • Click on Add from Library.

  • Click on the visualizations and saved searches to add them to the dashboard.

  • Once the items are added, adjust them accordingly, as shown below.

  • Don't forget to save the dashboard after completing it.
