How to Forward Logs to Elasticsearch
Log data is the silent witness to every system operation, application behavior, and security event within modern digital infrastructures. From web servers and databases to containerized microservices and cloud-native platforms, logs generate vast volumes of structured and unstructured information that, when properly collected and analyzed, become invaluable for troubleshooting, performance optimization, compliance, and threat detection. However, raw log files scattered across hundreds of servers are nearly impossible to manage manually. This is where Elasticsearch comes in.
Elasticsearch, part of the Elastic Stack (formerly known as the ELK Stack), is a powerful, distributed search and analytics engine designed to store, index, and retrieve massive datasets in near real time. When paired with log forwarders like Filebeat, Fluentd, or Logstash, Elasticsearch becomes the central nervous system of your observability strategy. Forwarding logs to Elasticsearch enables centralized logging, powerful querying, visual dashboards, and automated alerting, transforming chaotic log streams into actionable intelligence.
This guide provides a comprehensive, step-by-step walkthrough on how to forward logs to Elasticsearch. Whether you're managing a small on-premises environment or a large-scale Kubernetes cluster, this tutorial covers the core concepts, practical configurations, industry best practices, essential tools, real-world examples, and answers to frequently asked questions, all designed to help you implement a robust, scalable, and secure log forwarding pipeline.
Step-by-Step Guide
1. Understand the Log Forwarding Architecture
Before configuring any tool, it's critical to understand the typical architecture of log forwarding to Elasticsearch. A standard pipeline consists of three components:
- Log Source: Applications, servers, containers, or network devices generating logs (e.g., Apache access logs, systemd journal, Docker containers).
- Log Forwarder/Collector: A lightweight agent that tails log files, reads from system streams, or receives logs via network protocols and ships them to Elasticsearch.
- Elasticsearch Cluster: The centralized storage and indexing engine that receives, processes, and stores log data for search and analysis.
Often, a middle component called Logstash is inserted between the forwarder and Elasticsearch for parsing, filtering, and enriching logs. However, modern deployments increasingly favor lightweight forwarders like Filebeat or Fluentd that can send data directly to Elasticsearch, reducing complexity and resource overhead.
2. Install and Configure Elasticsearch
Before forwarding logs, ensure Elasticsearch is properly installed and accessible. You can deploy Elasticsearch on-premises, in a private cloud, or use a managed service like Elastic Cloud.
Option A: Self-Hosted Elasticsearch (Linux)
Download and install Elasticsearch from the official repository:
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.12.0-linux-x86_64.tar.gz
tar -xzf elasticsearch-8.12.0-linux-x86_64.tar.gz
cd elasticsearch-8.12.0
Edit the configuration file config/elasticsearch.yml:
cluster.name: my-logging-cluster
node.name: node-1
network.host: 0.0.0.0
http.port: 9200
discovery.type: single-node
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
Generate certificates for secure communication:
bin/elasticsearch-certutil ca
bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12
Move the generated certificates to the config/certs directory and update elasticsearch.yml with:
xpack.security.transport.ssl.certificate: certs/node-1.crt
xpack.security.transport.ssl.key: certs/node-1.key
xpack.security.transport.ssl.certificate_authorities: [ "certs/ca.crt" ]
Start Elasticsearch:
bin/elasticsearch
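Once the node starts, a quick health check should respond with the cluster status (the password comes from your security setup; if you have also enabled TLS on the HTTP layer, switch to https and pass --cacert):

```shell
curl -u elastic:your-password "http://localhost:9200/_cluster/health?pretty"
```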
Option B: Elastic Cloud (Managed)
If using Elastic Cloud, create a deployment via the web interface. Once deployed, note the following details from the deployment dashboard:
- Elasticsearch endpoint (e.g., https://your-deployment-id.us-central1.gcp.cloud.es.io:9243)
- Username and password (or API key)
- CA certificate (download as PEM file)
3. Choose and Install a Log Forwarder
There are several tools to forward logs to Elasticsearch. The most popular are Filebeat, Fluentd, and Logstash. Each has strengths depending on your use case.
Filebeat: Lightweight and Ideal for File-Based Logs
Filebeat is a lightweight, Go-based log shipper developed by Elastic. It's perfect for reading log files from disk and sending them directly to Elasticsearch or Logstash.
Install Filebeat on your log source server:
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-8.12.0-linux-x86_64.tar.gz
tar -xzf filebeat-8.12.0-linux-x86_64.tar.gz
cd filebeat-8.12.0
Edit filebeat.yml:
filebeat.inputs:
  - type: filestream
    enabled: true
    paths:
      - /var/log/nginx/access.log
      - /var/log/nginx/error.log
      - /var/log/syslog

output.elasticsearch:
  hosts: ["https://your-elasticsearch-host:9243"]
  username: "filebeat_system"
  password: "your-secure-password"
  ssl.certificate_authorities: ["/etc/filebeat/certs/ca.crt"]
  ssl.verification_mode: "full"
Copy the Elasticsearch CA certificate to /etc/filebeat/certs/ca.crt.
Test the configuration:
./filebeat test config
./filebeat test output
Start Filebeat:
sudo ./filebeat -e
For systemd-based systems, install Filebeat from the Elastic APT or YUM repository instead of the tarball so a service unit is provided, then enable it:
sudo systemctl enable filebeat
sudo systemctl start filebeat
Fluentd: Flexible and Extensible for Complex Environments
Fluentd is a popular open-source data collector with a rich plugin ecosystem. It's ideal for environments requiring advanced parsing, filtering, and routing of logs from multiple sources (e.g., Docker, Kubernetes, systemd).
Install Fluentd via RubyGems or package manager:
curl -L https://toolbelt.treasuredata.com/sh/install-debian-bullseye-td-agent4.sh | sh
Configure /etc/td-agent/td-agent.conf:
<source>
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/td-agent/nginx-access.log.pos
  tag nginx.access
  <parse>
    @type nginx
  </parse>
</source>

<match **>
  @type elasticsearch
  host your-elasticsearch-host
  port 9243
  scheme https
  ssl_verify true
  ca_file /etc/td-agent/certs/ca.crt
  user filebeat_system
  password your-secure-password
  logstash_format true
  logstash_prefix nginx-logs
  <buffer>
    flush_interval 10s
  </buffer>
</match>
Install the Elasticsearch plugin if needed:
td-agent-gem install fluent-plugin-elasticsearch
Restart Fluentd:
sudo systemctl restart td-agent
Logstash: Advanced Processing Layer
Logstash is a server-side data processing pipeline that ingests logs from multiple sources, transforms them, and sends them to Elasticsearch. It's powerful but resource-intensive, and best used when complex parsing (e.g., grok patterns, GeoIP enrichment) is required.
Install Logstash:
wget https://artifacts.elastic.co/downloads/logstash/logstash-8.12.0-linux-x86_64.tar.gz
tar -xzf logstash-8.12.0-linux-x86_64.tar.gz
cd logstash-8.12.0
Create a configuration file at config/logstash.conf:
input {
  file {
    path => "/var/log/nginx/access.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  geoip {
    source => "clientip"
  }
}

output {
  elasticsearch {
    hosts => ["https://your-elasticsearch-host:9243"]
    user => "logstash_writer"
    password => "your-secure-password"
    ssl_certificate_verification => true
    cacert => "/etc/logstash/certs/ca.crt"
    index => "nginx-logs-%{+YYYY.MM.dd}"
  }
}
Run Logstash:
bin/logstash -f config/logstash.conf
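Before running the pipeline for real, it is worth validating the configuration syntax; Logstash ships a test-and-exit flag for this:

```shell
bin/logstash -f config/logstash.conf --config.test_and_exit
```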
4. Create an Index Template in Elasticsearch
When logs are first indexed, Elasticsearch automatically creates an index. However, for consistent performance and querying, define an index template to control mapping, settings, and lifecycle policies.
Use the Elasticsearch API to create a template:
curl -X PUT "https://your-elasticsearch-host:9243/_index_template/nginx_logs_template" \
-H "Content-Type: application/json" \
-u "elastic:your-password" \
--cacert /etc/filebeat/certs/ca.crt \
-d '{
"index_patterns": ["nginx-logs-*"],
"template": {
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1,
"refresh_interval": "5s"
},
"mappings": {
"properties": {
"timestamp": { "type": "date" },
"clientip": { "type": "ip" },
"bytes": { "type": "long" },
"method": { "type": "keyword" },
"url": { "type": "text", "analyzer": "standard" },
"user_agent": { "type": "text", "analyzer": "keyword" }
}
}
},
"priority": 500
}'
This ensures all future indices matching nginx-logs-* inherit the same structure, improving search efficiency and reducing mapping conflicts.
5. Verify Log Ingestion
Once the forwarder is running, verify logs are reaching Elasticsearch:
curl -X GET "https://your-elasticsearch-host:9243/_cat/indices?v" \
-u "elastic:your-password" \
--cacert /etc/filebeat/certs/ca.crt
You should see indices like nginx-logs-2024.06.15 with a non-zero document count.
To view sample logs:
curl -X GET "https://your-elasticsearch-host:9243/nginx-logs-*/_search?size=5" \
-u "elastic:your-password" \
--cacert /etc/filebeat/certs/ca.crt
If logs appear, your pipeline is working. If not, check the forwarder logs (/var/log/filebeat/filebeat or /var/log/td-agent/td-agent.log) for errors.
6. Secure the Pipeline
Never expose Elasticsearch to the public internet. Use the following security measures:
- Enable TLS/SSL encryption between forwarders and Elasticsearch.
- Use Elasticsearch's built-in role-based access control (RBAC): create dedicated users with minimal privileges (e.g., a beats_writer role).
- Use API keys instead of passwords where possible.
- Restrict network access using firewalls or VPCs.
- Regularly rotate certificates and credentials.
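As a sketch of the API-key approach (the key name and role below are illustrative), create a key scoped to the log indices and reference it from Filebeat instead of a password:

```shell
# Create an API key that may only write to nginx-logs-* (illustrative role)
curl -X POST "https://your-elasticsearch-host:9243/_security/api_key" \
  -H "Content-Type: application/json" \
  -u "elastic:your-password" \
  --cacert /etc/filebeat/certs/ca.crt \
  -d '{
    "name": "filebeat-ingest",
    "role_descriptors": {
      "logs_writer": {
        "indices": [
          { "names": ["nginx-logs-*"], "privileges": ["create_doc", "auto_configure"] }
        ]
      }
    }
  }'
```

In filebeat.yml, the output then uses the returned id and api_key joined with a colon:

```yaml
output.elasticsearch:
  api_key: "returned-id:returned-api-key"
```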
Best Practices
1. Use Lightweight Forwarders Where Possible
Filebeat and Fluentd consume far less memory and CPU than Logstash. Use them for edge servers, containers, or resource-constrained environments. Reserve Logstash for centralized processing hubs where complex transformations are needed.
2. Avoid Indexing Unnecessary Fields
Every field indexed increases storage and slows queries. Use the drop_fields processor in Filebeat or record_transformer in Fluentd to remove irrelevant data like internal server IDs or debug flags.
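In Filebeat that might look like the following (the field names are placeholders for whatever your logs carry):

```yaml
processors:
  - drop_fields:
      fields: ["agent.ephemeral_id", "log.offset", "debug_flag"]
      ignore_missing: true
```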
3. Implement Index Lifecycle Management (ILM)
Log data grows rapidly. Configure ILM policies to automatically roll over indices, delete old data, and move warm data to cheaper storage:
PUT _ilm/policy/nginx_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50gb",
            "max_age": "30d"
          }
        }
      },
      "warm": {
        "min_age": "30d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          }
        }
      },
      "delete": {
        "min_age": "365d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
Apply the policy to your index template:
"index_patterns": ["nginx-logs-*"],
"settings": {
"index.lifecycle.name": "nginx_policy",
"index.lifecycle.rollover_alias": "nginx-logs"
}
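One caveat: ILM rollover writes through an alias, so bootstrap the first index with the write alias before any forwarder sends data (host and credentials are placeholders):

```shell
curl -X PUT "https://your-elasticsearch-host:9243/nginx-logs-000001" \
  -H "Content-Type: application/json" \
  -u "elastic:your-password" \
  --cacert /etc/filebeat/certs/ca.crt \
  -d '{ "aliases": { "nginx-logs": { "is_write_index": true } } }'
```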
4. Use Consistent Timestamps and Time Zones
Ensure all log sources use UTC or a consistent time zone. Elasticsearch stores timestamps in UTC. Mismatched time zones cause confusion in dashboards and alerting. Use the date filter in Logstash or timestamp processor in Filebeat to normalize timestamps.
5. Monitor Forwarder Health
Forwarders can fail silently. Monitor their status using:
- Filebeat's built-in HTTP metrics endpoint (e.g., http://localhost:5066)
- Fluentd's monitor_agent plugin
- System-level monitoring (CPU, memory, disk I/O)
Integrate metrics into Grafana or Kibana for real-time dashboards.
6. Avoid Log Bombing
High-frequency applications (e.g., microservices logging every request) can overwhelm Elasticsearch. Use sampling, batching, or rate limiting:
- Tune bulk_max_size in Filebeat (it counts events per bulk request, not bytes)
- Raise flush_interval in Fluentd to batch writes less often
- Apply rate limiting in Fluentd (e.g., via a throttle plugin)
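In Filebeat these knobs live in the queue and output settings; the values below are starting points to tune against your own traffic, not recommendations:

```yaml
queue.mem:
  events: 4096              # in-memory buffer size, in events
  flush.min_events: 2048    # wait for a reasonable batch before flushing
  flush.timeout: 5s         # ...but never wait longer than this

output.elasticsearch:
  bulk_max_size: 1600       # events per bulk request (not bytes)
```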
7. Separate Log Types into Different Indices
Don't mix application logs, system logs, and security logs into one index. Use distinct index patterns (app-logs-*, syslog-*, audit-*) to improve search performance and enable granular retention policies.
Tools and Resources
Core Tools
- Filebeat: Official lightweight log shipper from Elastic. Ideal for file-based logs.
- Fluentd: Highly extensible data collector with 1000+ plugins. Great for Kubernetes and hybrid environments.
- Logstash: Full-featured pipeline for complex parsing and enrichment. Best for centralized processing.
- Elasticsearch: The indexing and search engine at the core of the pipeline.
- Kibana: Visualization and dashboarding tool for Elasticsearch. Essential for log analysis.
- Vector: Modern, high-performance log collector written in Rust. Emerging alternative to Filebeat and Fluentd.
- Fluent Bit: Lightweight version of Fluentd, designed for containers and edge devices.
Useful Resources
- Elastic Documentation: Comprehensive guides for all Elastic Stack components.
- Fluentd Official Docs: Plugin reference and configuration examples.
- Filebeat GitHub Repo: Source code, issues, and community contributions.
- How to Choose the Right Elastic Stack Component: Official comparison guide.
- Fluent Bit GitHub: Lightweight alternative for containerized environments.
- Elastic Cloud: Fully managed Elasticsearch and Kibana service.
Community and Support
Engage with active communities:
- Elastic Discuss Forum
- Fluentd Slack Channel
- Stack Overflow (tag: elasticsearch, filebeat, fluentd)
- GitHub Issues for tool-specific bugs
Real Examples
Example 1: Forwarding Nginx Access Logs to Elasticsearch
Scenario: You run a web application on Ubuntu with Nginx. You want to centralize access logs for traffic analysis and anomaly detection.
Steps:
- Install Filebeat on the Nginx server.
- Configure filebeat.yml to read /var/log/nginx/access.log.
- Enable the Nginx module: sudo filebeat modules enable nginx
- Apply default parsing and dashboards: sudo filebeat setup
- Start Filebeat and verify indices appear in Kibana.
Result: In Kibana, you can create a dashboard showing top clients, HTTP status codes, response times, and geographic distribution of traffic, all from raw Nginx logs.
Example 2: Kubernetes Container Logs via Fluent Bit
Scenario: You run a Kubernetes cluster and need to collect logs from all pods.
Steps:
- Deploy Fluent Bit as a DaemonSet using the official Helm chart:
helm repo add fluent https://fluent.github.io/helm-charts
helm install fluent-bit fluent/fluent-bit
- Fluent Bit automatically tails /var/log/containers/*.log from each node.
- Configure output to send to Elasticsearch:
[OUTPUT]
Name es
Match *
Host your-elasticsearch-host
Port 9243
TLS On
TLS.Verify Off
Logstash_Format On
Logstash_Prefix k8s-logs
Replace_Dots On
User filebeat_system
Password your-password
- Use Kibana's Kubernetes app to visualize pod logs, resource usage, and errors.
Result: You can search logs from any pod by name, namespace, or container ID, and correlate them with cluster events.
Example 3: Centralized Syslog Aggregation with Logstash
Scenario: You have 50 Linux servers sending syslog data. You want to parse and enrich them before storage.
Steps:
- Configure rsyslog on all servers to forward to a central Logstash server on UDP port 5140.
*.* @central-logserver:5140
- On the Logstash server, create an input for syslog:
input {
udp {
port => 5140
type => "syslog"
}
}
- Use grok patterns to parse RFC3164 or RFC5424 syslog messages:
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGHOST:hostname} %{DATA:program}(?:\[%{POSINT:pid}\])?: %{GREEDYDATA:logmessage}" }
    }
    date {
      match => [ "timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
    geoip {
      # geoip needs an IP address; "host" holds the sender address added by the udp input
      source => "host"
      target => "geoip"
    }
  }
}
- Output to Elasticsearch with index naming by date.
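A minimal output block for that last step might look like this (host and credentials are placeholders):

```
output {
  elasticsearch {
    hosts => ["https://your-elasticsearch-host:9243"]
    user => "logstash_writer"
    password => "your-secure-password"
    index => "syslog-%{+YYYY.MM.dd}"
  }
}
```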
Result: All syslog entries are parsed, enriched with geo-location data, and stored in structured fields, enabling alerts for failed SSH logins or unusual root activity.
FAQs
Can I forward logs to Elasticsearch without installing agents on every server?
Yes, but with limitations. You can use syslog forwarding (UDP/TCP) or network aggregators like Vector or Fluent Bit that receive logs over the network. However, for reliability, security, and detailed metadata, installing lightweight agents (Filebeat, Fluent Bit) on each host is strongly recommended.
How much disk space does Elasticsearch need for logs?
It depends on log volume and retention. As a rule of thumb, 1,000 logs/sec at ~500 bytes each works out to roughly 43GB/day; with 30-day retention, expect ~1.3TB. Always provision 20-30% extra for overhead and indexing.
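That rule of thumb is easy to check with shell arithmetic (the rate and event size here are the example figures above, not measurements):

```shell
RATE=1000                            # log events per second
SIZE=500                             # average event size in bytes
PER_DAY=$((RATE * SIZE * 86400))     # bytes ingested per day
echo "Per day: $((PER_DAY / 1000000000)) GB"                 # ~43 GB
echo "30-day retention: $((PER_DAY * 30 / 1000000000)) GB"   # ~1296 GB, i.e. ~1.3 TB
```

Indexed size on disk differs from raw size depending on mappings, replicas, and compression, so treat this as a rough lower bound.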
Is it safe to send logs over the public internet?
No. Always encrypt logs in transit using TLS and restrict access via firewalls or private networks. Never expose Elasticsearch directly to the internet. Use VPNs, private endpoints, or Elastic Cloud's secure connectivity options.
Can I forward logs from Windows servers?
Yes. Use Winlogbeat, Elastic's dedicated Beat for Windows Event Logs, to collect the Application, Security, and System channels. It uses the same Elasticsearch output configuration as Filebeat on Linux.
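A minimal winlogbeat.yml sketch (the host and API key are placeholders):

```yaml
winlogbeat.event_logs:
  - name: Application
  - name: Security
  - name: System

output.elasticsearch:
  hosts: ["https://your-elasticsearch-host:9243"]
  api_key: "your-api-key-id:your-api-key-secret"
```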
Whats the difference between Filebeat and Logstash?
Filebeat is a lightweight log shipper designed to collect and forward logs with minimal overhead. Logstash is a full-featured pipeline that can parse, filter, transform, and enrich logs but requires more memory and CPU. Use Filebeat for edge collection; use Logstash for centralized processing.
How do I handle log rotation?
Filebeat and Fluentd handle log rotation automatically. Filebeat tracks file positions in its registry directory, and Fluentd uses pos_file files. Ensure these are stored on persistent storage and not deleted during container restarts.
Can I use Elasticsearch for real-time alerting on logs?
Yes. Use Kibana's Alerting and Watcher features to create rules based on log patterns (e.g., more than ten HTTP 500 errors in 5 minutes). Alerts can trigger email, Slack, or webhook notifications.
Do I need Kibana to use Elasticsearch for logs?
No; Elasticsearch can be queried directly via its API. However, Kibana provides intuitive dashboards, visualizations, and alerting tools that make log analysis practical and scalable. It's highly recommended for production use.
Conclusion
Forwarding logs to Elasticsearch is not just a technical task; it's a foundational practice for modern observability. By centralizing your log data, you transform scattered, unstructured text into a powerful resource for debugging, performance tuning, security monitoring, and business intelligence. The pipeline outlined in this guide, from selecting the right forwarder to securing the transport and optimizing indexing, provides a robust, scalable, and maintainable foundation for any environment.
Remember: the goal is not to collect more logs, but to collect the right logs, in the right format, at the right time. Prioritize security, consistency, and efficiency. Start small with one application or server and expand iteratively. Use index templates, lifecycle policies, and monitoring to keep your system healthy as it grows.
As your infrastructure scales, whether into the cloud, containers, or serverless architectures, a well-designed log forwarding pipeline will remain your most reliable source of truth. Invest time in building it correctly. The insights you gain will pay dividends in reduced downtime, faster incident resolution, and greater operational confidence.
Now that you understand how to forward logs to Elasticsearch, the next step is to integrate this pipeline into your CI/CD workflows, automate deployment with Terraform or Ansible, and connect it to your alerting systems. The power of observability is in your hands; use it wisely.