Optimizing Log Storage Costs with Shell Scripting: A Practical DevOps Solution
Introduction
In modern DevOps environments, managing logs effectively is crucial. However, log management often comes with high costs, especially for organizations handling extensive logging from numerous microservices and infrastructure components. This blog explores how I tackled this challenge by implementing a cost-effective solution using shell scripting to manage Jenkins logs.
The client’s logging solution was based on a self-hosted ELK stack (Elasticsearch, Logstash, Kibana), which incurred significant storage costs. By leveraging a tailored shell script, I optimized log storage costs, seamlessly integrating Jenkins logs with AWS S3 and utilizing S3 lifecycle management for long-term cost savings.
Scenario
The Problem:
The client faced mounting expenses due to the volume of logs stored in the ELK stack. Here’s a breakdown of the challenges:
Massive Log Volume: Logs from 100+ microservices, Kubernetes control plane logs, and Jenkins CI/CD processes were stored in the ELK stack.
Redundant Storage: Jenkins logs were rarely analyzed but stored as a backup in ELK, contributing to unnecessary costs.
Scalability Concerns: The high costs threatened the scalability of the logging solution.
Solution
To address these challenges, I designed a solution to offload Jenkins logs from the ELK stack to AWS S3. By using shell scripting and leveraging S3's lifecycle management, the organization achieved significant cost savings. Here's how it worked:
Dynamic Log Transfer: The shell script identifies and uploads only logs that have not been shipped before, recording each uploaded build in a metadata file so nothing is transferred twice (a minimal sketch follows this list).
AWS Lifecycle Management: Logs are automatically transitioned to cheaper storage tiers (e.g., Glacier) based on retention policies.
Automation: A nightly cron job ensures the seamless execution of the log transfer process.
Error Notifications: Any errors or successful runs trigger email notifications to keep administrators informed.
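The heart of the approach fits in a few lines: remember which build logs have been uploaded, and skip them on the next run. A minimal sketch of that idea, with a hypothetical job name, build number, and bucket name:
# Hypothetical single log: upload once, then remember it in the metadata file
LOG="/var/lib/jenkins/jobs/my-job/builds/42/log"
ID="my-job-42"
META="/var/log/uploaded_logs_meta.txt"
grep -qx "$ID" "$META" 2>/dev/null || {
    aws s3 cp "$LOG" "s3://my-log-bucket/my-job/42.log" && echo "$ID" >> "$META"
}
The full script later in this post wraps this pattern in a loop over every Jenkins job and build.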
Tech Stack Used:
Jenkins: To generate CI/CD logs.
AWS S3: For cost-effective log storage with lifecycle management.
Shell Scripting: For automating the log transfer process.
AWS CLI: To enable seamless communication between the script and the S3 bucket (a minimal IAM policy sketch follows this list).
Email Notifications: To alert administrators of success or errors.
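Note that the IAM user or role behind the CLI credentials needs a handful of S3 permissions for the script to work end to end. A minimal policy sketch (the bucket name is a placeholder, and the exact actions may need tuning for your setup):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetBucketLifecycleConfiguration",
        "s3:PutBucketLifecycleConfiguration"
      ],
      "Resource": [
        "arn:aws:s3:::<your-s3-bucket-name>",
        "arn:aws:s3:::<your-s3-bucket-name>/*"
      ]
    }
  ]
}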
Step-by-Step Guide
Step 1: Set Up the Prerequisites
1. Launch an EC2 Instance
Go to the AWS Management Console.
Launch a new EC2 instance for hosting Jenkins.
2. Install Jenkins
- Install Java:
sudo apt update
sudo apt install openjdk-17-jre
java -version
- Install Jenkins:
curl -fsSL https://pkg.jenkins.io/debian/jenkins.io-2023.key | sudo tee /usr/share/keyrings/jenkins-keyring.asc > /dev/null
echo deb [signed-by=/usr/share/keyrings/jenkins-keyring.asc] https://pkg.jenkins.io/debian binary/ | sudo tee /etc/apt/sources.list.d/jenkins.list > /dev/null
sudo apt-get update
sudo apt-get install jenkins
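On most systemd-based distributions you may also need to start the service and enable it at boot:
sudo systemctl enable jenkins
sudo systemctl start jenkins
sudo systemctl status jenkins   # should report "active (running)"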
Note: By default, Jenkins will not be accessible from the outside world because AWS restricts inbound traffic. Open port 8080 in the inbound traffic rules as described below.
Go to EC2 > Instances and select your instance.
In the bottom tabs, click the Security tab, then click the security group.
Add an inbound rule for the Jenkins port (allowing TCP 8080 is enough; in my case, I allowed all traffic).
Note: In a real-world environment, please follow port security rules as per your organization's policy.
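If you prefer the CLI, the same rule can be added with the AWS CLI. A sketch, assuming a placeholder security group ID (and restrict the CIDR range per your policy rather than opening it to the world):
aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp \
    --port 8080 \
    --cidr 0.0.0.0/0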
3. Configure Email Notifications
- Install mailutils:
sudo apt install mailutils
- Configure Postfix:
sudo dpkg-reconfigure postfix
Select Internet Site.
Enter your System Mail Name (e.g., example.com).
Leave the "Other destinations to accept mail for" field blank.
Set the rest of the options to default.
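If you need this step to be non-interactive (for example, in a provisioning script), the same answers can be preseeded via debconf before the package is installed. A sketch, assuming the standard Debian Postfix prompts and example.com as a placeholder mail name:
# Run these BEFORE installing mailutils/postfix
echo "postfix postfix/main_mailer_type select Internet Site" | sudo debconf-set-selections
echo "postfix postfix/mailname string example.com" | sudo debconf-set-selections
sudo DEBIAN_FRONTEND=noninteractive apt install -y postfix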
- Test the email configuration:
echo "This is a test email" | mail -s "Test Email" <your-email-address>
4. Create an S3 Bucket
Log in to the AWS Management Console and create a new S3 bucket.
Enable versioning and set up lifecycle rules to transition older logs to cheaper storage (e.g., Glacier).
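If you would rather script this step, the bucket and versioning can also be set up from the CLI. A sketch with placeholder names (note that outside us-east-1, create-bucket additionally requires --create-bucket-configuration LocationConstraint=<region>):
aws s3api create-bucket --bucket <your-s3-bucket-name> --region us-east-1
aws s3api put-bucket-versioning --bucket <your-s3-bucket-name> \
    --versioning-configuration Status=Enabled
The lifecycle rule itself is applied automatically by the script in Step 2, so there is no need to create it by hand.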
5. Install AWS CLI
- Install and configure the AWS CLI:
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
aws configure
Provide the AWS Access Key ID, Secret Access Key, and region during configuration.
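Before moving on, it is worth verifying that the CLI can actually reach your account:
aws sts get-caller-identity   # confirms the configured credentials are valid
aws s3 ls                     # lists the buckets the credentials can see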
Step 2: Develop the Shell Script
Below is the finalized script for transferring Jenkins logs to the S3 bucket, handling lifecycle management, and sending email notifications:
#!/bin/bash
# Metadata
# Author: Neha Avasekar
# Date: 12/01/2025
# Description: Script to upload Jenkins logs to S3, maintain metadata, handle lifecycle policies, and send email notifications.
# Version: 4.0
# Variables
JENKINS_HOME="/var/lib/jenkins"                      # Jenkins home directory
S3_BUCKET="s3://<your-s3-bucket-name>"               # Replace with your S3 bucket name
LOG_FILE="/var/log/s3-log-upload.log"
UPLOADED_LOGS_META="/var/log/uploaded_logs_meta.txt" # Tracks which build logs have been uploaded
ERROR_NOTIFICATION_EMAIL="<your-email-address>"      # Replace with your email address

# Ensure the metadata file exists
if [ ! -f "$UPLOADED_LOGS_META" ]; then
    touch "$UPLOADED_LOGS_META"
fi

# Check that Jenkins is installed
if [ ! -d "$JENKINS_HOME" ]; then
    echo "Jenkins is not installed. Please install Jenkins to proceed." | tee -a "$LOG_FILE"
    exit 1
fi

# Check that the AWS CLI is installed
if ! command -v aws &>/dev/null; then
    echo "AWS CLI is not installed. Please install it to proceed." | tee -a "$LOG_FILE"
    exit 1
fi

# Apply the lifecycle policy only if the bucket does not already have one
BUCKET_NAME=$(basename "$S3_BUCKET")
if ! aws s3api get-bucket-lifecycle-configuration --bucket "$BUCKET_NAME" &>/dev/null; then
    echo "Applying lifecycle management policy to S3 bucket..." | tee -a "$LOG_FILE"
    aws s3api put-bucket-lifecycle-configuration --bucket "$BUCKET_NAME" --lifecycle-configuration '{
        "Rules": [
            {
                "ID": "TransitionToGlacier",
                "Filter": {},
                "Status": "Enabled",
                "Transitions": [
                    { "Days": 30, "StorageClass": "GLACIER" }
                ],
                "Expiration": { "Days": 365 }
            }
        ]
    }'
    echo "Lifecycle policy applied successfully." | tee -a "$LOG_FILE"
else
    echo "Lifecycle policy already applied. Skipping reapplication." | tee -a "$LOG_FILE"
fi

# Process logs: upload each build log exactly once, tracked via the metadata file
uploaded_count=0
skipped_count=0
failed_count=0
for job_dir in "$JENKINS_HOME/jobs/"*/; do
    job_name=$(basename "$job_dir")
    for build_dir in "$job_dir/builds/"*/; do
        build_number=$(basename "$build_dir")
        log_file="$build_dir/log"
        if [ -f "$log_file" ]; then
            log_identifier="$job_name-$build_number"
            if grep -q "^$log_identifier$" "$UPLOADED_LOGS_META"; then
                echo "Log $log_identifier already uploaded. Skipping." | tee -a "$LOG_FILE"
                ((skipped_count++))
            else
                s3_path="$S3_BUCKET/$job_name/$build_number.log"
                if aws s3 cp "$log_file" "$s3_path" --only-show-errors; then
                    echo "$log_identifier" >> "$UPLOADED_LOGS_META"
                    echo "Uploaded: $log_identifier to $s3_path" | tee -a "$LOG_FILE"
                    ((uploaded_count++))
                else
                    echo "ERROR: failed to upload $log_identifier" | tee -a "$LOG_FILE"
                    ((failed_count++))
                fi
            fi
        fi
    done
done

# Summarize the run and notify the administrator by email
SUMMARY="Summary: $uploaded_count logs uploaded, $skipped_count skipped, $failed_count failed."
echo "$SUMMARY" | tee -a "$LOG_FILE"
echo "$SUMMARY" | mail -s "Jenkins log upload report $(date +%F)" "$ERROR_NOTIFICATION_EMAIL"
Step 3: Automate with Cron
Make the script executable, then open the crontab to schedule a nightly run:
chmod +x /path/to/your-script.sh
crontab -e
Add:
0 0 * * * /path/to/your-script.sh
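Cron discards the script's console output by default; if you also want that captured, a variant of the same entry redirects it to a file:
0 0 * * * /path/to/your-script.sh >> /var/log/s3-log-upload-cron.log 2>&1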
Outcome and Benefits
Cost Savings: Reduced ELK stack costs by approximately 50%.
Enhanced Efficiency: Automated log transfer minimized manual intervention.
Scalability: Seamlessly handled growing Jenkins jobs and logs.
Conclusion
This project demonstrates the power of simple automation to optimize costs in DevOps environments. With minimal scripting, the solution achieved a scalable, efficient, and cost-effective log management process.
Take this solution and implement it in your projects to achieve operational excellence! 🚀
GitHub link: https://github.com/Mygithubneha/Shell-Scripting.git
Special thanks to Abhishek Veeramalla for providing the solution and resources, and to the YouTube video that served as a valuable reference while implementing it.