MariaDB Optimization Guide

Real-world performance tuning from scaling a MariaDB cluster to handle 2M+ queries/day. This isn't theory - it's what actually worked in production.

📊 Results

Before optimization:

  • Query time: 800ms average
  • Slow queries: 15-20% of total
  • Max concurrent connections: 50
  • Frequent deadlocks
  • Server load during peak: 8.5

After optimization:

  • Query time: 35ms average (96% improvement)
  • Slow queries: < 1%
  • Max concurrent connections: 500
  • Zero deadlocks (proper indexing + isolation levels)
  • Server load during peak: 2.1

Real impact: Scaled from 100 users to 300+ users on the same hardware.

🎯 What's Inside

Configuration Files

  • production-optimized.cnf - Production config for high-traffic apps
  • replication-master.cnf - Master server optimizations
  • replication-slave.cnf - Slave server tuning for read replicas
  • development.cnf - Local development setup

Benchmarking

  • sysbench-test.sh - Benchmark your database
  • query-profiling.sql - Find slow queries and bottlenecks
  • index-analyzer.sql - Identify missing indexes
  • performance-report.sh - Generate comprehensive performance report

Migration Scripts

  • add-indexes.sql - Indexes that made the biggest difference
  • table-partitioning.sql - Partition large tables for better performance
  • query-optimization.sql - Before/after query examples

🚀 Quick Start

1. Benchmark Current Performance

# Install sysbench
apt-get install sysbench

# Run benchmark
./benchmarks/sysbench-test.sh

# Results will show:
# - Read/write throughput
# - Query latency (avg, 95th percentile, max)
# - Deadlocks and errors

2. Analyze Your Queries

-- Enable slow query log
source benchmarks/query-profiling.sql;

-- Wait 24 hours, then check slow queries
SELECT * FROM mysql.slow_log 
WHERE query_time > 1 
ORDER BY query_time DESC 
LIMIT 10;

3. Apply Optimizations

# Backup current config
cp /etc/mysql/mariadb.conf.d/50-server.cnf /etc/mysql/mariadb.conf.d/50-server.cnf.backup

# Copy optimized config
cp configs/production-optimized.cnf /etc/mysql/mariadb.conf.d/99-custom.cnf

# Restart MariaDB
systemctl restart mariadb

# Check if it started successfully
systemctl status mariadb

4. Add Missing Indexes

-- Analyze your tables
source benchmarks/index-analyzer.sql;

-- Add recommended indexes
source migration-scripts/add-indexes.sql;

๐Ÿ“ Configuration Breakdown

production-optimized.cnf

Key settings that made the difference:

[mysqld]
# InnoDB Buffer Pool (most important setting)
# Set to 70-80% of available RAM for database-only servers
innodb_buffer_pool_size = 8G
innodb_buffer_pool_instances = 8

# Query cache (off by default in modern MariaDB; cache at the application/connection layer instead)
query_cache_type = 0
query_cache_size = 0

# Connection settings
max_connections = 500
max_connect_errors = 1000000
thread_cache_size = 50
table_open_cache = 4000

# InnoDB optimizations
innodb_log_file_size = 512M
innodb_flush_log_at_trx_commit = 2  # Trade some durability for speed
innodb_flush_method = O_DIRECT
innodb_file_per_table = 1

# Temp tables
tmp_table_size = 64M
max_heap_table_size = 64M

Why these settings?

  • innodb_buffer_pool_size: This is THE most important setting. Before: 128MB (default). After: 8GB. Query time dropped from 800ms to 200ms just from this.
  • innodb_flush_log_at_trx_commit: Changed from 1 to 2. Reduces fsync calls. I lose max 1 second of transactions if server crashes, but gained 3x write speed.
  • max_connections: Increased from 151 to 500 after monitoring connection spikes during peak hours.
  • innodb_flush_method: O_DIRECT prevents double buffering (OS cache + InnoDB cache). Freed up 2GB RAM.
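To put a number on the 70-80% buffer pool rule, here's a quick sketch that reads total RAM from /proc/meminfo (Linux only; the 75% figure is just the midpoint of the range, not a value from this guide's configs):

```shell
# Rough sizing helper (illustrative): suggest ~75% of total RAM
# for innodb_buffer_pool_size on a dedicated database server.
total_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
suggested_mb=$(( total_kb * 75 / 100 / 1024 ))
echo "suggested innodb_buffer_pool_size = ${suggested_mb}M"
```

Remember the lesson below about setting this too high - leave enough RAM for the OS and connection overhead.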

Replication Setup

Master configuration:

[mysqld]
server-id = 1
log_bin = /var/log/mysql/mysql-bin.log
binlog_format = ROW
expire_logs_days = 7
max_binlog_size = 100M

# Replication optimizations
sync_binlog = 0  # Performance over durability
innodb_flush_log_at_trx_commit = 2

Slave configuration:

[mysqld]
server-id = 2
read_only = 1
relay_log = /var/log/mysql/relay-bin
relay_log_recovery = 1

# Slave optimizations
slave_parallel_threads = 4  # Parallel replication
slave_parallel_mode = optimistic

๐Ÿ” Finding Performance Bottlenecks

Check Slow Queries

-- Enable slow query log
SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 1;  -- Log queries > 1 second

-- After running for a while, analyze
SELECT 
    sql_text,
    COUNT(*) as execution_count,
    AVG(query_time) as avg_time,
    MAX(query_time) as max_time
FROM mysql.slow_log
GROUP BY sql_text
ORDER BY avg_time DESC
LIMIT 20;

Find Missing Indexes

-- Tables read via full scans (reads that used no index)
SELECT 
    object_schema,
    object_name,
    count_read AS rows_read_without_index
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NULL
  AND count_read > 1000
ORDER BY count_read DESC;

-- Queries not using indexes
SELECT 
    digest_text,
    count_star,
    sum_rows_examined,
    sum_rows_sent
FROM performance_schema.events_statements_summary_by_digest
WHERE sum_rows_examined > sum_rows_sent * 100
ORDER BY sum_rows_examined DESC
LIMIT 20;

Monitor Buffer Pool Hit Ratio

Should be > 99% for good performance:

SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';

-- Calculate hit ratio:
-- ((Innodb_buffer_pool_read_requests - Innodb_buffer_pool_reads) / 
--   Innodb_buffer_pool_read_requests) * 100
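The formula in the comment can be computed directly; a small sketch with illustrative counter values (substitute the two numbers from your own SHOW GLOBAL STATUS output):

```shell
# Compute the buffer pool hit ratio from the two status counters.
reads=1200          # Innodb_buffer_pool_reads (reads that hit disk)
requests=1500000    # Innodb_buffer_pool_read_requests (logical reads)

hit_ratio=$(awk -v r="$reads" -v q="$requests" \
    'BEGIN { printf "%.2f", (q - r) / q * 100 }')
echo "hit ratio: ${hit_ratio}%"   # below ~99% suggests a bigger buffer pool
```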

💡 Indexing Strategy

Composite Index Rules

I learned this the hard way:

-- โŒ WRONG: Separate indexes
CREATE INDEX idx_user ON orders (user_id);
CREATE INDEX idx_status ON orders (status);

-- ✅ CORRECT: Composite index
CREATE INDEX idx_user_status ON orders (user_id, status);

-- This query benefits from composite index:
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';

-- Index order matters! Put most selective column first
-- Exception: If you always filter by user_id, put it first

Covering Indexes

Include frequently selected columns:

-- Instead of:
CREATE INDEX idx_email ON users (email);

-- Do this if you always SELECT name with email lookup:
CREATE INDEX idx_email_name ON users (email, name);

-- Query is entirely satisfied by index (no table lookup needed)
SELECT name FROM users WHERE email = 'user@example.com';

Indexes I Added

These made the biggest difference:

-- Users table (40M+ rows)
CREATE INDEX idx_email ON users (email);
CREATE INDEX idx_created ON users (created_at);
CREATE INDEX idx_active_created ON users (is_active, created_at);

-- Tickets table (10M+ rows)
CREATE INDEX idx_user_status ON tickets (user_id, status);
CREATE INDEX idx_priority_created ON tickets (priority, created_at);
CREATE INDEX idx_assigned ON tickets (assigned_to, status);

-- Messages table (50M+ rows) - partitioned by date
CREATE INDEX idx_ticket ON messages (ticket_id, created_at);
CREATE INDEX idx_user ON messages (user_id, created_at);
ALTER TABLE messages ADD FULLTEXT idx_content (content);

🔧 Table Partitioning

For tables with time-series data:

-- Partition messages table by month
ALTER TABLE messages
PARTITION BY RANGE (YEAR(created_at) * 100 + MONTH(created_at)) (
    PARTITION p202401 VALUES LESS THAN (202402),
    PARTITION p202402 VALUES LESS THAN (202403),
    PARTITION p202403 VALUES LESS THAN (202404),
    -- Add partitions as needed
    PARTITION p_future VALUES LESS THAN MAXVALUE
);

-- Query performance improvement:
-- Before: 12s for date range query
-- After: 0.8s (only scans relevant partitions)
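Monthly partitions need ongoing maintenance. A sketch of the usual pattern (table and partition names follow the example above): carve new months out of the MAXVALUE catch-all rather than appending past it, and drop old months instead of deleting rows.

```sql
-- Carve the next month out of the catch-all partition.
ALTER TABLE messages
REORGANIZE PARTITION p_future INTO (
    PARTITION p202404 VALUES LESS THAN (202405),
    PARTITION p_future VALUES LESS THAN MAXVALUE
);

-- Dropping an old month is near-instant compared to DELETE:
ALTER TABLE messages DROP PARTITION p202401;
```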

📈 Monitoring & Maintenance

Daily Health Check

-- Check table sizes
SELECT 
    table_schema,
    table_name,
    ROUND((data_length + index_length) / 1024 / 1024, 2) AS size_mb,
    table_rows
FROM information_schema.tables
WHERE table_schema NOT IN ('information_schema', 'performance_schema', 'mysql')
ORDER BY (data_length + index_length) DESC
LIMIT 20;

-- Check fragmentation
SELECT 
    table_name,
    ROUND(data_length / 1024 / 1024, 2) AS data_mb,
    ROUND(data_free / 1024 / 1024, 2) AS free_mb,
    ROUND((data_free / data_length) * 100, 2) AS fragmentation_pct
FROM information_schema.tables
WHERE table_schema = 'your_database'
AND data_free > 0
ORDER BY fragmentation_pct DESC;

-- Optimize fragmented tables
OPTIMIZE TABLE your_table;

Weekly Tasks

# Analyze tables for query optimizer
mysqlcheck --analyze --all-databases

# Check tables for corruption
mysqlcheck --check --all-databases

# Defragment and rebuild tables (heavy operation - schedule off-peak)
mysqlcheck --optimize --all-databases

Monitor Replication Lag

-- On slave server
SHOW SLAVE STATUS\G

-- Check these values:
-- Seconds_Behind_Master: Should be < 5
-- Slave_IO_Running: Should be Yes
-- Slave_SQL_Running: Should be Yes
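Those three checks can be wrapped into a cron-able script. A sketch (the canned status text stands in for real `mysql -e 'SHOW SLAVE STATUS\G'` output; the 5-second threshold matches the guideline above):

```shell
#!/bin/sh
MAX_LAG=5

# Parse SHOW SLAVE STATUS\G output and report replication health.
check_lag() {
    lag=$(printf '%s\n' "$1" | awk -F': ' '/Seconds_Behind_Master/ {print $2}')
    io=$(printf '%s\n' "$1" | awk -F': ' '/Slave_IO_Running/ {print $2}')
    if [ "$io" != "Yes" ]; then
        echo "CRITICAL: replication IO thread not running"
    elif [ "$lag" -gt "$MAX_LAG" ]; then
        echo "WARNING: slave is ${lag}s behind master"
    else
        echo "OK: lag ${lag}s"
    fi
}

# Illustrative input; in production, feed the real status output:
status="Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Seconds_Behind_Master: 2"
check_lag "$status"
```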

🎓 Lessons Learned

What Worked

  1. Buffer pool size is everything - Increased from 128MB to 8GB, instant 4x speedup
  2. Composite indexes over multiple single indexes - Reduced query time by 10x
  3. Partitioning time-series data - Made old data queries usable again
  4. Connection pooling at app layer - Reduced connection overhead by 80%
  5. Read replicas for reports - Offloaded heavy analytics from master

What Didn't Work

  1. Query cache - Caused more problems than it solved, disabled it
  2. Too many indexes - Slowed down writes, removed 30% of indexes with no impact on reads
  3. Over-tuning - Spent days optimizing queries that ran 10 times/day
  4. Premature sharding - Added complexity without benefits, went with replication instead

Mistakes I Made

  1. Not backing up before config changes - Crashed production at 2 AM. Lesson learned.
  2. Changing multiple settings at once - Couldn't tell what helped. Change one thing at a time.
  3. Ignoring slow query log - Optimized the wrong queries. Always profile first.
  4. Setting innodb_buffer_pool_size too high - Left no RAM for OS. Caused swapping. Bad.

๐Ÿ” Security Notes

  • Never disable binary logging in production - You'll need it for point-in-time recovery
  • Use separate replication user with minimal privileges
  • Enable SSL for replication traffic
  • Regular backups before making changes
  • Test in staging before applying to production
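A minimal replication user along the lines of the second bullet (a sketch; the host pattern and password are placeholders):

```sql
-- Dedicated replication account with only the privilege it needs,
-- restricted to the replica subnet and forced onto TLS.
CREATE USER 'repl'@'10.0.0.%' IDENTIFIED BY 'change-me' REQUIRE SSL;
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'10.0.0.%';
```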

📚 Resources

๐Ÿค Contributing

These configs worked for my workload (OLTP with 300 concurrent users, 2M+ queries/day). Your workload might be different. Always:

  1. Benchmark before and after
  2. Change one setting at a time
  3. Test in staging first
  4. Monitor for at least 24 hours after changes

📄 License

MIT - Use at your own risk. Always backup before making changes.


These optimizations went into production in 2024 on the Land of Promise Minecraft network. Still running strong today.
