Question
How do I optimize MinIO for handling large file uploads and downloads (multi-GB files)? What are the best practices for maximizing throughput and minimizing transfer times for large objects?
Answer
Optimizing MinIO for large file transfers requires configuration tuning at multiple levels: server, client, network, and application. Here’s a comprehensive guide to achieve maximum performance for large file operations.
1. Server-Side Optimization
MinIO Server Configuration
```bash
# /etc/minio/minio.conf - Optimized for large files

# Core settings
MINIO_ROOT_USER=minio-admin
MINIO_ROOT_PASSWORD=SecurePassword123!

# Performance settings
# Max concurrent S3 API requests
MINIO_API_REQUESTS_MAX=20000
# Longer deadline for large uploads
MINIO_API_REQUESTS_DEADLINE=10m

# Background scanner: throttle it so client I/O gets the bandwidth
MINIO_SCANNER_SPEED=slowest

# Compression (optional - compression costs CPU and reduces throughput)
MINIO_COMPRESS=off
# MINIO_COMPRESS_EXTENSIONS=".txt,.log,.csv"  # Only for specific types

# Caching for repeated downloads
# (note: the disk-cache feature was removed in newer MinIO releases;
# these settings apply to older builds only)
MINIO_CACHE_DRIVES="/tmp/cache1,/tmp/cache2"
MINIO_CACHE_QUOTA=80
MINIO_CACHE_AFTER=0
MINIO_CACHE_WATERMARK_LOW=70
MINIO_CACHE_WATERMARK_HIGH=90

# Healing: minimal heal delay, but deep scans compete with client I/O
MINIO_HEAL_SCAN_MODE=deep
MINIO_HEAL_MAX_SLEEP=1s
```

Systemd Service Optimization
```ini
[Unit]
Description=MinIO Object Storage Server
Documentation=https://min.io/docs/minio/linux/index.html
Wants=network-online.target
After=network-online.target
AssertFileIsExecutable=/usr/local/bin/minio

[Service]
WorkingDirectory=/usr/local/
User=minio-user
Group=minio-user

# Resource limits for large files (systemd does not allow inline
# comments, so each note gets its own line)
# Maximum file descriptors
LimitNOFILE=1048576
# Maximum processes
LimitNPROC=1048576
# Core dump size
LimitCORE=infinity
# Memory lock limit
LimitMEMLOCK=infinity

# Memory settings
MemoryAccounting=true
# High memory threshold
MemoryHigh=60G
# Maximum memory usage
MemoryMax=64G

# CPU settings
CPUAccounting=true
# 8 cores maximum
CPUQuota=800%
# Normal scheduling
CPUSchedulingPolicy=other
# Higher priority
Nice=-10

# I/O settings
IOAccounting=true
# Real-time I/O class, high priority
IOSchedulingClass=realtime
IOSchedulingPriority=4

# Network accounting
IPAccounting=true

EnvironmentFile=-/etc/minio/minio.conf
ExecStartPre=/bin/bash -c "if [ -z \"${MINIO_VOLUMES}\" ]; then echo \"Variable MINIO_VOLUMES not set\"; exit 1; fi"
ExecStart=/usr/local/bin/minio server $MINIO_OPTS $MINIO_VOLUMES

Restart=always
TimeoutStopSec=infinity
SendSIGKILL=no

[Install]
WantedBy=multi-user.target
```

2. Storage Infrastructure Optimization
Storage Configuration
```bash
# XFS filesystem optimization for large files
mkfs.xfs -f -i size=512 -d agcount=16 /dev/nvme0n1

# Mount options for performance
mount -o noatime,largeio,inode64,swalloc /dev/nvme0n1 /opt/minio/data1

# Add to /etc/fstab
echo "/dev/nvme0n1 /opt/minio/data1 xfs defaults,noatime,largeio,inode64,swalloc 0 2" >> /etc/fstab

# RAID-0 striping across multiple drives
# (caution: MinIO's erasure coding expects direct access to individual
# drives; prefer one drive per mount point for distributed setups)
mdadm --create --verbose /dev/md0 --level=0 --raid-devices=4 /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1
```

I/O Scheduler Optimization
```bash
# Set I/O scheduler for NVMe drives
echo none > /sys/block/nvme0n1/queue/scheduler

# Optimize queue depth for high throughput
echo 1024 > /sys/block/nvme0n1/queue/nr_requests

# Increase readahead for large sequential reads
echo 4096 > /sys/block/nvme0n1/queue/read_ahead_kb

# Optimize for large I/O operations
echo 1 > /sys/block/nvme0n1/queue/nomerges
```

3. Network Optimization
System Network Tuning
```bash
# /etc/sysctl.conf - Network optimization for large transfers

# TCP buffer sizes
net.core.rmem_max = 268435456
net.core.wmem_max = 268435456
net.core.rmem_default = 67108864
net.core.wmem_default = 67108864
net.ipv4.tcp_rmem = 4096 87380 268435456
net.ipv4.tcp_wmem = 4096 65536 268435456

# TCP window scaling
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1

# TCP congestion control (BBR for high bandwidth)
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr

# Network interface buffers
net.core.netdev_max_backlog = 30000
net.core.netdev_budget = 600

# Connection limits
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535

# TCP optimizations
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_moderate_rcvbuf = 1
```

Apply the changes with `sysctl -p`.

Network Interface Optimization
```bash
# Increase network interface ring buffers
ethtool -G eth0 rx 4096 tx 4096

# Enable hardware offloading
ethtool -K eth0 gso on gro on tso on

# Set interrupt coalescing for throughput
ethtool -C eth0 adaptive-rx on adaptive-tx on

# RPS: spread receive-packet processing across CPUs
# (value is a hex CPU bitmask, not a queue count; "f" = CPUs 0-3)
echo f > /sys/class/net/eth0/queues/rx-0/rps_cpus
```

4. Client-Side Optimization
MinIO Client (mc) Configuration
```bash
# Register the MinIO deployment
# ("mc config host add" is the legacy syntax; current releases use "mc alias set")
mc alias set myminio http://minio.example.com:9000 ACCESS_KEY SECRET_KEY

# mc splits large objects into multipart uploads automatically and
# uploads the parts in parallel; no per-file tuning flags are required
mc cp --recursive large-dataset/ myminio/bucket/
```

AWS CLI Optimization
```ini
# ~/.aws/config - the transfer settings live under the nested "s3" key
[default]
region = us-east-1
output = json
s3 =
    max_concurrent_requests = 20
    max_bandwidth = 1GB/s
    multipart_threshold = 128MB
    multipart_chunksize = 64MB
    max_queue_size = 10000
```

```bash
# Use AWS CLI for large transfers
aws s3 cp large-file.bin s3://bucket/ \
    --endpoint-url http://minio.example.com:9000 \
    --cli-read-timeout 0 \
    --cli-write-timeout 0
```

5. Application-Level Optimization
Optimized Upload Implementation (Go)
```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"runtime"
	"sync"
	"time"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

type OptimizedUploader struct {
	client      *minio.Client
	bucketName  string
	workers     int
	partSize    int64
	uploadQueue chan UploadTask
	wg          sync.WaitGroup
}

type UploadTask struct {
	filePath   string
	objectName string
	fileSize   int64
}

func NewOptimizedUploader(endpoint, accessKey, secretKey, bucket string) *OptimizedUploader {
	client, err := minio.New(endpoint, &minio.Options{
		Creds:  credentials.NewStaticV4(accessKey, secretKey, ""),
		Secure: false,
	})
	if err != nil {
		log.Fatal(err)
	}

	return &OptimizedUploader{
		client:      client,
		bucketName:  bucket,
		workers:     runtime.NumCPU() * 4,        // 4x CPU cores
		partSize:    128 * 1024 * 1024,           // 128MB parts
		uploadQueue: make(chan UploadTask, 1000),
	}
}

func (u *OptimizedUploader) UploadLargeFile(filePath, objectName string) error {
	file, err := os.Open(filePath)
	if err != nil {
		return err
	}
	defer file.Close()

	stat, err := file.Stat()
	if err != nil {
		return err
	}
	fileSize := stat.Size()

	options := minio.PutObjectOptions{
		PartSize:             uint64(u.partSize),
		ContentType:          "application/octet-stream",
		SendContentMd5:       true,
		DisableContentSha256: true, // Skip SHA-256 for performance
		NumThreads:           uint(u.workers),
	}

	// Pass the *os.File directly: it implements io.ReaderAt, which lets
	// minio-go read and upload multipart parts concurrently (NumThreads)
	start := time.Now()
	_, err = u.client.PutObject(
		context.Background(), u.bucketName, objectName,
		file, fileSize, options,
	)
	if err != nil {
		return err
	}

	duration := time.Since(start)
	throughput := float64(fileSize) / duration.Seconds() / (1024 * 1024) // MB/s
	fmt.Printf("Upload completed: %s (%.2f MB/s)\n", objectName, throughput)
	return nil
}

// Concurrent upload manager
func (u *OptimizedUploader) StartWorkers() {
	for i := 0; i < u.workers; i++ {
		go u.worker()
	}
}

func (u *OptimizedUploader) worker() {
	for task := range u.uploadQueue {
		if err := u.UploadLargeFile(task.filePath, task.objectName); err != nil {
			log.Printf("Upload failed for %s: %v", task.filePath, err)
		}
		u.wg.Done()
	}
}

func (u *OptimizedUploader) QueueUpload(filePath, objectName string) {
	stat, err := os.Stat(filePath)
	if err != nil {
		log.Printf("Failed to stat file %s: %v", filePath, err)
		return
	}

	u.wg.Add(1)
	u.uploadQueue <- UploadTask{
		filePath:   filePath,
		objectName: objectName,
		fileSize:   stat.Size(),
	}
}

func (u *OptimizedUploader) WaitForCompletion() {
	u.wg.Wait()
	close(u.uploadQueue)
}

// Usage example
func main() {
	uploader := NewOptimizedUploader(
		"minio.example.com:9000",
		"access-key",
		"secret-key",
		"large-files",
	)

	uploader.StartWorkers()

	// Queue multiple large files
	uploader.QueueUpload("/path/to/large-file-1.bin", "file-1.bin")
	uploader.QueueUpload("/path/to/large-file-2.bin", "file-2.bin")
	uploader.QueueUpload("/path/to/large-file-3.bin", "file-3.bin")

	uploader.WaitForCompletion()
	fmt.Println("All uploads completed")
}
```

Python Implementation with Optimization
```python
import os
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

from minio import Minio
from minio.error import S3Error


class OptimizedMinIOUploader:
    def __init__(self, endpoint, access_key, secret_key, bucket_name):
        self.client = Minio(
            endpoint,
            access_key=access_key,
            secret_key=secret_key,
            secure=False
        )
        self.bucket_name = bucket_name
        self.part_size = 128 * 1024 * 1024  # 128MB
        self.max_workers = os.cpu_count() * 4

    def upload_large_file(self, file_path, object_name=None):
        """Upload a large file with optimized multipart settings."""
        if object_name is None:
            object_name = os.path.basename(file_path)

        file_size = os.path.getsize(file_path)
        start_time = time.time()

        try:
            with open(file_path, 'rb') as file_data:
                self.client.put_object(
                    self.bucket_name,
                    object_name,
                    file_data,
                    file_size,
                    part_size=self.part_size,
                    num_parallel_uploads=self.max_workers // 2
                )

            duration = time.time() - start_time
            throughput = (file_size / (1024 * 1024)) / duration  # MB/s
            print(f"Upload completed: {object_name} ({throughput:.2f} MB/s)")
            return True

        except S3Error as e:
            print(f"Upload failed for {object_name}: {e}")
            return False

    def upload_multiple_files(self, file_list, max_concurrent=None):
        """Upload multiple files concurrently."""
        if max_concurrent is None:
            max_concurrent = min(self.max_workers, len(file_list))

        results = []
        with ThreadPoolExecutor(max_workers=max_concurrent) as executor:
            future_to_file = {
                executor.submit(self.upload_large_file, file_path): file_path
                for file_path in file_list
            }

            for future in as_completed(future_to_file):
                file_path = future_to_file[future]
                try:
                    results.append((file_path, future.result()))
                except Exception as e:
                    print(f"Error uploading {file_path}: {e}")
                    results.append((file_path, False))

        return results


# Usage example
if __name__ == "__main__":
    uploader = OptimizedMinIOUploader(
        "minio.example.com:9000",
        "access-key",
        "secret-key",
        "large-files"
    )

    # Upload single large file
    uploader.upload_large_file("/path/to/large-file.bin")

    # Upload multiple files concurrently
    files = [
        "/path/to/file1.bin",
        "/path/to/file2.bin",
        "/path/to/file3.bin"
    ]
    results = uploader.upload_multiple_files(files, max_concurrent=8)
```

Note: the `minio` Python SDK manages multipart bookkeeping internally and does not expose a public part-level resume API, so an interrupted upload is simply retried from the start; its `progress` parameter expects a thread-based `Progress` object rather than a plain callback.

6. Load Balancer Optimization
HAProxy Configuration for Large Files
```
global
    daemon
    maxconn 40000

    # Buffer optimization for large files
    tune.bufsize 65536
    tune.maxrewrite 8192
    tune.http.maxhdr 200

defaults
    mode http
    option httplog
    option dontlognull
    retries 3

    # Timeout optimization for large transfers
    timeout connect 10s
    timeout client 3600s          # 1 hour for large uploads
    timeout server 3600s          # 1 hour for large downloads
    timeout http-request 300s     # 5 minutes for request headers
    timeout http-keep-alive 10s

    # Connection optimization
    option http-server-close
    option tcp-smart-accept
    option tcp-smart-connect

frontend minio_frontend
    bind *:9000

    # Connection limits
    maxconn 10000

    # Request buffering disabled for large files - stream bodies through
    # option http-buffer-request

    default_backend minio_backend

backend minio_backend
    balance roundrobin

    # Health checks
    option httpchk GET /minio/health/live
    http-check expect status 200

    # Server configuration
    server minio1 10.0.1.11:9000 check maxconn 2500 weight 100
    server minio2 10.0.1.12:9000 check maxconn 2500 weight 100
    server minio3 10.0.1.13:9000 check maxconn 2500 weight 100
    server minio4 10.0.1.14:9000 check maxconn 2500 weight 100

    # Connection pooling
    http-reuse aggressive
```

7. Monitoring Large File Transfers
Performance Monitoring Script
```bash
#!/bin/bash
MINIO_ALIAS="myminio"
BUCKET="large-files"
TEST_FILE="/tmp/test-large-file.bin"
TEST_SIZE="1G"

echo "=== MinIO Large File Performance Test ==="
echo "Timestamp: $(date)"
echo

# Create test file
echo "Creating test file (${TEST_SIZE})..."
dd if=/dev/zero of=$TEST_FILE bs=1M count=1024 status=progress 2>/dev/null

# Upload test
echo "Testing upload performance..."
UPLOAD_START=$(date +%s.%N)
mc cp $TEST_FILE $MINIO_ALIAS/$BUCKET/test-upload.bin
UPLOAD_END=$(date +%s.%N)
UPLOAD_TIME=$(echo "$UPLOAD_END - $UPLOAD_START" | bc)
UPLOAD_SPEED=$(echo "scale=2; 1024 / $UPLOAD_TIME" | bc)
echo "Upload completed: ${UPLOAD_TIME}s (${UPLOAD_SPEED} MB/s)"

# Download test
echo "Testing download performance..."
DOWNLOAD_START=$(date +%s.%N)
mc cp $MINIO_ALIAS/$BUCKET/test-upload.bin /tmp/test-download.bin
DOWNLOAD_END=$(date +%s.%N)
DOWNLOAD_TIME=$(echo "$DOWNLOAD_END - $DOWNLOAD_START" | bc)
DOWNLOAD_SPEED=$(echo "scale=2; 1024 / $DOWNLOAD_TIME" | bc)
echo "Download completed: ${DOWNLOAD_TIME}s (${DOWNLOAD_SPEED} MB/s)"

# Cleanup
rm -f $TEST_FILE /tmp/test-download.bin
mc rm $MINIO_ALIAS/$BUCKET/test-upload.bin

# System metrics during test
echo ""
echo "=== System Metrics ==="
echo "CPU Usage: $(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | awk -F'%' '{print $1}')%"
echo "Memory Usage: $(free | awk 'NR==2{printf "%.1f%%", $3*100/$2}')"
echo "Network Connections: $(netstat -an | grep :9000 | wc -l)"
echo "Disk I/O: $(iostat -x 1 1 | tail -n +4 | awk '{sum+=$10} END {printf "%.1f%%", sum/NR}')"
```

8. Performance Tuning Checklist
Server Configuration
- ✅ Increase API workers and request limits
- ✅ Optimize memory allocation
- ✅ Configure appropriate part sizes
- ✅ Disable compression for maximum throughput
- ✅ Use fast storage (NVMe SSDs)
Network Configuration
- ✅ Increase TCP buffer sizes
- ✅ Enable BBR congestion control
- ✅ Optimize network interface settings
- ✅ Use 10Gbps+ network connections
Client Configuration
- ✅ Use appropriate multipart thresholds
- ✅ Enable concurrent transfers
- ✅ Optimize part sizes for your network
- ✅ Use connection pooling
Storage Configuration
- ✅ Use XFS filesystem with optimized mount options
- ✅ Set I/O scheduler to `none` for NVMe
- ✅ Increase read-ahead buffers
- ✅ Use RAID 0 for maximum single-node throughput (distributed MinIO prefers direct per-drive mounts)
9. Expected Performance Targets
| File Size | Network | Expected Throughput | Optimization Focus |
|---|---|---|---|
| 100MB-1GB | 1Gbps | 100-120 MB/s | Part size, concurrency |
| 1-10GB | 10Gbps | 800-1200 MB/s | Network buffers, I/O |
| 10GB+ | 10Gbps+ | 1-5 GB/s | Storage, parallelism |
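As a rough sanity check on the table above, expected transfer time is simply file size divided by sustained throughput. A small Python helper (an illustrative sketch, not part of any MinIO API) makes the arithmetic explicit:

```python
def transfer_time_seconds(file_bytes: int, throughput_mb_s: float) -> float:
    """Estimate wall-clock transfer time at a given sustained throughput (MB/s)."""
    return (file_bytes / (1024 * 1024)) / throughput_mb_s

# A 10 GiB file at 1000 MB/s (a 10Gbps-class network, per the table above)
ten_gib = 10 * 1024 ** 3
print(f"{transfer_time_seconds(ten_gib, 1000):.1f}s")  # ~10.2s
```

If measured times are far above this estimate, one of the bottlenecks in section 10 is in play.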
10. Troubleshooting Performance Issues
Common Bottlenecks
- Small part sizes: Increase to 64-128MB for large files
- Network congestion: Monitor bandwidth utilization
- Storage latency: Check disk I/O metrics
- CPU limitations: Monitor CPU usage during transfers
- Memory pressure: Ensure adequate RAM for buffers
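Part size is the most common culprit: the S3 multipart API allows at most 10,000 parts per object, so the part size must grow with the file. A short sketch of the calculation (the helper name and defaults are illustrative):

```python
import math

S3_MAX_PARTS = 10_000           # S3 API hard limit on parts per object
MIN_PART_SIZE = 5 * 1024 ** 2   # 5 MiB S3 minimum part size

def choose_part_size(file_size: int, preferred: int = 128 * 1024 ** 2) -> int:
    """Pick a multipart part size: use the preferred size unless the file
    would need more than 10,000 parts, in which case grow the part size."""
    minimum = math.ceil(file_size / S3_MAX_PARTS)
    return max(MIN_PART_SIZE, preferred, minimum)

# A 5 TiB object cannot use 128 MiB parts - it would exceed 10,000 parts
five_tib = 5 * 1024 ** 4
print(choose_part_size(five_tib) * S3_MAX_PARTS >= five_tib)  # True
```

With 128 MiB parts the 10,000-part ceiling is reached at roughly 1.2 TiB; beyond that, the part size has to scale up with the object.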
Performance Debugging Commands
```bash
# Monitor network throughput
iftop -i eth0

# Monitor disk I/O
iotop -ao

# Monitor MinIO metrics
mc admin prometheus metrics myminio

# Trace S3 API calls (including multipart operations) in real time
mc admin trace --verbose --all myminio

# Monitor system performance
htop
vmstat 1
iostat -x 1
```

This comprehensive optimization guide will help you achieve maximum performance for large file transfers with MinIO, ensuring efficient utilization of your storage and network infrastructure.