Understanding how MinIO handles small objects is critical for optimizing performance in workloads with many small files, such as IoT data, logs, or message queues.
This topic covers the following performance considerations:
- Definition of small objects in MinIO
- Special optimizations for small object storage
- Performance implications for small object workloads
- Why MinIO excels at small object performance
Answer
Small Object Definition
MinIO treats any object smaller than 1MiB as a small object.
However, there’s a special optimization threshold:
- Objects < 128KiB receive special inline storage treatment
- Objects ≥ 128KiB but < 1MiB still count as small, but their data is written to a separate data file
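The two thresholds above can be expressed as a small helper. This is only an illustrative sketch: the function name `classify_object` and the class labels are made up for this example and are not part of any MinIO API.

```python
KIB = 1024
MIB = 1024 * KIB

INLINE_THRESHOLD = 128 * KIB  # below this: body stored inline with metadata
SMALL_THRESHOLD = 1 * MIB     # below this: still counted as a small object


def classify_object(size_bytes: int) -> str:
    """Return a (hypothetical) size class label for an object of this size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes < INLINE_THRESHOLD:
        return "inline"
    if size_bytes < SMALL_THRESHOLD:
        return "small"
    return "standard"


if __name__ == "__main__":
    for size in (512, 64 * KIB, 128 * KIB, 512 * KIB, 1 * MIB):
        print(f"{size:>8} bytes -> {classify_object(size)}")
```

Note that the boundaries are exclusive on the lower class: an object of exactly 128KiB is no longer inline, and an object of exactly 1MiB is no longer small.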
Special Optimization: Inline Storage
For objects smaller than 128KiB, MinIO implements a highly optimized storage strategy:
Inline Metadata Storage:
- The object body is stored inside the metadata structure
- No separate data file is created
- Eliminates an extra disk seek operation
Same-Disk Locality:
- Both metadata and data are kept on the same disk
- Eliminates double-hop latency
- Reduces network round trips in distributed setups
Architecture Advantages
Key Design Decisions:
- No standalone metadata database - metadata is distributed with data
- Inline layout for small objects - single read operation
- Locality optimization - minimizes disk seeks and network hops
Performance Impact
This optimization enables exceptional small object performance:
Real-World Benchmark:
- > 400 kOps/s for 64 KiB GETs
- Tested on a 16-node cluster
- Consistent with near-linear scaling across nodes
How It Works
Traditional Object Storage (2 operations):
1. Read metadata → get object location
2. Read object data from separate location
Result: 2 disk seeks, potential network hop
MinIO Small Object (1 operation):
1. Read metadata → object data included inline
Result: 1 disk seek, no network hop
Performance Comparison
| Object Size | Storage Method | Disk Seeks | Latency Impact |
|---|---|---|---|
| < 128KiB | Inline in metadata | 1 | Minimal |
| 128KiB - 1MiB | Separate data file | 2 | Low |
| ≥ 1MiB | Standard erasure coding | 2+ | Standard |
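The Disk Seeks column can be illustrated with a toy in-memory model. This is a sketch under stated assumptions, not MinIO's on-disk format: `ToyStore`, `MetaRecord`, and the `name/part` key are hypothetical names, and a dictionary lookup stands in for a disk read.

```python
from dataclasses import dataclass
from typing import Dict, Optional

INLINE_THRESHOLD = 128 * 1024  # bytes; threshold taken from the text above


@dataclass
class MetaRecord:
    size: int
    inline_data: Optional[bytes] = None  # set only for inline objects
    data_key: Optional[str] = None       # set only when data is stored separately


class ToyStore:
    """In-memory stand-in for a drive; counts reads as a proxy for disk seeks."""

    def __init__(self) -> None:
        self.meta: Dict[str, MetaRecord] = {}
        self.data: Dict[str, bytes] = {}
        self.reads = 0

    def put(self, name: str, body: bytes) -> None:
        if len(body) < INLINE_THRESHOLD:
            # Small object: the body travels inside the metadata record.
            self.meta[name] = MetaRecord(size=len(body), inline_data=body)
        else:
            # Larger object: metadata only points at a separate data entry.
            key = f"{name}/part"
            self.data[key] = body
            self.meta[name] = MetaRecord(size=len(body), data_key=key)

    def get(self, name: str) -> bytes:
        self.reads += 1  # read 1: metadata
        record = self.meta[name]
        if record.inline_data is not None:
            return record.inline_data  # body arrived with the metadata
        self.reads += 1  # read 2: separate data entry
        return self.data[record.data_key]


if __name__ == "__main__":
    store = ToyStore()
    store.put("sensor-reading", b"x" * 1024)
    store.put("larger-blob", b"x" * (256 * 1024))
    for name in ("sensor-reading", "larger-blob"):
        before = store.reads
        store.get(name)
        print(f"{name}: {store.reads - before} read(s)")
```

In this model the 1 KiB object is served with one read and the 256 KiB object with two, which mirrors the first two rows of the table.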
Benefits for Common Use Cases
1. IoT Data Ingestion:
- Sensor readings often < 1KB
- Millions of small messages
- Benefits from inline storage
2. Log Aggregation:
- Individual log entries < 128KiB
- High write throughput required
- Single-seek reads for analysis
3. Message Queue Backends:
- Small message payloads
- High operation rate requirements
- Low latency critical
Best Practices for Small Objects
Batch When Possible:
- Although MinIO handles small objects well, batching can still improve efficiency
- Consider aggregating very small objects (< 1KB) when feasible
Monitor Metadata Usage:
- Small objects increase metadata overhead
- Plan storage capacity accordingly
- Monitor drive usage, since metadata is stored on the same drives as the data
Optimize Erasure Coding:
- Consider lower K+M values (fewer data shards K plus parity shards M) for small object workloads
- Reduces metadata overhead per object
- May improve latency
Network Considerations:
- Small objects may be network-bound rather than disk-bound
- Ensure adequate network bandwidth
- Consider network topology optimization
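The batching advice above can be sketched as follows. This is a minimal example, assuming the records are JSON-serializable dictionaries and that newline-delimited JSON is an acceptable batch format; the function name `batch_records` and the 64 KiB default are choices made for this illustration, and the upload step is left to whichever S3 client the application already uses.

```python
import json
from typing import Iterable, Iterator, List


def batch_records(records: Iterable[dict], max_bytes: int = 64 * 1024) -> Iterator[bytes]:
    """Group records into newline-delimited JSON payloads of at most max_bytes.

    A single record that is larger than max_bytes is emitted on its own.
    """
    lines: List[bytes] = []
    size = 0
    for record in records:
        line = json.dumps(record, separators=(",", ":")).encode("utf-8") + b"\n"
        # Flush the current batch if adding this line would exceed the limit.
        if lines and size + len(line) > max_bytes:
            yield b"".join(lines)
            lines, size = [], 0
        lines.append(line)
        size += len(line)
    if lines:
        yield b"".join(lines)


if __name__ == "__main__":
    readings = ({"sensor": i % 10, "value": i} for i in range(10_000))
    for n, payload in enumerate(batch_records(readings)):
        # Each payload would be uploaded as one object instead of many tiny ones.
        print(f"batch {n}: {len(payload)} bytes")
```

Keeping the batch limit below 128KiB means each batch still falls under the inline storage threshold described earlier.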
Why This Matters
Traditional Challenges with Small Objects:
- High metadata overhead
- Poor performance due to multiple seeks
- Inefficient storage utilization
MinIO’s Solution:
- Inline storage eliminates extra seeks
- No separate metadata database bottleneck
- Maintains high performance at scale
Performance Scaling
Throughput for inline-stored objects scales close to linearly with node count:
- 1 node: ~25 kOps/s
- 16 nodes: > 400 kOps/s
- Near-linear scaling for small object workloads
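The scaling claim can be checked with simple arithmetic on the two figures quoted above. The helper name `scaling_efficiency` is introduced here for illustration; because the source gives the figures as "~25 kOps/s" and "> 400 kOps/s", the result is an approximation rather than a measured value.

```python
def scaling_efficiency(single_node_ops: float, cluster_ops: float, nodes: int) -> float:
    """Ratio of observed cluster throughput to ideal linear scaling (1.0 = linear)."""
    if nodes < 1 or single_node_ops <= 0:
        raise ValueError("nodes must be >= 1 and single_node_ops must be positive")
    return cluster_ops / (single_node_ops * nodes)


if __name__ == "__main__":
    # Figures from the text: ~25 kOps/s on 1 node, > 400 kOps/s on 16 nodes.
    efficiency = scaling_efficiency(25_000, 400_000, 16)
    print(f"{efficiency:.2f}")  # prints 1.00
```

A value of 1.00 means the 16-node figure matches 16 times the single-node figure, which is what linear scaling predicts.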
This makes MinIO particularly well-suited for modern workloads that generate large numbers of small objects, providing both the performance and scalability needed for demanding applications.