What does MinIO consider as a small object and how are small objects managed?

Asked by muratkars Answered by muratkars July 17, 2025

Understanding how MinIO handles small objects is critical for optimizing performance in workloads with many small files, such as IoT data, logs, or message queues.

This answer covers the following performance considerations:

  • Definition of small objects in MinIO
  • Special optimizations for small object storage
  • Performance implications for small object workloads
  • Why MinIO excels at small object performance

Answer

Small Object Definition

For all practical purposes, MinIO treats any object smaller than 1MiB as a small object.

Within that range, there is a further optimization threshold at 128KiB:

  • Objects < 128KiB receive special inline storage treatment
  • Objects ≥ 128KiB but < 1MiB are still considered small but stored normally
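
The thresholds above can be expressed as a small classifier. This is an illustrative sketch only: the function name and tier labels are hypothetical and not part of any MinIO API, and the constants simply restate the sizes quoted in this answer.

```python
KIB = 1024
MIB = 1024 * KIB

# Thresholds as quoted in this answer (not read from MinIO itself).
INLINE_LIMIT = 128 * KIB   # objects below this size are stored inline
SMALL_LIMIT = 1 * MIB      # objects below this size are considered small


def classify_object(size_bytes: int) -> str:
    """Return the tier this answer assigns to an object of the given size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes < INLINE_LIMIT:
        return "small-inline"
    if size_bytes < SMALL_LIMIT:
        return "small"
    return "standard"


if __name__ == "__main__":
    for size in (512, 64 * KIB, 128 * KIB, 512 * KIB, 1 * MIB):
        print(f"{size:>8} bytes -> {classify_object(size)}")
```

Note that both boundaries are exclusive on the lower tier: an object of exactly 128KiB is small but not inline, and an object of exactly 1MiB is no longer small.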

Special Optimization: Inline Storage

For objects smaller than 128KiB, MinIO implements a highly optimized storage strategy:

Inline Metadata Storage:

  • The object body is stored inside the metadata structure
  • No separate data file is created
  • Eliminates an extra disk seek operation

Same-Disk Locality:

  • Both metadata and data are kept on the same disk
  • Eliminates double-hop latency
  • Reduces network round trips in distributed setups

Architecture Advantages

Key Design Decisions:

  1. No standalone metadata database - metadata is distributed with data
  2. Inline layout for small objects - single read operation
  3. Locality optimization - minimizes disk seeks and network hops

Performance Impact

This optimization enables exceptional small object performance:

Real-World Benchmark:

  • > 400 kOps/s for 64 KiB GETs
  • Tested on a 16-node cluster
  • Demonstrates linear scalability
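
To put the benchmark figure in perspective, the aggregate data rate it implies can be worked out directly. The sketch below assumes that "kOps/s" means thousands of operations per second and that load is spread evenly across the 16 nodes; both are assumptions, not details stated in the benchmark.

```python
KIB = 1024

ops_per_sec = 400_000     # lower bound quoted above (> 400 kOps/s)
object_size = 64 * KIB    # 64 KiB GETs
nodes = 16                # assumed to share the load evenly

bytes_per_sec = ops_per_sec * object_size
aggregate_gbit = bytes_per_sec * 8 / 1e9
per_node_gbit = aggregate_gbit / nodes

print(f"aggregate: {bytes_per_sec / 1e9:.1f} GB/s ({aggregate_gbit:.1f} Gbit/s)")
print(f"per node:  {per_node_gbit:.1f} Gbit/s")
# aggregate: 26.2 GB/s (209.7 Gbit/s)
# per node:  13.1 Gbit/s
```

Under these assumptions each node serves roughly 13 Gbit/s of payload, which suggests that network capacity can matter as much as drive performance at this operation rate.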

How It Works

Traditional Object Storage (2 operations):

1. Read metadata → get object location
2. Read object data from separate location
Result: 2 disk seeks, potential network hop

MinIO Small Object (1 operation):

1. Read metadata → object data included inline
Result: 1 disk seek, no network hop
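
The difference between the two read paths can be illustrated with a toy in-memory model. This is a sketch under stated assumptions, not MinIO's on-disk format: the class names are hypothetical, plain dictionaries stand in for drives, and the `reads` counter stands in for disk seeks.

```python
class TwoStepStore:
    """Toy model: metadata and data live in separate places (two reads)."""

    def __init__(self):
        self.metadata = {}   # key -> record pointing at the data location
        self.data = {}       # location -> object body
        self.reads = 0

    def put(self, key, body):
        location = f"data/{key}"
        self.metadata[key] = {"size": len(body), "location": location}
        self.data[location] = body

    def get(self, key):
        self.reads += 1                      # read 1: look up metadata
        meta = self.metadata[key]
        self.reads += 1                      # read 2: fetch body elsewhere
        return self.data[meta["location"]]


class InlineStore:
    """Toy model: the object body is embedded in the metadata record."""

    def __init__(self):
        self.metadata = {}   # key -> record that carries the body itself
        self.reads = 0

    def put(self, key, body):
        self.metadata[key] = {"size": len(body), "inline_data": body}

    def get(self, key):
        self.reads += 1                      # single read returns everything
        return self.metadata[key]["inline_data"]


if __name__ == "__main__":
    payload = b'{"sensor": 7, "temp": 21.5}'
    for store in (TwoStepStore(), InlineStore()):
        store.put("reading-1", payload)
        assert store.get("reading-1") == payload
        print(type(store).__name__, "reads per GET:", store.reads)
```

Both stores return the same bytes; the inline variant simply does so with one lookup instead of two.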

Performance Comparison

Object Size         Storage Method            Disk Seeks   Latency Impact
< 128KiB            Inline in metadata        1            Minimal
128KiB to < 1MiB    Separate data file        2            Low
≥ 1MiB              Standard erasure coding   2+           Standard

Benefits for Common Use Cases

1. IoT Data Ingestion:

  • Sensor readings often < 1KB
  • Millions of small messages
  • Benefits from inline storage

2. Log Aggregation:

  • Individual log entries < 128KiB
  • High write throughput required
  • Single-seek reads for analysis

3. Message Queue Backends:

  • Small message payloads
  • High operation rate requirements
  • Low latency critical

Best Practices for Small Objects

  1. Batch When Possible:

    • Although MinIO handles small objects well, batching can still improve efficiency
    • Consider aggregating very small objects (< 1KB) when feasible
  2. Monitor Metadata Usage:

    • Small objects increase metadata overhead
    • Plan storage capacity accordingly
    • Monitor usage on the data drives, since metadata is stored alongside the data rather than in a separate metadata database
  3. Optimize Erasure Coding:

    • Consider lower K+M values for small object workloads
    • Reduces metadata overhead per object
    • May improve latency
  4. Network Considerations:

    • Small objects may be network-bound rather than disk-bound
    • Ensure adequate network bandwidth
    • Consider network topology optimization
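
As one way to apply the batching advice, many tiny payloads can be packed into a single archive before upload. The sketch below uses only the Python standard library; the function names and record names are hypothetical, and the upload step itself is left out because it depends on the S3 client in use.

```python
import io
import tarfile


def pack_records(records: dict[str, bytes]) -> bytes:
    """Pack many small payloads into one in-memory tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, payload in records.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def unpack_records(blob: bytes) -> dict[str, bytes]:
    """Recover the individual payloads from an archive built by pack_records."""
    records = {}
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as archive:
        for member in archive.getmembers():
            extracted = archive.extractfile(member)
            if extracted is not None:
                records[member.name] = extracted.read()
    return records


if __name__ == "__main__":
    readings = {f"sensor-{i}.json": b'{"temp": 21.5}' for i in range(1000)}
    blob = pack_records(readings)
    print(len(readings), "records packed into one object of", len(blob), "bytes")
    assert unpack_records(blob) == readings
```

The trade-off is that tar adds a 512-byte header per entry, so the archive is larger than the raw payloads; what batching saves is the number of objects and requests, not bytes. A more compact container format may be a better fit where storage efficiency matters.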

Why This Matters

Traditional Challenges with Small Objects:

  • High metadata overhead
  • Poor performance due to multiple seeks
  • Inefficient storage utilization

MinIO’s Solution:

  • Inline storage eliminates extra seeks
  • No separate metadata database bottleneck
  • Maintains high performance at scale

Performance Scaling

The inline storage optimization scales close to linearly with node count:

  • 1 node: ~25 kOps/s
  • 16 nodes: > 400 kOps/s
  • Near-linear scaling for small object workloads
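
The scaling claim can be checked with simple arithmetic. The figures are approximate ("~25" and "> 400"), so the result below is a rough estimate rather than a measured efficiency; the helper name is hypothetical.

```python
def scaling_efficiency(single_node_ops: float, cluster_ops: float, nodes: int) -> float:
    """Ratio of observed cluster throughput to ideal linear scaling."""
    ideal = single_node_ops * nodes
    return cluster_ops / ideal


# Figures quoted above, taken at face value.
efficiency = scaling_efficiency(25_000, 400_000, 16)
print(f"ideal: {25_000 * 16} ops/s, efficiency: {efficiency:.0%}")
# ideal: 400000 ops/s, efficiency: 100%
```

An efficiency near 100% means that sixteen nodes deliver about sixteen times the single-node rate.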

This makes MinIO particularly well-suited for modern workloads that generate large numbers of small objects, providing both the performance and scalability needed for demanding applications.
