What erasure coding stripe and replication configurations are supported by the system?

Asked by muratkars Answered by muratkars July 17, 2025

Understanding the supported erasure coding configurations is crucial for planning MinIO deployments, as it affects storage efficiency, fault tolerance, and performance characteristics.

This is particularly important for:

  • Capacity planning and storage efficiency
  • Fault tolerance requirements
  • Performance optimization
  • Hardware procurement decisions
  • Object size optimization

Answer

MinIO supports a wide range of erasure coding configurations with the following constraints:

Supported Configurations

General Rules:

  • K = data drives, M = parity drives
  • Any K+M combination where 2 ≤ K+M ≤ 16
  • Parity constraint: 0 ≤ M < K
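
The rules above can be written as a small check. This is only a sketch of the constraints as stated in this answer; the function name `is_valid_ec` is made up for illustration and is not part of any MinIO tool or API.

```python
def is_valid_ec(k: int, m: int) -> bool:
    """Check a K+M layout against the general rules listed in this answer.

    k: number of data drives, m: number of parity drives.
    Hypothetical helper, not a MinIO function.
    """
    total = k + m
    # Rule 1: stripe size between 2 and 16 drives.
    # Rule 2: parity must be non-negative and strictly smaller than data.
    return 2 <= total <= 16 and 0 <= m < k


if __name__ == "__main__":
    for k, m in [(12, 4), (5, 3), (4, 4), (14, 4)]:
        print(f"EC {k}+{m}: {'ok' if is_valid_ec(k, m) else 'rejected'}")
```

Note that the check rejects K = M layouts such as 4+4 because it applies the M < K rule literally, in line with the guideline to avoid them.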

Common Production Configurations

Most Popular:

  • EC 12+4 (75% usable capacity)

    • 12 data drives + 4 parity drives = 16 total
    • Can survive up to 4 concurrent drive failures
    • Excellent storage efficiency
  • EC 5+3 (62.5% usable capacity)

    • 5 data drives + 3 parity drives = 8 total
    • Can survive up to 3 concurrent drive/node failures
    • Good balance of efficiency and fault tolerance

Configuration Guidelines

Avoid K=M configurations (not recommended):

  • Examples: 4+4, 6+6, 8+8
  • Risk: after losing M drives, the surviving K = M drives are exactly half of the set rather than a majority, which can cause a split-brain situation
  • MinIO prefers to avoid this scenario for data safety
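
To see why K = M is awkward, it helps to count survivors. The sketch below uses a simplified strict-majority model, which is an assumption made for illustration and not MinIO's actual quorum logic; `survivors_form_majority` is a hypothetical name.

```python
def survivors_form_majority(k: int, m: int, failed: int) -> bool:
    """Return True if the drives still online are a strict majority of K+M.

    Simplified model for illustration only (assumption): real quorum
    rules in a storage system may differ from a plain majority vote.
    """
    total = k + m
    surviving = total - failed
    # Compare 2 * surviving with total to stay in integer arithmetic.
    return 2 * surviving > total


if __name__ == "__main__":
    # EC 4+4 after losing M = 4 drives: 4 of 8 remain, exactly half.
    print(survivors_form_majority(4, 4, failed=4))
    # EC 5+3 after losing M = 3 drives: 5 of 8 remain, a clear majority.
    print(survivors_form_majority(5, 3, failed=3))
```

With K > M, losing all M tolerated drives still leaves more than half of the set online, so the two halves of the set can never both claim to be authoritative.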

Object Size Considerations

Small Objects:

  • Higher K+M configurations have more metadata overhead per object
  • Each object requires metadata across all K+M drives
  • Consider lower K+M for workloads with many small files

Large Objects:

  • Benefit from higher K+M configurations due to better parallelism
  • Metadata overhead becomes negligible relative to object size
  • Better performance with more drives involved in I/O operations

Performance Impact:

  • More drives (higher K+M) = better throughput for large objects
  • Fewer drives (lower K+M) = lower latency for small objects
  • Network bandwidth utilization scales with drive count
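
A rough way to picture the small-object versus large-object trade-off is to look at how much data each drive receives per object. The sketch below assumes the object is divided evenly across the K data drives and ignores per-object metadata, internal block sizes, and any small-object optimizations; `shard_bytes` is a hypothetical helper.

```python
def shard_bytes(object_bytes: int, k: int) -> int:
    """Approximate bytes written to each drive for one object.

    Assumption: the object is split evenly into K data shards
    (rounded up); metadata and internal block sizes are ignored.
    """
    return (object_bytes + k - 1) // k  # integer ceiling division


if __name__ == "__main__":
    KIB = 1024
    GIB = 1024 ** 3
    for label, size in [("4 KiB", 4 * KIB), ("1 GiB", GIB)]:
        for k, m in [(4, 2), (12, 4)]:
            print(
                f"{label} object on EC {k}+{m}: "
                f"~{shard_bytes(size, k)} bytes per drive, "
                f"{k + m} drives touched"
            )
```

For a 4 KiB object, a 12+4 layout touches 16 drives to write only a few hundred bytes each, so fixed per-drive costs dominate. For a 1 GiB object the same layout spreads tens of megabytes to each drive, which is where the extra parallelism pays off.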

Examples of Valid Configurations

Configuration   Total Drives   Efficiency   Max Failures   Best For
EC 4+2          6              66.7%        2              Small objects, limited hardware
EC 6+2          8              75%          2              Mixed workloads
EC 8+3          11             72.7%        3              Large objects, high availability
EC 8+4          12             66.7%        4              Critical data, large objects
EC 12+4         16             75%          4              Large objects, maximum performance
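
The numeric columns in this table follow directly from K and M: total drives = K+M, efficiency = K / (K+M), and maximum failures = M. A minimal sketch to reproduce them (the helper name `describe` is illustrative only):

```python
def describe(k: int, m: int) -> tuple[int, float, int]:
    """Return (total drives, efficiency in percent, max drive failures)."""
    total = k + m
    efficiency = round(100 * k / total, 1)  # usable share of raw capacity
    return total, efficiency, m


if __name__ == "__main__":
    for k, m in [(4, 2), (6, 2), (8, 3), (8, 4), (12, 4)]:
        total, efficiency, failures = describe(k, m)
        print(
            f"EC {k}+{m}: {total} drives, "
            f"{efficiency}% usable, tolerates {failures} failures"
        )
```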

Selection Considerations

  1. Storage Efficiency: Higher K/M ratio = better efficiency (usable fraction = K / (K+M))
  2. Fault Tolerance: Higher M = more drive failures tolerated
  3. Performance: More drives generally = better performance for large objects
  4. Object Size Profile: Match configuration to typical object sizes
  5. Hardware Constraints: Must match available drive slots
  6. Network Topology: Consider rack/node distribution for node failures
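
The first, second, and fifth considerations can be combined into a simple filter: given the drive slots available, the minimum number of failures to tolerate, and a minimum efficiency, list the layouts that qualify. This is a planning sketch under the rules stated in this answer, not a MinIO sizing tool; `candidates` and its parameters are hypothetical.

```python
def candidates(
    max_drives: int, min_parity: int, min_efficiency: float
) -> list[tuple[int, int]]:
    """List (K, M) layouts that fit the constraints, most efficient first.

    Applies the rules from this answer: 2 <= K+M <= 16 and 0 <= M < K.
    Hypothetical helper for illustration.
    """
    found = []
    for total in range(2, min(max_drives, 16) + 1):
        for m in range(max(min_parity, 0), total):
            k = total - m
            if m < k and k / total >= min_efficiency:
                found.append((k, m))
    # Highest usable fraction first; ties broken by larger stripe.
    found.sort(key=lambda km: (km[0] / (km[0] + km[1]), km[0] + km[1]), reverse=True)
    return found


if __name__ == "__main__":
    # 8 drive slots, survive at least 2 failures, at least 70% usable.
    print(candidates(max_drives=8, min_parity=2, min_efficiency=0.70))
```

Object size profile and network topology are not captured by this filter and still need to be weighed by hand.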

Replication vs Erasure Coding

  • Replication: Simple 2x or 3x copies (50% or 33.3% efficiency)
  • Erasure Coding: Parity-based redundancy with higher efficiency (for example, 75% usable capacity at EC 12+4 while tolerating 4 drive failures)
  • MinIO recommends erasure coding for production deployments
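
To make the efficiency gap concrete, the sketch below computes how much raw capacity is needed to store a given amount of usable data under each scheme. The function names are illustrative and the arithmetic ignores metadata and filesystem overhead.

```python
def raw_needed_replication(usable_tb: float, copies: int) -> float:
    """Raw capacity needed when every object is stored `copies` times."""
    return usable_tb * copies


def raw_needed_ec(usable_tb: float, k: int, m: int) -> float:
    """Raw capacity needed with K data + M parity drives per stripe."""
    return usable_tb * (k + m) / k


if __name__ == "__main__":
    usable = 120  # TB of usable data to store (example figure)
    print("2x replication:", raw_needed_replication(usable, 2), "TB raw")
    print("3x replication:", raw_needed_replication(usable, 3), "TB raw")
    print("EC 12+4:       ", raw_needed_ec(usable, 12, 4), "TB raw")
```

For 120 TB of usable data, 3x replication needs 360 TB of raw capacity, while EC 12+4 needs 160 TB and still tolerates 4 drive failures.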