How does MinIO AIStor self-healing work internally?

Asked and answered by muratkars on January 4, 2026

Understanding MinIO AIStor’s self-healing process is essential for operators who need to ensure data integrity and manage recovery operations in distributed deployments.

Answer

MinIO implements background detection and repair of corrupted or missing data with persistent progress tracking. The self-healing system continuously monitors data integrity, detects various error conditions, and automatically reconstructs damaged or missing shards using erasure coding.


Detection

MinIO uses multiple detection mechanisms to identify data integrity issues.

Detection Functions

Function                      Purpose                               Scope
----------------------------  ------------------------------------  ----------------------
Heal Objects on Disk          Evaluates individual disk health      Per-disk scanning
Check Objects with All Parts  Validates metadata + parts integrity  Full object validation
Objects That Are Dangling     Identifies unrecoverable objects      Orphan cleanup
List Online Disks             Compares modTime across disks         Version consistency

Detection Flow

┌─────────────────────────────────────────────────────────┐
│ Detection Layer │
├─────────────────────────────────────────────────────────┤
│ │
│ Scanner Process │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ For each object in erasure set: │ │
│ │ │ │
│ │ 1. Read xl.meta from all disks │ │
│ │ 2. Compare modTime across disks │ │
│ │ 3. Validate metadata integrity │ │
│ │ 4. Check all parts exist │ │
│ │ 5. Verify checksums (deep mode) │ │
│ └─────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Error Detected? ──► Queue for Healing │
│ │
└─────────────────────────────────────────────────────────┘
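The per-object detection loop above can be sketched in a few lines of Python. This is illustrative only: MinIO itself is written in Go, and `XlMeta`, `read_xl_meta`, and `parts_exist` are hypothetical stand-ins, not real MinIO identifiers.

```python
from dataclasses import dataclass

@dataclass
class XlMeta:
    mod_time: float   # version modification time from xl.meta
    parts: list       # expected part identifiers

def detect(disks, read_xl_meta, parts_exist):
    """Return the set of disks whose copy of the object needs healing."""
    metas, to_heal = {}, set()
    for d in disks:
        try:
            metas[d] = read_xl_meta(d)            # step 1: read xl.meta
        except FileNotFoundError:
            to_heal.add(d)                        # missing metadata -> heal
    if not metas:
        return set(disks)                         # nothing readable anywhere
    latest = max(m.mod_time for m in metas.values())
    for d, m in metas.items():
        if m.mod_time < latest:                   # step 2: stale version
            to_heal.add(d)
        elif not parts_exist(d, m.parts):         # step 4: parts missing
            to_heal.add(d)
    return to_heal                                # queued for healing
```

The returned set corresponds to the "Queue for Healing" branch of the diagram.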

Error Types

The healing system detects and handles various error conditions.

Error Categories

Error Type                 Description               Detection Method
-------------------------  ------------------------  ------------------------
File Not Found[1]          Missing xl.meta metadata  Disk read failure
File Version Not Found[1]  Missing specific version  Version lookup failure
File Corrupt[1]            Corrupted xl.meta         Metadata parsing failure
Part Missing               Missing data part         Part enumeration
Part Corrupt               Corrupted data part       Checksum mismatch

Error Detection Matrix

┌─────────────────────────────────────────────────────────┐
│ Error Detection │
├─────────────────────────────────────────────────────────┤
│ │
│ xl.meta Check │
│ │ │
│ ├── Not Found ────────► File Not Found │
│ ├── Parse Error ──────► File Corrupt │
│ └── Version Missing ──► File Version Not Found │
│ │
│ Parts Check │
│ │ │
│ ├── Part Missing ─────► Part Missing │
│ └── Checksum Fail ────► Part Corrupt │
│ │
└─────────────────────────────────────────────────────────┘

Scan Modes

MinIO supports multiple scanning modes for different use cases.

Mode                 Description                   Use Case                  Performance Impact
-------------------  ----------------------------  ------------------------  ------------------
Normal Mode[2]       Regular metadata scanning     Continuous background     Low
Deep Mode[2]         Full checksum validation      Periodic integrity audit  High
Uncommitted Scan[2]  Fast dangling data detection  Quick cleanup             Medium

Mode Comparison

Normal Mode (Default)
├── Reads xl.meta only
├── Compares versions across disks
├── Fast, low I/O impact
└── Detects: Missing files, version mismatches

Deep Mode (Thorough)
├── Reads xl.meta AND all parts
├── Validates every checksum
├── High I/O, CPU usage
└── Detects: Bitrot, silent corruption

Uncommitted Scan (Cleanup)
├── Scans for orphaned temp files
├── Identifies incomplete uploads
├── Medium I/O impact
└── Detects: Dangling data, failed writes
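The practical difference between normal and deep mode is which checks run per part. A minimal Python sketch, with SHA-256 standing in for MinIO's actual bitrot hash (HighwayHash); `scan`, `part_data`, and `expected_sha` are illustrative names, not MinIO's API:

```python
import hashlib

def scan(part_data, expected_sha, deep=False):
    """Normal mode checks part presence; deep mode also verifies every checksum."""
    issues = []
    for name, want in expected_sha.items():
        if name not in part_data:
            issues.append((name, "part-missing"))
        elif deep and hashlib.sha256(part_data[name]).hexdigest() != want:
            issues.append((name, "part-corrupt"))   # bitrot / silent corruption
    return issues
```

Normal mode never reads part contents, which is why its I/O impact stays low; deep mode hashes every byte, which is why it is reserved for periodic audits.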

Healing Process

When issues are detected, MinIO executes a structured repair process.

Healing Steps

Step 1: Read xl.meta from all disks
Step 2: Determine read quorum
Step 3: Select latest metadata
Step 4: Check parts integrity
Step 5: Reconstruct missing shards
Step 6: Write to outdated disks
Step 7: Atomic commit to disks

Detailed Healing Flow

┌─────────────────────────────────────────────────────────┐
│ Healing Process │
├─────────────────────────────────────────────────────────┤
│ │
│ 1. READ METADATA │
│ └── Fetch xl.meta from all disks in erasure set │
│ │
│ 2. QUORUM CHECK │
│ └── Verify read quorum available │
│ └── If < quorum → Mark unrecoverable │
│ │
│ 3. SELECT LATEST │
│ └── Compare modTime across valid copies │
│ └── Choose most recent as authoritative │
│ │
│ 4. INTEGRITY CHECK │
│ └── Validate all parts against metadata │
│ └── Identify missing/corrupt parts │
│ │
│ 5. RECONSTRUCT │
│ └── Read available shards (data + parity) │
│ └── Reed-Solomon decode missing shards │
│ │
│ 6. WRITE REPAIRS │
│ └── Write reconstructed shards to affected disks │
│ └── Update xl.meta on repaired disks │
│ │
│ 7. ATOMIC COMMIT │
│ └── Rename temp files to final location │
│ └── Ensure all-or-nothing semantics │
│ │
└─────────────────────────────────────────────────────────┘
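Steps 2 and 3 (quorum check and latest selection) reduce to a small amount of logic. An illustrative Python sketch; `select_latest` and its shape are assumptions for exposition, not MinIO's real API:

```python
def select_latest(metas, read_quorum):
    """Steps 2-3: enforce read quorum, then pick the newest copy as authoritative.

    `metas` maps disk id -> modTime for every disk whose xl.meta was readable.
    Returns (authoritative_mod_time, outdated_disks), or None if unrecoverable.
    """
    if len(metas) < read_quorum:
        return None                          # below read quorum: mark unrecoverable
    latest = max(metas.values())
    outdated = {d for d, t in metas.items() if t < latest}
    return latest, outdated                  # outdated disks receive repairs (steps 5-6)
```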

Reconstruction Requirements

Scenario     Available Shards          Outcome
-----------  ------------------------  -------------------------------
Full health  All N shards              No healing needed
Degraded     ≥ D shards (read quorum)  Reconstruction possible
Critical     < D shards                Unrecoverable, logged as failed
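The table above follows directly from erasure-coding arithmetic: with N = D + P shards, any D surviving shards suffice to rebuild the rest. A single-parity (P = 1) XOR example makes this concrete; MinIO's actual codec is Reed-Solomon, which generalizes the same idea to P > 1:

```python
from functools import reduce

def xor_parity(shards):
    """XOR each byte column: the P = 1 special case of erasure-coded parity."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*shards))

# D = 3 data shards, P = 1 parity shard (N = 4); any D shards rebuild the rest.
data = [b"aa", b"bb", b"cc"]
parity = xor_parity(data)

# Lose data[1], then reconstruct it from the D = 3 surviving shards.
recovered = xor_parity([data[0], data[2], parity])
```

Lose two of the four shards here (fewer than D survive) and no amount of XOR recovers the data, which is exactly the "Critical" row above.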

Fresh Drive Healing

When a new or replacement drive is added, MinIO initiates a full drive healing process.

Progress Tracking Files[3]

File                           Purpose                            Location
-----------------------------  ---------------------------------  ----------
.healing.bin                   State file tracking progress       Drive root
.healing.failed-list.json.zst  Compressed list of failed objects  Drive root

Drive Healing Flow

New/Replacement Drive Detected
              │
              ▼
┌─────────────────────────────────────────────────────────┐
│ Initialize Healing │
│ └── Create .healing.bin state file │
└─────────────────────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────┐
│ Scan Erasure Set │
│ └── Enumerate all objects that should exist on drive │
└─────────────────────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────┐
│ For Each Object: │
│ ├── Check if shard exists on new drive │
│ ├── If missing → Reconstruct from peers │
│ ├── If corrupt → Replace with reconstructed │
│ └── Update progress in .healing.bin │
└─────────────────────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────┐
│ Retry Failed Objects │
│ └── Up to 5 attempts per object │
│ └── Failures logged to .healing.failed-list.json.zst │
└─────────────────────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────┐
│ Complete │
│ └── Remove .healing.bin (success) │
│ └── Keep failed list for investigation │
└─────────────────────────────────────────────────────────┘
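The role of `.healing.bin` (resumable, atomically updated progress) can be modeled as follows. This is an illustrative sketch only: the real file is a compact binary format, not JSON, and `HealingTracker` is a hypothetical name:

```python
import json, os

class HealingTracker:
    """Sketch of persistent healing progress, loosely modeled on .healing.bin."""

    def __init__(self, path):
        self.path = path
        self.state = {"objects_scanned": 0, "objects_healed": 0, "failed": []}
        if os.path.exists(path):              # resume after a restart
            with open(path) as f:
                self.state = json.load(f)

    def update(self, scanned=0, healed=0, failed=None):
        self.state["objects_scanned"] += scanned
        self.state["objects_healed"] += healed
        if failed:
            self.state["failed"].append(failed)
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:             # write-then-rename = atomic update
            json.dump(self.state, f)
        os.replace(tmp, self.path)

    def finish(self):
        os.remove(self.path)                  # success: drop the state file
```

The write-then-rename pattern mirrors how MinIO keeps progress consistent even if the node crashes mid-update, which is what lets drive healing survive restarts.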

Retry Configuration

Parameter             Value                   Description
--------------------  ----------------------  -----------------------
Max Retries           5 attempts              Per-object retry limit
Failed List Format    JSON (zstd compressed)  Space-efficient storage
Progress Persistence  Continuous              Survives restarts
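The retry behavior above amounts to a small loop. `heal_with_retries` and `heal_once` are hypothetical names; in MinIO the surviving failures are persisted to `.healing.failed-list.json.zst` rather than returned:

```python
def heal_with_retries(objects, heal_once, max_retries=5):
    """Retry each object up to max_retries times; return the list that never healed."""
    failed = []
    for obj in objects:
        # any() stops at the first successful attempt
        if not any(heal_once(obj) for _ in range(max_retries)):
            failed.append(obj)
    return failed
```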

Background Healing Triggers

Healing can be triggered by multiple events:

Trigger            Description                   Priority
-----------------  ----------------------------  ----------------
On-Read Detection  Corruption found during read  High (immediate)
Scanner Cycle      Periodic background scan      Normal
Drive Replacement  New drive added to pool       High
Admin Command      Manual healing request        Configurable
MRF Queue          Failed write retry            Normal

Monitoring Healing

Key Metrics

Metric                  Description                      Alert Threshold
----------------------  -------------------------------  --------------------
Healing Rate            Objects healed per second        Below expected rate
Failed Objects          Objects that couldn’t be healed  > 0
Queue Depth             Objects pending healing          Growing continuously
Drive Healing Progress  Percentage complete              Stalled progress
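A stalled-progress alert, for example, can be derived from successive samples of the healed-object counter. This is an illustrative sketch, not a MinIO API:

```python
def healing_stalled(healed_counts, window=3):
    """Flag a stalled heal: the healed-object counter unchanged for `window` samples."""
    recent = healed_counts[-window:]
    return len(recent) == window and len(set(recent)) == 1
```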

Health Check Commands

# Check healing status
mc admin heal ALIAS --dry-run
# Start manual healing
mc admin heal ALIAS/bucket --recursive
# View healing progress
mc admin heal ALIAS --verbose

Best Practices

  1. Monitor failed lists: Investigate .healing.failed-list.json.zst for unrecoverable objects
  2. Schedule deep scans: Run periodic deep mode scans during low-traffic periods
  3. Replace failed drives promptly: Minimize time in degraded state
  4. Size for healing bandwidth: Ensure network can handle reconstruction I/O
  5. Track healing metrics: Alert on stalled or slow healing progress

Source Code References
  1. cmd/storage-errors.go:70,75,108 - Error definitions: errFileNotFound, errFileVersionNotFound, errFileCorrupt
  2. cmd/data-scanner.go:888-890 - Scan modes: HealNormalScan, HealDeepScan based on bitrot detection
  3. cmd/background-newdisks-heal-ops.go:42-43 - healingTrackerFilename = ".healing.bin", healingTrackerFailedList = ".healing.failed-list.json.zst"