How do I change the FQDN or IP addresses of an AIStor cluster?

Asked by muratkars · Answered by muratkars · January 27, 2026

Changing the FQDN (Fully Qualified Domain Name) or IP addresses of an AIStor cluster is a significant operational change that requires careful planning and coordinated execution across all nodes. This guide covers the considerations, procedures, and best practices for network identity changes.

Answer

Critical Understanding

MinIO does not support live hostname/IP changes. The cluster identity is established at deployment time through the MINIO_VOLUMES configuration. Changing network identity requires:

  1. Coordinated cluster-wide restart - All nodes must be updated simultaneously
  2. Configuration consistency - All nodes must have identical endpoint configurations
  3. DNS propagation - New hostnames must resolve correctly before restart
  4. Certificate updates - TLS certificates must include new hostnames

Important: This is a planned maintenance operation. While no data migration occurs, the cluster will be temporarily unavailable during the transition.


When You Might Need This

| Scenario | Bare Metal Approach | Kubernetes Approach |
|---|---|---|
| Datacenter migration | Full cluster restart with new FQDNs | Update DNS/Ingress, operator handles certs |
| IP address changes | Update DNS or endpoint config + restart | Update Service type or external LB |
| Domain rename | Update all endpoint hostnames + restart | Update spec.certificates.config.dnsNames |
| Load balancer changes | Update MINIO_SERVER_URL only | Update spec.services.minio.serviceType |
| Adding/removing nodes | Use pool expansion/decommissioning | Update pool spec (operator handles) |

Note: Kubernetes deployments benefit from automatic DNS, certificate management, and rolling updates. Most FQDN changes only require CRD updates - the operator handles the rest.


Key Configuration Elements

MINIO_VOLUMES (Endpoint Definition)

The primary configuration that defines cluster topology:

Terminal window
# Example 4-node cluster
MINIO_VOLUMES="https://minio{1...4}.old-domain.com:9000/mnt/disk{1...4}"
# After FQDN change
MINIO_VOLUMES="https://minio{1...4}.new-domain.com:9000/mnt/disk{1...4}"
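The `{1...4}` ellipsis notation expands into one endpoint per host/drive combination. As a rough illustration of how that expansion multiplies out (a hypothetical helper for clarity only, not MinIO's actual parser):

```python
import re

def expand_ellipsis(template: str) -> list[str]:
    """Expand MinIO-style {a...b} ranges in an endpoint template.

    Illustrative sketch only; MinIO performs this expansion internally
    when it parses MINIO_VOLUMES.
    """
    m = re.search(r"\{(\d+)\.\.\.(\d+)\}", template)
    if not m:
        return [template]
    lo, hi = int(m.group(1)), int(m.group(2))
    results = []
    for i in range(lo, hi + 1):
        expanded = template[:m.start()] + str(i) + template[m.end():]
        # Recurse to expand any remaining ranges (e.g. the disk range)
        results.extend(expand_ellipsis(expanded))
    return results

endpoints = expand_ellipsis(
    "https://minio{1...4}.new-domain.com:9000/mnt/disk{1...4}"
)
print(len(endpoints))   # 4 hosts x 4 disks = 16 endpoint/drive pairs
print(endpoints[0])     # https://minio1.new-domain.com:9000/mnt/disk1
```

Because every node parses the same template, a one-character difference in this string on any node is enough to fail bootstrap verification.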

MINIO_SERVER_URL (Public Endpoint)

The externally-accessible URL for the cluster:

Terminal window
# Used by clients and for presigned URL generation
MINIO_SERVER_URL="https://s3.new-domain.com:9000"

DNS Cache Considerations

MinIO caches DNS lookups with configurable TTL:[1]

| Environment | Default TTL |
|---|---|
| Standard deployments | 10 minutes |
| Kubernetes/orchestrated | 30 seconds |

Terminal window
# Override DNS cache TTL if needed
MINIO_DNS_CACHE_TTL=5m

The environment variable is defined in the MinIO source.[2]


Pre-Change Checklist

Before making any changes, complete these verification steps:

  • Document current configuration

    • Export current MINIO_VOLUMES from all nodes
    • Record current DNS entries and IP addresses
    • Backup TLS certificates and configuration files
  • Verify new DNS records

    • Create new DNS A/AAAA records pointing to correct IPs
    • Verify DNS propagation from multiple locations
    • Test resolution: dig +short new-hostname.domain.com
  • Prepare TLS certificates

    • Generate new certificates with updated SANs (Subject Alternative Names)
    • Include both old and new hostnames during transition (optional)
    • Verify certificate chain validity
  • Plan maintenance window

    • Schedule downtime (cluster will be unavailable during restart)
    • Notify dependent applications and users
    • Prepare rollback script BEFORE starting (see Rollback Procedure section)
    • Backup all config files and certificates on every node
  • Verify cluster health

    • Ensure no drives or nodes are offline
    • Complete any pending healing operations
    • Check replication status (if using site replication)

Step-by-Step Procedure

Phase 1: Preparation (Before Maintenance Window)

1.1 Verify Current Cluster State

Terminal window
# Check cluster health
mc admin info mycluster
# Verify all drives online
mc admin info mycluster --json | jq '.info.servers[].drives[] | select(.state != "ok")'
# Check healing status
mc admin heal mycluster --json | jq '.summary'
# Analyze maintenance safety (identifies quorum constraints)
mc admin maintenance hosts mycluster/
# Creates dated folder with safe maintenance groups

Note: The mc admin maintenance hosts command analyzes your cluster topology and identifies which nodes share erasure sets. For FQDN/IP changes, all nodes must be restarted together, but this tool helps verify cluster health before maintenance.

1.2 Create New DNS Records

Terminal window
# Verify new DNS entries resolve correctly
for i in {1..4}; do
  echo "minio${i}.new-domain.com:"
  dig +short minio${i}.new-domain.com
done

1.3 Prepare New TLS Certificates

Certificates must include all hostnames in the SAN field:

Terminal window
# Example: Check SAN entries in certificate
openssl x509 -in public.crt -text -noout | grep -A1 "Subject Alternative Name"

Required SANs:

  • All node FQDNs (minio1.new-domain.com, minio2.new-domain.com, etc.)
  • Load balancer FQDN if applicable
  • Any IP addresses used for direct access
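After extracting the SAN list with the `openssl` command above, you can sanity-check coverage before the maintenance window. A sketch of the matching logic, assuming the common TLS rule that a wildcard matches exactly one DNS label (the SAN values below are illustrative):

```python
def san_covers(san_entries: list[str], hostname: str) -> bool:
    """Check whether a certificate SAN list covers a hostname.

    Wildcards match exactly one label: *.new-domain.com covers
    minio1.new-domain.com but not a.b.new-domain.com.
    """
    for san in san_entries:
        if san.startswith("*."):
            suffix = san[2:]
            head, sep, tail = hostname.partition(".")
            if sep and head and tail == suffix:  # exactly one extra label
                return True
        elif san == hostname:
            return True
    return False

# Illustrative SAN list and the hostnames the cluster will need
sans = ["s3.new-domain.com", "*.new-domain.com"]
required = [f"minio{i}.new-domain.com" for i in range(1, 5)]
missing = [h for h in required if not san_covers(sans, h)]
print("missing:", missing)
```

An empty `missing` list here does not replace testing the live handshake with `openssl s_client` after deployment, but it catches SAN omissions while certificates are still cheap to reissue.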

Phase 2: Configuration Update

2.1 Update Configuration on ALL Nodes

Update the environment file (typically /etc/default/minio or systemd override):

Terminal window
# On EACH node, update MINIO_VOLUMES
MINIO_VOLUMES="https://minio{1...4}.new-domain.com:9000/mnt/disk{1...4}"
# Update public endpoint if changed
MINIO_SERVER_URL="https://s3.new-domain.com:9000"
# Update domain for virtual-host style requests if used
MINIO_DOMAIN="new-domain.com"
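Hand-editing the env file on every node invites drift. A sketch of scripting the substitution instead (run against each node's file via ssh/scp; the demo writes to a scratch copy, never the live `/etc/default/minio`):

```python
import tempfile
from pathlib import Path

def update_env_file(path: str, old_domain: str, new_domain: str) -> int:
    """Replace old_domain with new_domain in an env file; return lines changed."""
    p = Path(path)
    out, changed = [], 0
    for line in p.read_text().splitlines(keepends=True):
        new_line = line.replace(old_domain, new_domain)
        changed += new_line != line
        out.append(new_line)
    p.write_text("".join(out))
    return changed

# Demo against a scratch copy with the example values from above
demo = Path(tempfile.mkdtemp()) / "minio.env"
demo.write_text(
    'MINIO_VOLUMES="https://minio{1...4}.old-domain.com:9000/mnt/disk{1...4}"\n'
    'MINIO_SERVER_URL="https://s3.old-domain.com:9000"\n'
)
changed = update_env_file(str(demo), "old-domain.com", "new-domain.com")
print(f"{changed} line(s) updated")
```

The returned count gives a quick per-node sanity check: every node should report the same number of changed lines.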

2.2 Deploy New TLS Certificates

Terminal window
# Copy new certificates to each node
# Default location: ~/.minio/certs/ or /etc/minio/certs/
cp public.crt private.key /etc/minio/certs/

2.3 Verify Configuration Consistency

Terminal window
# Generate checksums of config files on each node
for node in minio{1..4}; do
  ssh $node "shasum /etc/default/minio"
done
# All checksums should match
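Eyeballing four checksums is error-prone; the comparison can be made fail-fast. A sketch using `hashlib` on locally collected copies of each node's config (file names here are stand-ins):

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def configs_consistent(paths: list[Path]) -> bool:
    """True when every config file hashes identically."""
    digests = {sha256_of(p) for p in paths}
    return len(digests) == 1

# Demo: three identical configs and one divergent one
tmp = Path(tempfile.mkdtemp())
good = 'MINIO_VOLUMES="https://minio{1...4}.new-domain.com:9000/mnt/disk{1...4}"\n'
files = []
for i in range(1, 5):
    f = tmp / f"minio{i}.env"
    f.write_text(good if i < 4 else good.replace("new", "old"))
    files.append(f)

print(configs_consistent(files[:3]))  # nodes 1-3 agree
print(configs_consistent(files))      # node 4 diverges
```

Wire this into your maintenance runbook so the restart is blocked automatically whenever the set of digests has more than one member.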

Phase 3: Coordinated Restart

Important: For FQDN/IP changes, you cannot use rolling restarts or per-node cordon/uncordon because all nodes must have matching endpoint configurations. A full cluster restart is required.

3.1 Graceful Cluster Shutdown

Terminal window
# Option A: Verify peers first, then restart with graceful shutdown period
mc admin service restart mycluster --dry-run
# If all peers healthy, proceed with graceful restart
# Option B: Using systemctl on each node (parallel stop)
for node in minio{1..4}.old-domain.com; do
  ssh $node "sudo systemctl stop minio" &
done
wait

Note: mc admin service restart with --rolling flag is ideal for normal maintenance but cannot be used here because the new configuration (FQDN changes) must be applied to all nodes before any can start.

3.2 Verify All Nodes Stopped

Terminal window
for node in minio{1..4}.old-domain.com; do
  ssh $node "systemctl is-active minio"
done
# All should report "inactive"

3.3 Start All Nodes Simultaneously

Terminal window
# Start all nodes together (they must start with matching configs)
for node in minio{1..4}.new-domain.com; do
  ssh $node "sudo systemctl start minio" &
done
wait

3.4 Verify Cluster Formation

Terminal window
# Update mc alias with new endpoint
mc alias set mycluster https://minio1.new-domain.com:9000 ACCESS_KEY SECRET_KEY
# Verify cluster health
mc admin info mycluster
# Check all nodes are online
mc admin info mycluster --json | jq '.info.servers[].state'
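The `.info.servers[].state` check lends itself to a scripted wait loop. A sketch that parses the same JSON shape with the standard library (the sample payload is fabricated to match the jq path used above):

```python
import json

def all_online(mc_info_json: str) -> bool:
    """True when every server in `mc admin info --json` output reports online."""
    info = json.loads(mc_info_json)
    states = [s.get("state") for s in info["info"]["servers"]]
    return bool(states) and all(st == "online" for st in states)

# Fabricated sample matching the .info.servers[].state shape queried above
sample = json.dumps({
    "info": {
        "servers": [
            {"endpoint": "minio1.new-domain.com:9000", "state": "online"},
            {"endpoint": "minio2.new-domain.com:9000", "state": "online"},
            {"endpoint": "minio3.new-domain.com:9000", "state": "offline"},
        ]
    }
})
print(all_online(sample))  # False until minio3 comes up
```

In a real runbook you would feed this function the output of `mc admin info mycluster --json` in a retry loop, and escalate to rollback if it never returns true within your maintenance budget.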

Phase 4: Validation

4.1 Verify Cluster Operations

Terminal window
# Test read operation
mc ls mycluster/test-bucket
# Test write operation
echo "test" | mc pipe mycluster/test-bucket/test-file.txt
# Test delete operation
mc rm mycluster/test-bucket/test-file.txt

4.2 Verify Grid Communication

Terminal window
# Check peer connections in logs
journalctl -u minio | grep -i "grid" | tail -20

4.3 Update Client Configurations

Update all applications using the cluster:

  • S3 client endpoint URLs
  • Backup systems
  • Monitoring/alerting endpoints
  • Load balancer health checks

Risks and Failure Scenarios

Understanding what can go wrong helps you prepare for quick recovery:

| Risk | Symptom | Cause | Prevention |
|---|---|---|---|
| DNS not propagated | Nodes can't connect to peers | DNS TTL not expired, caching | Verify DNS from each node before restart |
| Certificate mismatch | TLS handshake failures in logs | New hostnames not in certificate SANs | Test certificates with openssl s_client |
| Config inconsistency | Bootstrap verification fails | Nodes have different MINIO_VOLUMES | Checksum configs across all nodes |
| Partial update | Cluster won't form | Some nodes updated, others not | Update ALL nodes before starting ANY |
| Grid connection failure | "peer offline" errors | Firewall, DNS, or port issues | Test connectivity between all node pairs |

Critical Window: The cluster is unavailable from when you stop all nodes until all nodes successfully restart with new configuration. Plan for 5-15 minutes depending on cluster size.
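The grid-connection risk above ("test connectivity between all node pairs") reduces to a plain TCP reachability check. A sketch using only the standard library, demonstrated against a local listener since real node FQDNs aren't reachable here:

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """TCP connect test: the same signal a grid peer needs to establish a link."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Demo against a throwaway local listener; in practice, run this from
# every node toward every other node's FQDN on port 9000
listener = socket.socket()
listener.bind(("127.0.0.1", 0))       # OS-assigned free port
listener.listen(1)
port = listener.getsockname()[1]

up = port_reachable("127.0.0.1", port)
listener.close()
down = port_reachable("127.0.0.1", port)
print(up, down)   # reachable while listening, refused after close
```

Running the full N×N matrix before stopping the cluster turns a mid-window firewall surprise into a pre-window finding.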


Rollback Procedure

Prepare rollback BEFORE making changes:

Terminal window
# 1. Backup current config on ALL nodes (run before any changes)
for node in minio{1..4}.example.com; do
  ssh $node "sudo cp /etc/default/minio /etc/default/minio.backup"
  ssh $node "sudo cp -r /etc/minio/certs /etc/minio/certs.backup"
done
# 2. Keep old DNS records active (do NOT delete until verified)
# Both old and new DNS should resolve during transition
# 3. Prepare rollback script (have this ready before starting)
cat > rollback.sh << 'EOF'
#!/bin/bash
NODES="minio1 minio2 minio3 minio4"
OLD_DOMAIN="old-domain.com"
for node in $NODES; do
  echo "Rolling back ${node}.${OLD_DOMAIN}..."
  ssh ${node}.${OLD_DOMAIN} "sudo systemctl stop minio" &
done
wait
for node in $NODES; do
  ssh ${node}.${OLD_DOMAIN} "sudo cp /etc/default/minio.backup /etc/default/minio"
  ssh ${node}.${OLD_DOMAIN} "sudo cp -r /etc/minio/certs.backup/* /etc/minio/certs/"
done
for node in $NODES; do
  ssh ${node}.${OLD_DOMAIN} "sudo systemctl start minio" &
done
wait
echo "Rollback complete. Verify cluster health:"
echo "mc admin info mycluster"
EOF
chmod +x rollback.sh

When to trigger rollback:

  1. Cluster won’t form after 5 minutes - Nodes not finding each other
  2. TLS errors in logs - Certificate problems
  3. Bootstrap verification failures - Config mismatch
  4. Partial node startup - Some nodes online, others failing

Rollback execution:

Terminal window
# If anything goes wrong, execute immediately:
./rollback.sh
# Verify cluster recovered
mc admin info mycluster

Important: Keep old DNS records active for at least 24-48 hours after successful migration. This allows rollback even after the maintenance window.


Special Considerations

Site Replication Deployments

If using site replication across multiple clusters:

  1. Update one site at a time
  2. Verify replication status after each site change
  3. Update site replication peer endpoints:
Terminal window
mc admin replicate update mycluster --endpoint https://new-peer-endpoint:9000

Kubernetes Deployments (AIStor Operator)

Kubernetes deployments are significantly different - the operator handles most complexity automatically.

Internal DNS (Automatic - No Manual Changes Needed):

Pod FQDNs are managed by Kubernetes StatefulSet and follow this pattern:

minio-ss-0.minio-hl.default.svc.cluster.local
{statefulset-name}-{ordinal}.{headless-service}.{namespace}.svc.{cluster-domain}

These are automatically managed by Kubernetes DNS - no manual configuration needed.
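The pattern above can be made concrete with a tiny helper that expands the per-pod FQDNs for a given deployment (the names below follow the example, not your cluster):

```python
def pod_fqdns(statefulset: str, replicas: int, headless_svc: str,
              namespace: str, cluster_domain: str = "cluster.local") -> list[str]:
    """Expand {statefulset}-{ordinal}.{headless-service}.{namespace}.svc.{domain}."""
    return [
        f"{statefulset}-{i}.{headless_svc}.{namespace}.svc.{cluster_domain}"
        for i in range(replicas)
    ]

fqdns = pod_fqdns("minio-ss", 4, "minio-hl", "default")
print(fqdns[0])  # minio-ss-0.minio-hl.default.svc.cluster.local
```

Because these names are derived from the StatefulSet and headless Service rather than configured by hand, they survive pod rescheduling and node IP changes without any of the bare-metal procedure above.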

Adding Custom External Domains:

To add external FQDNs (for Ingress or external access), update the ObjectStore CRD:

apiVersion: aistor.min.io/v1
kind: ObjectStore
metadata:
  name: minio
spec:
  certificates:
    config:
      dnsNames:
        - minio.example.com       # Custom external domain
        - "*.minio.example.com"   # Wildcard for virtual-host style
        - "192.168.1.100"         # External IP if needed

The operator automatically:

  1. Regenerates certificates with new SANs
  2. Triggers a rolling restart of MinIO pods
  3. Waits for quorum before proceeding to next pod

Changing Service Type:

spec:
  services:
    minio:
      serviceType: LoadBalancer    # ClusterIP, NodePort, or LoadBalancer
      name: custom-service-name    # Optional custom service name

Changing Cluster Domain:

If your Kubernetes cluster domain changes (rare):

Terminal window
# Update operator environment variable
kubectl set env deployment/aistor-operator CLUSTER_DOMAIN=custom.domain

What’s Immutable (By Design):

  • Pool topology (servers, volumes per server) - would break erasure coding
  • ObjectStore name - would break StatefulSet binding
  • PVC bindings - data protection

Rolling Update Timeout:

spec:
  rollingTimeoutSeconds: 600 # Default: 10 minutes per pod

Load Balancer Considerations

If using a load balancer in front of MinIO:

  • MINIO_SERVER_URL changes may only require updating this variable
  • Backend node changes still require full cluster restart
  • Health check endpoints remain the same (/minio/health/live)

Why Not Use Cordon/Uncordon for FQDN Changes?

For regular maintenance (OS updates, hardware replacement), MinIO provides graceful per-node maintenance:

Terminal window
# Regular maintenance workflow (NOT for FQDN changes)
mc admin cordon mycluster node1.example.com:9000 # Drains then cordons
# ... perform maintenance, restart node ...
mc admin uncordon mycluster node1.example.com:9000 # Restores to service

However, FQDN/IP changes cannot use this workflow because:

  1. Bootstrap verification - All nodes verify they have matching endpoint configurations at startup
  2. Grid connections - Peer-to-peer connections are established using configured hostnames
  3. Partial updates fail - A node with new FQDNs cannot join a cluster still using old FQDNs

This is why FQDN changes require a full coordinated restart with all nodes updated simultaneously.


Post-Change Checklist

After completing the change, verify:

  • All nodes show as online in mc admin info
  • No drives reported as offline or healing
  • Read/write operations succeed
  • TLS certificates validate without errors
  • Monitoring systems receive metrics from new endpoints
  • Alerting endpoints updated and tested
  • Client applications successfully connecting
  • Replication status healthy (if applicable)
  • Old DNS records documented for removal (after validation period)

Summary

| Step | Action | Verification |
|---|---|---|
| 1. Prepare | DNS records, certificates, config files | DNS resolves, certs valid |
| 2. Document | Backup current config, plan rollback | Rollback procedure tested |
| 3. Stop | Gracefully stop all nodes | All nodes inactive |
| 4. Update | Deploy new config and certs to all nodes | Config checksums match |
| 5. Start | Start all nodes simultaneously | Cluster forms successfully |
| 6. Validate | Test operations, update clients | Full functionality confirmed |

Key Points:

  • This is a disruptive operation requiring planned downtime
  • All nodes must be updated together - partial updates will fail
  • DNS must propagate before starting nodes with new hostnames
  • Certificates must include new hostnames in SAN fields
  • No data migration occurs - only network identity changes

For complex network migrations or multi-site deployments, contact MinIO support through SUBNET for architecture review assistance.


Source Code References
  1. cmd/common-main.go:373-377 - DNS cache TTL defaults: if orchestrated { dnsTTL = 30 * time.Second } else { dnsTTL = 10 * time.Minute }
  2. cmd/server-main.go:150 - EnvVar: "MINIO_DNS_CACHE_TTL" (DNS cache TTL environment variable)