MinIO: Integrating MinIO with Geyser Data for Cold Data Archive & Transitory Tier Storage
- Geyser Data

Overview
When your MinIO cluster starts filling up with cold or infrequently accessed data, it’s time to think tiering. By integrating MinIO with Geyser Data’s Tape-as-a-Service, you can move that data to a low-cost, S3-compatible archive tier, without changing how you work. Geyser can serve as a permanent cold archive or as a transitory tier for overflow and lifecycle management.
This guide walks through the integration process step-by-step so you can start optimizing storage immediately.
Architecture
MinIO Cluster (Primary Hot Tier)
|
| Lifecycle Rule / ILM Tiering
v
Geyser Data (Cold/Transitory Tier - S3-compatible)
MinIO continues to serve hot data with high performance, while cold objects are migrated to Geyser Data using:
S3-compatible bucket replication
Lifecycle rules (automatic tiering based on age or prefix)
External scripting via MinIO SDKs (Go, Python, etc.)
Prerequisites
A running MinIO deployment (standalone or distributed mode)
Access to a Geyser Data account and credentials
mc (MinIO Client) installed on your admin workstation
Python or Go SDK (optional, for advanced control)
IAM user credentials from Geyser with access to a bucket (Access Key ID / Secret Access Key)
Step-by-Step Guide
Step 1: Set Up Geyser Data Bucket
Log in to the Geyser Console (or use the API if your provisioning is automated):
Request credentials if not already provisioned
Note the endpoint URL (e.g., https://la1.geyserdata.com)
Create a bucket for cold storage:
cold-archive-bucket
Ensure versioning is enabled (optional but recommended for replication)
Step 2: Add Geyser Data Endpoint to MinIO Client (mc)
Shell
mc alias set geyser https://la1.geyserdata.com <ACCESS_KEY_ID> <SECRET_ACCESS_KEY>
Example:
Shell
mc alias set geyser https://la1.geyserdata.com geyseruser123 geysersecret456
Validate:
Shell
mc ls geyser
Step 3: Configure Lifecycle Rules on MinIO
Use MinIO’s lifecycle configuration feature to automatically transition data based on object age or prefix.
Example: Archive all objects older than 30 days to Geyser
Create a lifecycle JSON config file (lifecycle.json):
JSON
{
  "Rules": [
    {
      "ID": "TransitionToGeyser",
      "Status": "Enabled",
      "Prefix": "",
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "GLACIER"
        }
      ]
    }
  ]
}
Note: Since Geyser Data is not natively “GLACIER” but S3-compatible, MinIO lifecycle tiering must be coupled with scripting or replication. MinIO does not yet support true tiering to custom S3 endpoints — but you can simulate it via replication or scripting (next section).
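When falling back to scripting, the lifecycle rule above reduces to an age-and-prefix filter over listing results. A minimal stdlib sketch; the object dicts mimic the shape an S3 `ListObjectsV2` response returns:

```python
from datetime import datetime, timedelta, timezone

def eligible_for_transition(objects, days=30, prefix=''):
    """Return keys of objects older than `days` that match `prefix`,
    mirroring the rule's Transitions/Prefix semantics."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=days)
    return [
        o['Key'] for o in objects
        if o['Key'].startswith(prefix) and o['LastModified'] < cutoff
    ]

# Example listing entries in the shape ListObjectsV2 returns
objs = [
    {'Key': 'logs/old.log',
     'LastModified': datetime.now(timezone.utc) - timedelta(days=90)},
    {'Key': 'logs/new.log',
     'LastModified': datetime.now(timezone.utc)},
]
print(eligible_for_transition(objs))  # ['logs/old.log']
```

The same predicate drives the Step 5 script, so keeping it in one function makes the cutoff easy to test in isolation.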
Step 4: Set Up Bucket Replication (Simulating Archive Tiering)
Create the destination bucket in Geyser (Step 1).
Enable versioning on source MinIO bucket:
Shell
mc version enable local/my-hot-bucket
Create a replication configuration JSON (replication.json):
JSON
{
  "Role": "arn:minio:replication::myminio:replication-role",
  "Rules": [
    {
      "ID": "replicate-to-geyser",
      "Status": "Enabled",
      "Priority": 1,
      "DeleteMarkerReplication": {
        "Status": "Disabled"
      },
      "DeleteReplication": {
        "Status": "Disabled"
      },
      "Destination": {
        "Bucket": "arn:aws:s3:::cold-archive-bucket",
        "Endpoint": "https://la1.geyserdata.com"
      },
      "Filter": {
        "Prefix": ""
      },
      "SourceSelectionCriteria": {}
    }
  ]
}
Apply the replication configuration with mc, which imports the JSON from stdin:
Shell
mc replicate import local/my-hot-bucket < replication.json
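Because a malformed rule is easy to write by hand, a quick structural check before applying it can save a round trip. A stdlib sketch; the required-field set here is an assumption based on the example above, not MinIO's authoritative schema:

```python
import json

REQUIRED = {'ID', 'Status', 'Priority', 'Destination'}

def validate_rules(doc):
    """Return (rule-id, missing-fields) pairs for incomplete rules."""
    problems = []
    for rule in doc.get('Rules', []):
        missing = REQUIRED - rule.keys()
        if missing:
            problems.append((rule.get('ID', '?'), sorted(missing)))
    return problems

# Parse the config exactly as mc would receive it
doc = json.loads('''{
  "Rules": [{"ID": "replicate-to-geyser", "Status": "Enabled",
             "Priority": 1,
             "Destination": {"Bucket": "arn:aws:s3:::cold-archive-bucket"}}]
}''')
print(validate_rules(doc))  # [] when every rule has the required fields
```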
Step 5: Scripted Archive Using MinIO SDK (Optional)
For tighter control, use MinIO’s SDKs (Go or Python) to script archival logic.
Example (Python with boto3):
Python
import boto3
from datetime import datetime, timedelta, timezone

# MinIO client (source)
minio_s3 = boto3.client(
    's3',
    endpoint_url='http://minio.local:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin'
)

# Geyser client (destination)
geyser_s3 = boto3.client(
    's3',
    endpoint_url='https://la1.geyserdata.com',
    aws_access_key_id='geyseruser123',
    aws_secret_access_key='geysersecret456'
)

# Move objects older than 30 days; LastModified is timezone-aware,
# so the cutoff must be timezone-aware too
bucket_name = 'my-hot-bucket'
cutoff = datetime.now(timezone.utc) - timedelta(days=30)

# Paginate so buckets with more than 1,000 objects are fully covered
paginator = minio_s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket_name):
    for obj in page.get('Contents', []):
        if obj['LastModified'] < cutoff:
            key = obj['Key']
            # copy_object cannot span endpoints, so stream the object
            # out of MinIO and into Geyser
            body = minio_s3.get_object(Bucket=bucket_name, Key=key)['Body']
            geyser_s3.put_object(
                Bucket='cold-archive-bucket',
                Key=key,
                Body=body.read()
            )
            # Delete from MinIO only after the upload succeeds
            minio_s3.delete_object(Bucket=bucket_name, Key=key)
Step 6: Validate and Monitor
Use mc ls geyser/cold-archive-bucket to confirm successful archival
Monitor logs and object versions
Optionally implement event notifications (MinIO supports webhook and AMQP triggers)
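For scripted validation, mc also supports a `--json` flag on `mc ls`, emitting one JSON object per line. A stdlib sketch of tallying that output; the sample lines are illustrative, not captured from a real run, and the field names should be checked against your mc version:

```python
import json

# Illustrative `mc ls --json` output: one JSON object per line
sample = '''{"status":"success","key":"logs/old.log","size":1048576}
{"status":"success","key":"logs/older.log","size":2097152}'''

archived = [json.loads(line) for line in sample.splitlines()]
keys = {o['key'] for o in archived}
total_bytes = sum(o['size'] for o in archived)
print(len(keys), total_bytes)  # 2 3145728
```

Comparing the key set against the source bucket's listing gives a cheap end-of-run reconciliation report.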
Advanced: Using Geyser as a Transitory Tier
For workflows where Geyser is used as a staging/overflow zone:
Write directly to Geyser via S3 client SDKs or backup tools
Use metadata tagging (x-amz-meta-archive-reason) for traceability
Retrieve when needed into MinIO using parallel mc mirror or boto3
Example:
Shell
mc mirror --overwrite geyser/cold-archive-bucket local/my-hot-bucket
Or via Python:
Python
import os
os.makedirs('downloads', exist_ok=True)  # staging directory must exist first
geyser_s3.download_file('cold-archive-bucket', 'object-key', 'downloads/object-key')
minio_s3.upload_file('downloads/object-key', 'my-hot-bucket', 'object-key')
Security & Access Control
Use IAM policies (on Geyser side) to restrict access by prefix, IP, or time
Use TLS/SSL for all S3 traffic
Enable object lock/versioning if using Geyser for compliance storage
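As an illustration of prefix-scoped restriction, here is a hypothetical AWS-style policy document built in Python. Whether Geyser accepts this exact policy dialect is an assumption to verify against their console; the bucket and prefix names are placeholders:

```python
import json

# Hypothetical policy: allow read/write only under the archive/ prefix
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": ["arn:aws:s3:::cold-archive-bucket/archive/*"]
        }
    ]
}
print(json.dumps(policy, indent=2))
```

Scoping credentials this way means a compromised archival script can touch only the archive prefix, not the whole bucket.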
Performance Considerations
Geyser Data provides faster Time to First Byte than traditional archive solutions like Glacier or Deep Archive
Use multipart uploads for large objects
Geyser has no egress or retrieval fees, enabling aggressive lifecycle tiering without cost penalties
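Part sizing matters for multipart uploads: S3-style APIs cap an upload at 10,000 parts with a 5 MiB minimum part size, so very large objects need proportionally larger parts. A small stdlib helper to pick a safe part size (the 64 MiB default is an arbitrary starting point, not a Geyser recommendation):

```python
import math

MIN_PART = 5 * 1024 * 1024   # S3 minimum part size (5 MiB)
MAX_PARTS = 10_000           # S3 limit on parts per upload

def part_size(object_bytes, preferred=64 * 1024 * 1024):
    """Smallest part size >= preferred that keeps the upload under MAX_PARTS."""
    needed = math.ceil(object_bytes / MAX_PARTS)
    return max(MIN_PART, preferred, needed)

# A 5 TiB object cannot use 64 MiB parts (it would need > 10,000 of them)
print(part_size(5 * 1024**4) // (1024 * 1024), "MiB")  # 524 MiB
```

The returned value can be fed to whatever multipart mechanism your client uses, e.g. boto3's transfer configuration.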
Conclusion
This integration empowers MinIO users with cloud-like tiering to a cost-effective cold storage backend without vendor lock-in or expensive retrieval penalties. Whether you're archiving old data or managing overflow in a transitory model, Geyser Data offers a highly compatible and economical solution.
Appendix: Useful Commands
Shell
# Sync MinIO bucket to Geyser
mc mirror local/my-hot-bucket geyser/cold-archive-bucket
# Restore archive to MinIO
mc mirror geyser/cold-archive-bucket local/my-hot-bucket
# Set object retention (if using object lock)
mc retention set --default GOVERNANCE 365d geyser/cold-archive-bucket