How to Integrate MinIO with Geyser Data for S3-Compatible Cold & Transitory Storage
- Aug 15
Updated: Oct 13
Overview of Integration
When your MinIO cluster fills up with cold or infrequently accessed data, it’s time to think about tiering. By integrating MinIO with Geyser Data’s Tape-as-a-Service, you can move that data to a low-cost, S3-compatible archive tier. The best part? You don’t have to change how you work. Geyser can serve as a permanent cold archive or as a transitory tier for overflow and lifecycle management.
This guide walks you through the integration process step-by-step. Let’s get started on optimizing your storage immediately.
Architecture of the System
```
MinIO Cluster (Primary Hot Tier)
        |
        |  Lifecycle Rule / ILM Tiering
        v
Geyser Data (Cold/Transitory Tier - S3-compatible)
```
MinIO continues to serve hot data with high performance. Cold objects are migrated to Geyser Data using:
- S3-compatible bucket replication
- Lifecycle rules (automatic tiering based on age or prefix)
- External scripting via MinIO SDKs (Go, Python, etc.)
Prerequisites for Integration
Before diving into the setup, ensure you have the following:
- A running MinIO deployment (standalone or distributed mode)
- Access to a Geyser Data account and credentials
- mc (MinIO Client) installed on your admin workstation
- Python or Go SDK (optional, for advanced control)
- IAM user credentials from Geyser with access to a bucket (Access Key ID / Secret Access Key)
Step-by-Step Guide to Integration
Step 1: Set Up Geyser Data Bucket
Log in to the Geyser Console (or use the API if you're automating):
- Request credentials if not already provisioned.
- Note the endpoint URL (e.g., https://la1.geyserdata.com).

Create a bucket for cold storage (e.g., cold-archive-bucket), and ensure versioning is enabled (optional, but recommended for replication).
Step 2: Add Geyser Data Endpoint to MinIO Client (mc)
```shell
mc alias set geyser https://la1.geyserdata.com <ACCESS_KEY_ID> <SECRET_ACCESS_KEY>
```
Example:
```shell
mc alias set geyser https://la1.geyserdata.com geyseruser123 geysersecret456
```
Validate:
```shell
mc ls geyser
```
Step 3: Configure Lifecycle Rules on MinIO
Use MinIO’s lifecycle configuration feature to automatically transition data based on object age or prefix.
Example: Archive all objects older than 30 days to Geyser.
Create a lifecycle JSON config file (lifecycle.json):
```json
{
  "Rules": [
    {
      "ID": "TransitionToGeyser",
      "Status": "Enabled",
      "Prefix": "",
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "GLACIER"
        }
      ]
    }
  ]
}
```
Note: "GLACIER" here is only a storage-class label; Geyser Data is a generic S3-compatible endpoint, not AWS Glacier. In this setup, MinIO lifecycle tiering therefore has to be coupled with replication or scripting to land data on Geyser. (Depending on your MinIO version, you may also be able to register a remote S3 tier directly with `mc ilm tier add`; otherwise, simulate tiering via replication or scripting as described below.)
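If you take the scripted route, the rule from lifecycle.json can be built and applied from Python as well. A minimal sketch, assuming the placeholder MinIO endpoint and credentials used elsewhere in this guide:

```python
def lifecycle_rules(days=30, storage_class="GLACIER"):
    """Build the same transition rule as lifecycle.json, as a dict."""
    return {
        "Rules": [
            {
                "ID": "TransitionToGeyser",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "Transitions": [
                    {"Days": days, "StorageClass": storage_class}
                ],
            }
        ]
    }

if __name__ == "__main__":
    import boto3

    # Placeholder local MinIO endpoint and credentials.
    s3 = boto3.client(
        "s3",
        endpoint_url="http://minio.local:9000",
        aws_access_key_id="minioadmin",
        aws_secret_access_key="minioadmin",
        region_name="us-east-1",
    )
    s3.put_bucket_lifecycle_configuration(
        Bucket="my-hot-bucket",
        LifecycleConfiguration=lifecycle_rules(),
    )
```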
Step 4: Set Up Bucket Replication (Simulating Archive Tiering)
Create the destination bucket in Geyser (Step 1).
Enable versioning on the source MinIO bucket:
```shell
mc version enable local/my-hot-bucket
```
Create a replication configuration JSON (replication.json):
```json
{
  "Role": "arn:minio:replication::myminio:replication-role",
  "Rules": [
    {
      "ID": "replicate-to-geyser",
      "Status": "Enabled",
      "Priority": 1,
      "DeleteMarkerReplication": { "Status": "Disabled" },
      "DeleteReplication": { "Status": "Disabled" },
      "Filter": { "Prefix": "" },
      "Destination": {
        "Bucket": "arn:aws:s3:::cold-archive-bucket",
        "Endpoint": "https://la1.geyserdata.com"
      },
      "SourceSelectionCriteria": {}
    }
  ]
}
```
Apply the replication configuration using mc:
```shell
mc replicate import local/my-hot-bucket < replication.json
```
Step 5: Scripted Archive Using MinIO SDK (Optional)
For tighter control, use MinIO’s SDKs (Go or Python) to script archival logic.
Example (Python with boto3):
```python
import boto3
from datetime import datetime, timedelta, timezone

# MinIO client (source)
minio_s3 = boto3.client(
    's3',
    endpoint_url='http://minio.local:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin'
)

# Geyser client (destination)
geyser_s3 = boto3.client(
    's3',
    endpoint_url='https://la1.geyserdata.com',
    aws_access_key_id='geyseruser123',
    aws_secret_access_key='geysersecret456'
)

# Move objects older than 30 days. LastModified is timezone-aware,
# so the cutoff must be timezone-aware too.
cutoff = datetime.now(timezone.utc) - timedelta(days=30)
bucket_name = 'my-hot-bucket'

# Paginate in case the bucket holds more than 1000 objects.
paginator = minio_s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket_name):
    for obj in page.get('Contents', []):
        if obj['LastModified'] < cutoff:
            key = obj['Key']
            # Server-side copy cannot span two different endpoints,
            # so stream the object through this host instead.
            body = minio_s3.get_object(Bucket=bucket_name, Key=key)['Body']
            geyser_s3.put_object(
                Bucket='cold-archive-bucket',
                Key=key,
                Body=body.read()
            )
            # Delete from MinIO once archived
            minio_s3.delete_object(Bucket=bucket_name, Key=key)
```
Step 6: Validate and Monitor
- Use `mc ls geyser/cold-archive-bucket` to confirm successful archival.
- Monitor logs and object versions.
- Optionally implement event notifications (MinIO supports webhook and AMQP triggers).
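Beyond spot-checking with mc, the two tiers can be cross-checked programmatically. A minimal sketch (endpoints and credentials are the placeholders used earlier) that reports hot-tier keys not yet present in the archive:

```python
def missing_from_archive(hot_keys, archived_keys):
    """Return keys present in the hot tier but absent from the archive."""
    return sorted(set(hot_keys) - set(archived_keys))

def bucket_keys(s3, bucket):
    """Collect every key in a bucket, following pagination."""
    keys = []
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys

if __name__ == "__main__":
    import boto3

    minio_s3 = boto3.client(
        "s3", endpoint_url="http://minio.local:9000",
        aws_access_key_id="minioadmin", aws_secret_access_key="minioadmin",
        region_name="us-east-1",
    )
    geyser_s3 = boto3.client(
        "s3", endpoint_url="https://la1.geyserdata.com",
        aws_access_key_id="geyseruser123", aws_secret_access_key="geysersecret456",
        region_name="us-east-1",
    )
    gaps = missing_from_archive(
        bucket_keys(minio_s3, "my-hot-bucket"),
        bucket_keys(geyser_s3, "cold-archive-bucket"),
    )
    print(f"{len(gaps)} objects not yet archived")
```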
Advanced: Using Geyser as a Transitory Tier
For workflows where Geyser is used as a staging/overflow zone:
- Write directly to Geyser via S3 client SDKs or backup tools.
- Use metadata tagging (x-amz-meta-archive-reason) for traceability.
- Retrieve objects back into MinIO when needed with parallel `mc mirror` or boto3.
Example:
```shell
mc mirror --overwrite geyser/cold-archive-bucket local/my-hot-bucket
```
Or via Python:
```python
geyser_s3.download_file('cold-archive-bucket', 'object-key', 'downloads/object-key')
minio_s3.upload_file('downloads/object-key', 'my-hot-bucket', 'object-key')
```
Security & Access Control
- Use IAM policies (on the Geyser side) to restrict access by prefix, IP, or time.
- Use TLS/SSL for all S3 traffic.
- Enable object lock/versioning if using Geyser for compliance storage.
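Assuming Geyser honors standard S3 bucket policies, a prefix- and IP-scoped policy might look like the sketch below. The user ARN, prefix, and CIDR range are placeholders:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam:::user/archive-writer" },
      "Action": [ "s3:GetObject", "s3:PutObject" ],
      "Resource": "arn:aws:s3:::cold-archive-bucket/archive/*",
      "Condition": { "IpAddress": { "aws:SourceIp": "203.0.113.0/24" } }
    }
  ]
}
```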
Performance Considerations
- Geyser Data delivers a faster time to first byte than traditional archive tiers such as Glacier or Deep Archive.
- Use multipart uploads for large objects.
- Geyser charges no egress or retrieval fees, so aggressive lifecycle tiering carries no cost penalty.
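For the multipart point, boto3's transfer layer handles part splitting for you. A sketch using the placeholder credentials from earlier; the 64 MiB part size and concurrency level are tuning assumptions, not requirements:

```python
from math import ceil

PART_SIZE = 64 * 1024 * 1024  # 64 MiB; tune to your network and object sizes

def part_count(object_size, part_size=PART_SIZE):
    """How many parts a multipart upload of object_size will use."""
    return max(1, ceil(object_size / part_size))

if __name__ == "__main__":
    import boto3
    from boto3.s3.transfer import TransferConfig

    geyser_s3 = boto3.client(
        "s3", endpoint_url="https://la1.geyserdata.com",
        aws_access_key_id="geyseruser123", aws_secret_access_key="geysersecret456",
        region_name="us-east-1",
    )
    # Objects above the threshold are uploaded in parallel parts.
    config = TransferConfig(
        multipart_threshold=PART_SIZE,
        multipart_chunksize=PART_SIZE,
        max_concurrency=8,
    )
    geyser_s3.upload_file("bigfile.bin", "cold-archive-bucket",
                          "bigfile.bin", Config=config)
```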
Conclusion
This integration empowers MinIO users with cloud-like tiering to a cost-effective cold storage backend. You can achieve this without vendor lock-in or expensive retrieval penalties. Whether you're archiving old data or managing overflow in a transitory model, Geyser Data offers a highly compatible and economical solution.
Appendix: Useful Commands
```shell
# Sync MinIO bucket to Geyser
mc mirror local/my-hot-bucket geyser/cold-archive-bucket

# Restore archive to MinIO
mc mirror geyser/cold-archive-bucket local/my-hot-bucket

# Set object retention (if using object lock)
mc retention set --default GOVERNANCE 365d geyser/cold-archive-bucket
```