
MinIO: Integrating MinIO with Geyser Data for Cold Data Archive & Transitory Tier Storage

  • Writer: Geyser Data
How to Integrate MinIO with Geyser Data for S3-Compatible Cold & Transitory Storage

Overview

When your MinIO cluster starts filling up with cold or infrequently accessed data, it’s time to think tiering. By integrating MinIO with Geyser Data’s Tape-as-a-Service, you can move that data to a low-cost, S3-compatible archive tier, without changing how you work. Geyser can serve as a permanent cold archive or as a transitory tier for overflow and lifecycle management.


This guide walks through the integration process step-by-step so you can start optimizing storage immediately.


Architecture

MinIO Cluster (Primary Hot Tier)
         |
         | Lifecycle Rule / ILM Tiering
         v
Geyser Data (Cold/Transitory Tier - S3-compatible)


MinIO continues to serve hot data with high performance, while cold objects are migrated to Geyser Data using:

  • S3-compatible bucket replication

  • Lifecycle rules (automatic tiering based on age or prefix)

  • External scripting via MinIO SDKs (Go, Python, etc.)


Prerequisites

  • A running MinIO deployment (standalone or distributed mode)

  • Access to a Geyser Data account and credentials

  • mc (MinIO Client) installed on your admin workstation

  • Python or Go SDK (optional, for advanced control)

  • IAM user credentials from Geyser with access to a bucket (Access Key ID / Secret Access Key)


Step-by-Step Guide


Step 1: Set Up Geyser Data Bucket

  1. Log in to the Geyser Console (or use the API if you're automating):

    • Request credentials if not already provisioned

    • Note the endpoint URL (e.g., https://la1.geyserdata.com)

    • Create a bucket for cold storage:

      • cold-archive-bucket

    • Ensure versioning is enabled (optional but recommended for replication)


Step 2: Add Geyser Data Endpoint to MinIO Client (mc)

Shell

mc alias set geyser https://la1.geyserdata.com <ACCESS_KEY_ID> <SECRET_ACCESS_KEY>


Example:

Shell

mc alias set geyser https://la1.geyserdata.com geyseruser123 geysersecret456


Validate:

Shell

mc ls geyser


Step 3: Configure Lifecycle Rules on MinIO

Use MinIO’s lifecycle configuration feature to automatically transition data based on object age or prefix.


Example: Archive all objects older than 30 days to Geyser

  1. Create a lifecycle JSON config file (lifecycle.json):

JSON

{
  "Rules": [
    {
      "ID": "TransitionToGeyser",
      "Status": "Enabled",
      "Prefix": "",
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "GLACIER"
        }
      ]
    }
  ]
}


Note: Geyser Data exposes a standard S3-compatible endpoint rather than a native "GLACIER" storage class, so the StorageClass value above is nominal. MinIO does not yet support true lifecycle tiering to custom S3 endpoints in this configuration, so you simulate it via replication or scripting (next sections).
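The guide doesn't show applying `lifecycle.json`; because MinIO speaks the standard S3 API, one way is boto3's `put_bucket_lifecycle_configuration`. A sketch under the assumption that `minio_s3` is a boto3 client for your MinIO endpoint — the rule builder mirrors the transition portion of the JSON above:

```python
def transition_rule(days=30, storage_class="GLACIER", prefix=""):
    """Lifecycle rule transitioning objects older than `days`.
    The storage class is nominal: actual movement to Geyser is handled
    by replication or scripting, as noted above."""
    return {
        "ID": "TransitionToGeyser",
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        "Transitions": [{"Days": days, "StorageClass": storage_class}],
    }

def apply_lifecycle(minio_s3, bucket):
    """Apply the rule to a MinIO bucket over the S3 API."""
    minio_s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": [transition_rule()]},
    )
```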


Step 4: Set Up Bucket Replication (Simulating Archive Tiering)

  1. Create the destination bucket in Geyser (Step 1).

  2. Enable versioning on source MinIO bucket:


Shell

mc version enable local/my-hot-bucket


  3. Create a replication configuration JSON (replication.json):


JSON

{
  "Role": "arn:minio:replication::myminio:replication-role",
  "Rules": [
    {
      "ID": "replicate-to-geyser",
      "Status": "Enabled",
      "Priority": 1,
      "DeleteMarkerReplication": {
        "Status": "Disabled"
      },
      "Destination": {
        "Bucket": "arn:aws:s3:::cold-archive-bucket",
        "Endpoint": "https://la1.geyserdata.com"
      },
      "Filter": {
        "Prefix": ""
      },
      "DeleteReplication": "Disabled",
      "SourceSelectionCriteria": {}
    }
  ]
}


  4. Apply the replication rule using mc:


Shell

mc replicate import local/my-hot-bucket < replication.json
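To confirm what MinIO recorded, query the bucket's replication configuration back via the S3 GetBucketReplication call and inspect the destinations. A small helper (a sketch; the response shape follows the standard S3 API):

```python
def replication_destinations(response):
    """Return destination bucket ARNs for every enabled rule in a
    GetBucketReplication-style response."""
    rules = response.get("ReplicationConfiguration", {}).get("Rules", [])
    return [r["Destination"]["Bucket"] for r in rules if r.get("Status") == "Enabled"]

# Usage against a live MinIO endpoint (minio_s3 being a boto3 client):
# resp = minio_s3.get_bucket_replication(Bucket="my-hot-bucket")
# print(replication_destinations(resp))
```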


Step 5: Scripted Archive Using MinIO SDK (Optional)

For tighter control, use MinIO’s SDKs (Go or Python) to script archival logic.


Example (Python with boto3):


Python

import boto3
from datetime import datetime, timedelta, timezone

# MinIO client (source)
minio_s3 = boto3.client(
    's3',
    endpoint_url='http://minio.local:9000',
    aws_access_key_id='minioadmin',
    aws_secret_access_key='minioadmin'
)

# Geyser client (destination)
geyser_s3 = boto3.client(
    's3',
    endpoint_url='https://la1.geyserdata.com',
    aws_access_key_id='geyseruser123',
    aws_secret_access_key='geysersecret456'
)

# Move objects older than 30 days (aware datetime, to match LastModified)
cutoff = datetime.now(timezone.utc) - timedelta(days=30)
bucket_name = 'my-hot-bucket'

# Paginate so buckets with more than 1000 objects are fully covered
paginator = minio_s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket_name):
    for obj in page.get('Contents', []):
        if obj['LastModified'] < cutoff:
            key = obj['Key']
            # Server-side copy does not work across endpoints, so
            # stream the object out of MinIO and re-upload it to Geyser
            body = minio_s3.get_object(Bucket=bucket_name, Key=key)['Body'].read()
            geyser_s3.put_object(Bucket='cold-archive-bucket', Key=key, Body=body)
            # Delete from MinIO once archived
            minio_s3.delete_object(Bucket=bucket_name, Key=key)


Step 6: Validate and Monitor

  • Use mc ls geyser/cold-archive-bucket to confirm successful archival

  • Monitor logs and object versions

  • Optionally implement event notifications (MinIO supports webhook and AMQP triggers)


Advanced: Using Geyser as a Transitory Tier

For workflows where Geyser is used as a staging/overflow zone:

  • Write directly to Geyser via S3 client SDKs or backup tools

  • Use metadata tagging (x-amz-meta-archive-reason) for traceability

  • Retrieve when needed into MinIO using parallel mc mirror or boto3


Example:

Shell

mc mirror --overwrite geyser/cold-archive-bucket local/my-hot-bucket


Or via Python:

Python

geyser_s3.download_file('cold-archive-bucket', 'object-key', 'downloads/object-key')

minio_s3.upload_file('downloads/object-key', 'my-hot-bucket', 'object-key')
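For the write path into the transitory tier, the metadata tagging mentioned above maps to boto3's `Metadata` argument (keys are sent as `x-amz-meta-*` headers). A sketch with illustrative bucket and reason values:

```python
def archive_metadata(reason, source="minio"):
    """Metadata recorded on the archived object for traceability."""
    return {"archive-reason": reason, "archive-source": source}

def stage_to_geyser(geyser_s3, bucket, key, body, reason):
    """Write an object directly to the Geyser transitory tier, tagged
    with the reason it was staged there."""
    geyser_s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        Metadata=archive_metadata(reason),
    )
```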


Security & Access Control

  • Use IAM policies (on Geyser side) to restrict access by prefix, IP, or time

  • Use TLS/SSL for all S3 traffic

  • Enable object lock/versioning if using Geyser for compliance storage
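A prefix-scoped policy might look like the following — a hypothetical sketch in standard S3/IAM policy grammar; the exact fields Geyser supports (IP or time conditions, for example) depend on their IAM implementation:

```python
def prefix_read_policy(bucket, prefix):
    """Build an IAM-style policy allowing reads only under one prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/{prefix}*",
                ],
            }
        ],
    }
```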


Performance Considerations

  • Geyser Data provides faster Time to First Byte than traditional archive solutions like Glacier or Deep Archive

  • Use multipart uploads for large objects

  • Geyser has no egress or retrieval fees, enabling aggressive lifecycle tiering without cost penalties


Conclusion

This integration empowers MinIO users with cloud-like tiering to a cost-effective cold storage backend without vendor lock-in or expensive retrieval penalties. Whether you're archiving old data or managing overflow in a transitory model, Geyser Data offers a highly compatible and economical solution.


Appendix: Useful Commands

Shell

# Sync MinIO bucket to Geyser

mc mirror local/my-hot-bucket geyser/cold-archive-bucket


# Restore archive to MinIO

mc mirror geyser/cold-archive-bucket local/my-hot-bucket


# Set object retention (if using object lock)

mc retention set --default GOVERNANCE 365d geyser/cold-archive-bucket

