A Python toolkit for Scientific Data Platform operations, providing utilities for MinIO object storage and SURFdrive file transfers.
- 🔐 Multi-account configuration support (WO, HO, ML, VIZ)
- 📁 Upload and download files
- 🪣 Bucket management and listing
- 🔍 List and verify uploaded objects
- ⚡ Environment-based credential management
- 📥 Download CSV files from SURFdrive public shares
- 🔒 HTTP Basic Authentication support
- 📊 Direct pandas DataFrame integration
- 🌐 WebDAV protocol support
pip install sdp-tools
# Clone the repository
git clone https://github.com/cedanl/sdp-tools
cd sdp-tools
# Install with uv (recommended)
uv pip install -e ".[dev,test]"
# Or with pip
pip install -e ".[dev,test]"
The MinIO module supports multiple accounts through environment variable naming patterns.
Set the environment variables for your account. The example below uses HO; substitute WO, ML, or VIZ for other accounts:
export MINIO_HO_ACCESS_KEY="your-access-key"
export MINIO_HO_SECRET_KEY="your-secret-key"
export MINIO_HO_ENDPOINT="https://minio.example.com"
export MINIO_HO_BUCKET="your-bucket-name"
from minio_file import minio_file
# Initialize client for specific account
ho = minio_file("HO")
# Upload a file
ho.upload_file("local_file.txt", "remote/path/file.txt")
# Download a file
ho.download_file("local_download.txt", "remote/path/file.txt")
# List all files in bucket
ho.get_file_list()
# Get all available buckets
buckets = ho.get_buckets()
for bucket in buckets:
    print(bucket.name)
Download CSV files from SURFdrive public shares directly into pandas DataFrames.
export SURFDRIVE_SHARE_TOKEN="your-share-token"
export SURFDRIVE_PASSWORD="your-password"
from surfdrive import download_surfdrive_csv
# Download CSV and get DataFrame
df = download_surfdrive_csv("data.csv")
if df is not None:
    print(f"Downloaded {len(df)} rows")
    print(df.head())
    # Save locally if needed
    df.to_csv("local_copy.csv", index=False)
Or use the CLI:
# Set environment variables first
export SURFDRIVE_SHARE_TOKEN="your-token"
export SURFDRIVE_PASSWORD="your-password"
# Download and save CSV
python -m surfdrive.surfdrive_download output.csv
Each account uses a prefix pattern: MINIO_{ACCOUNT}_*
| Variable Pattern | Required | Description | Accounts |
|---|---|---|---|
| `MINIO_{ACCOUNT}_ACCESS_KEY` | ✅ | MinIO access key | WO, HO, ML, VIZ |
| `MINIO_{ACCOUNT}_SECRET_KEY` | ✅ | MinIO secret key | WO, HO, ML, VIZ |
| `MINIO_{ACCOUNT}_ENDPOINT` | ✅ | MinIO endpoint URL | WO, HO, ML, VIZ |
| `MINIO_{ACCOUNT}_BUCKET` | ✅ | Target bucket name | WO, HO, ML, VIZ |
Example for HO account:
- `MINIO_HO_ACCESS_KEY`
- `MINIO_HO_SECRET_KEY`
- `MINIO_HO_ENDPOINT`
- `MINIO_HO_BUCKET`
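The naming pattern above can be checked before a client is created. The following is a minimal sketch, not the package's own implementation: `load_minio_config`, `VALID_ACCOUNTS`, and `REQUIRED_SUFFIXES` are hypothetical names introduced here for illustration only.

```python
import os

# Account names and variable suffixes taken from the table above.
VALID_ACCOUNTS = ("WO", "HO", "ML", "VIZ")
REQUIRED_SUFFIXES = ("ACCESS_KEY", "SECRET_KEY", "ENDPOINT", "BUCKET")


def load_minio_config(account, env=None):
    """Collect the MINIO_{ACCOUNT}_* settings into a dict.

    Hypothetical helper for illustration; it is not part of sdp-tools.
    """
    env = os.environ if env is None else env
    if account not in VALID_ACCOUNTS:
        raise ValueError(f"Incorrect account {account}")

    config = {}
    missing = []
    for suffix in REQUIRED_SUFFIXES:
        name = f"MINIO_{account}_{suffix}"
        value = env.get(name)
        if value:
            config[suffix.lower()] = value
        else:
            missing.append(name)

    if missing:
        # Report every missing variable at once instead of failing one by one.
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return config
```

Passing a plain dict as `env` makes the check easy to exercise in tests without touching the real environment.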
| Variable | Required | Description |
|---|---|---|
| `SURFDRIVE_SHARE_TOKEN` | ✅ | Public share token from SURFdrive |
| `SURFDRIVE_PASSWORD` | ✅ | Password for the public share |
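To illustrate how these two variables could feed HTTP Basic Authentication, here is a small sketch using only the standard library. It assumes the share token is used as the Basic-auth user name, which is a common convention for password-protected public shares but is not confirmed by this README; `surfdrive_auth_header` is a hypothetical helper, not a function of the `surfdrive` module.

```python
import base64
import os


def surfdrive_auth_header(env=None):
    """Build an HTTP Basic Authorization header from the SURFdrive variables.

    Hypothetical helper for illustration; it is not part of sdp-tools.
    """
    env = os.environ if env is None else env
    token = env.get("SURFDRIVE_SHARE_TOKEN")
    password = env.get("SURFDRIVE_PASSWORD")
    if not token or not password:
        raise RuntimeError(
            "Set SURFDRIVE_SHARE_TOKEN and SURFDRIVE_PASSWORD first"
        )

    # Assumption: the share token acts as the user name for Basic auth.
    credentials = f"{token}:{password}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": "Basic " + encoded}
```

The returned dict can be passed as request headers to whichever HTTP client is in use.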
# Install development dependencies
make install-dev
# Or manually
uv pip install -e ".[dev,test]"
# Run all fast tests
make test-fast
# Run all tests including slow ones
make test
# Run with coverage
make test-coverage
# Run specific test file
pytest tests/test_surfdrive.py -v
# Format code
make format
# Run linters
make lint
# Run all checks (lint + tests)
make check
# Build package
make build
# Build and verify
make build-check
sdp-tools/
├── src/
│   ├── minio_file/           # MinIO operations module
│   │   ├── __init__.py
│   │   └── minio_file.py
│   └── surfdrive/            # SURFdrive operations module
│       ├── __init__.py
│       └── surfdrive_download.py
├── tests/                    # Test suite
│   ├── test_imports.py
│   ├── test_functionality.py
│   ├── test_surfdrive.py
│   └── test_and_build_distribution.py
├── docs/                     # Documentation
├── pyproject.toml            # Project configuration
├── Makefile                  # Development commands
└── README.md                 # This file
Initialize a MinIO client for a specific account.
Parameters:
- `account` (str): Account identifier. Must be one of: "WO", "HO", "ML", "VIZ"
Methods:
- `get_buckets()` → list: Retrieve all available buckets
- `upload_file(file_name, full_name)`: Upload a file to MinIO
  - `file_name`: Local file path
  - `full_name`: Remote object path
- `download_file(file_name, full_name)`: Download a file from MinIO
  - `file_name`: Local destination path
  - `full_name`: Remote object path
- `get_file_list()`: Print all objects in the bucket
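Because `upload_file` takes a local path and a remote object path, uploading a whole folder is a matter of walking the directory and mapping each file to a remote name. The sketch below shows one way to do that; `upload_directory` is a hypothetical helper, not part of the package, and it relies only on the `upload_file(file_name, full_name)` signature listed above.

```python
from pathlib import Path


def upload_directory(client, local_dir, remote_prefix):
    """Upload every file under local_dir, mirroring its layout under remote_prefix.

    `client` is any object exposing upload_file(file_name, full_name),
    for example a minio_file instance. Hypothetical helper for illustration.
    """
    root = Path(local_dir)
    uploaded = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue  # skip directories; only files become objects
        relative = path.relative_to(root).as_posix()
        remote_name = f"{remote_prefix.rstrip('/')}/{relative}"
        client.upload_file(str(path), remote_name)
        uploaded.append(remote_name)
    return uploaded
```

With a configured account this might be called as `upload_directory(minio_file("ML"), "exports", "datasets")`; the folder and prefix names here are placeholders.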
Example:
from minio_file import minio_file
# Initialize for ML account
ml = minio_file("ML")
# Upload
ml.upload_file("data.csv", "datasets/data.csv")
# Download
ml.download_file("local_data.csv", "datasets/data.csv")
Download a CSV file from a SURFdrive public share.
Parameters:
- `filename` (str): Name of the file to download (currently unused; the file is downloaded from the configured share)
Returns:
- `pandas.DataFrame` or `None`: DataFrame with the CSV data, or `None` if the download fails
Environment Variables Required:
- `SURFDRIVE_SHARE_TOKEN`
- `SURFDRIVE_PASSWORD`
Example:
from surfdrive import download_surfdrive_csv
df = download_surfdrive_csv("data.csv")
if df is not None:
    print(f"Shape: {df.shape}")
    print(df.describe())
Incorrect account {account}
Solution: Use only valid account names: "WO", "HO", "ML", or "VIZ"
Missing required environment variables
Solution: Set all required environment variables for your account (ACCESS_KEY, SECRET_KEY, ENDPOINT, BUCKET)
Failed to connect to MinIO
Solution:
- Verify that `MINIO_{ACCOUNT}_ENDPOINT` is correct and accessible
- Check network connectivity
- Ensure the MinIO server is running
Error: 401
Solution:
- Verify that `SURFDRIVE_SHARE_TOKEN` and `SURFDRIVE_PASSWORD` are correct
- Check that the share is still active and accessible
Error: 404
Solution:
- Verify the share URL is correct
- Check that the file exists in the shared folder
Connection timeout
Solution:
- Check internet connectivity
- Verify SURFdrive service is accessible
- Try again later if service is experiencing issues
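The SURFdrive errors listed above can also be turned into messages inside a script. The following sketch maps the documented cases (401, 404, timeout) to the corresponding advice; `surfdrive_hint` is a hypothetical helper written for this README and does not exist in the package.

```python
def surfdrive_hint(status_code=None, timed_out=False):
    """Return troubleshooting advice for a failed SURFdrive download.

    Hypothetical helper for illustration; the hints restate the
    troubleshooting entries in this README.
    """
    if timed_out:
        return (
            "Check internet connectivity and that SURFdrive is accessible; "
            "try again later if the service is experiencing issues."
        )
    hints = {
        401: (
            "Verify SURFDRIVE_SHARE_TOKEN and SURFDRIVE_PASSWORD, and check "
            "that the share is still active."
        ),
        404: (
            "Verify the share URL and check that the file exists in the "
            "shared folder."
        ),
    }
    # Fall back to a generic message for statuses this README does not cover.
    return hints.get(status_code, f"Unexpected HTTP status: {status_code}")
```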
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
- Fork the repository
- Create a feature branch: `git checkout -b feature-name`
- Make your changes
- Run tests: `make test`
- Run linters: `make lint`
- Commit your changes: `git commit -m "Description"`
- Push to your fork: `git push origin feature-name`
- Create a Pull Request
MIT License - see LICENSE file for details.
Quick Start: 5-Minute Trusted Publisher Setup ⚡
For complete information on publishing to PyPI:
- Trusted Publisher Quick Start - Already uploaded? Start here!
- Complete Test PyPI Setup Guide - First time? Read this.
- Deployment Documentation - Overview and troubleshooting
- Issues: https://github.com/cedanl/sdp-tools/issues
- Documentation: https://github.com/cedanl/sdp-tools/tree/main/docs
- Deployment: https://github.com/cedanl/sdp-tools/tree/main/docs/deployment
Current version: 2025.1.9
See CHANGELOG.md for version history.