A Go-based REST API for handling file uploads with Cloudinary integration and ML tag inference. Built with the Gin framework.
(yes, I made the logo in like 30 seconds)
I used to have my Next.js application (https://github.com/bpbrianpark/jaybird/) hooked up to the Cloudinary API directly, but I found that it was quite slow when uploading many large files at once.
Goroutines let the service handle multiple file uploads concurrently, which should feel much faster from the user's perspective.
I created this project as a way to practice Go and to let file uploads proceed without being blocked by the tagging system (which uses the Python ML inference service).
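To make the goroutine idea concrete, here is a minimal sketch of fanning out uploads, one goroutine per file. It is not the code from this repo: `uploadFunc` and `uploadAll` are hypothetical names, and the fake uploader in `main` only sleeps to simulate network latency.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// uploadFunc stands in for the real Cloudinary upload call (hypothetical).
type uploadFunc func(name string) (string, error)

// uploadAll starts one goroutine per file and waits for all of them.
// Each goroutine writes only to its own slice index, so no mutex is needed.
func uploadAll(names []string, upload uploadFunc) ([]string, []error) {
	urls := make([]string, len(names))
	errs := make([]error, len(names))
	var wg sync.WaitGroup
	for i, name := range names {
		wg.Add(1)
		go func(i int, name string) {
			defer wg.Done()
			urls[i], errs[i] = upload(name)
		}(i, name)
	}
	wg.Wait()
	return urls, errs
}

func main() {
	fake := func(name string) (string, error) {
		time.Sleep(100 * time.Millisecond) // simulate network latency
		return "https://example.invalid/" + name, nil
	}
	start := time.Now()
	urls, _ := uploadAll([]string{"a.jpg", "b.jpg", "c.jpg"}, fake)
	// The three uploads overlap, so the total time is close to one upload.
	fmt.Println(urls, time.Since(start))
}
```

For very large batches you would probably want to cap the number of goroutines (for example with a buffered channel used as a semaphore) so you don't open hundreds of connections at once.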
- 📤 File Upload: Accept image and video uploads via multipart form data
- ☁️ Cloudinary Integration: Automatic upload to Cloudinary with metadata
- 🏷️ Tag Management: Combine user-provided tags with ML-generated tags
- 🧠 ML Tag Inference: Integrate with Python ML service for automatic tag generation
- 📦 Smart File Handling: In-memory buffers for small files, temp files for large files (≥50MB)
- ✅ File Validation: Validates file types (images: JPEG, PNG, GIF, WebP; videos: MP4, MPEG, QuickTime, etc.)
- 🔄 Graceful Degradation: Continues upload even if ML service is unavailable
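As an illustration of the file validation feature, here is a small sketch of an allow-list check on the declared content type. The MIME strings are the standard ones for the formats named above; the list is an assumption and is not exhaustive (the "etc." is not filled in), and `isAllowedType` is a hypothetical helper, not necessarily how the service validates.

```go
package main

import (
	"fmt"
	"mime"
)

// allowedTypes covers only the formats named in the feature list (assumed set).
var allowedTypes = map[string]bool{
	"image/jpeg":      true,
	"image/png":       true,
	"image/gif":       true,
	"image/webp":      true,
	"video/mp4":       true,
	"video/mpeg":      true,
	"video/quicktime": true,
}

// isAllowedType parses a Content-Type value (parameters are ignored)
// and reports whether the media type is on the allow-list.
func isAllowedType(contentType string) bool {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false
	}
	return allowedTypes[mediaType]
}

func main() {
	for _, ct := range []string{"image/png", "video/quicktime", "application/pdf"} {
		fmt.Println(ct, isAllowedType(ct))
	}
}
```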
- Go 1.23.0 or later
- Docker and Docker Compose (for running the full stack)
- Cloudinary account (free tier available)
- Python 3.10+ (if running inference service locally)
- Install dependencies:
    go mod download
- Set up environment variables. Create a .env file in the project root:
    PORT=8080
    CLOUDINARY_CLOUD_NAME=your_cloud_name
    CLOUDINARY_API_KEY=your_api_key
    CLOUDINARY_API_SECRET=your_api_secret
    CORS_ALLOWED_URL=your_accepted_url
    TEMP_DIR=/tmp
    PYTHON_INFERENCE_URL=http://localhost:8000
(Although if you're going to deploy it, you need to set the inference URL to the proper one on whatever platform you use. Personally, I used Railway.)
Getting Cloudinary Credentials:
- Sign up at cloudinary.com
- Go to Dashboard → Settings
- Copy your Cloud Name, API Key, and API Secret
The easiest way to run both the Go Upload Service and the Python Inference Service:
    docker-compose up
This will start both services:
- Go API on http://localhost:8080
- Inference Service on http://localhost:8000
Start the Go API:
    go mod tidy
    go run cmd/api/main.go
Start the Inference Service:
    cd ml/inference
    pip install -r ../requirements-prod.txt
    uvicorn predictor:app --host 0.0.0.0 --port 8000
The server will start on the port specified in your .env file.
Health check endpoint.
Response:
    {
      "success": true
    }
Upload a file (image or video) with optional title and tags.
Request:
- Method: POST
- Content-Type: multipart/form-data
- Fields:
  - file (required): The image or video file
  - title (optional): Title for the upload
  - tags (optional): Comma-separated tags or array of tags
Success Response (200):
    {
      "success": true,
      "url": "https://res.cloudinary.com/.../image.jpg",
      "public_id": "gallery/abc123",
      "tags": ["nature", "outdoor", "Bald Eagle"]
    }
Error Responses:
- 400 Bad Request: Missing file or invalid file type
    { "error": "No file provided" }
- 500 Internal Server Error: Cloudinary upload failure or server error
    { "error": "Failed to upload to Cloudinary: ..." }
The service chooses how to stage each file based on its size:
- Small files (< 50MB): Read into memory buffer for faster processing
- Large files (≥ 50MB): Saved to temporary file to avoid memory issues
Temporary files are automatically cleaned up after processing.
The ML Tag Inference service is a separate Python microservice that automatically identifies and tags wildlife in your photos. Basically, it saves you from having to label the photos yourself using the tagging system in the Jaybird repository. Think of it like having somebody who looks at your photo and says "Hey, that's a Bald Eagle!" and properly identifies the animal.
Here's the flow in simple terms:
- You upload a photo → The Go API receives it
- The Go API sends the image → It converts your image to base64 and sends it to the Python inference service
- The ML model analyzes it → A deep learning model (MobileNetV3) that I trained looks at the image and tries to identify what wildlife is in it
- Tags come back → If the model is confident enough (above 50% confidence), it returns a tag like "Bald Eagle" or "Hummingbird"
- Tags get combined → Your manual tags + ML-generated tags = final tag list
- Everything uploads to Cloudinary → The file and all tags get stored together
Architecture:
- Built with FastAPI (Python web framework)
- Uses PyTorch for deep learning inference
- Model: MobileNetV3 (a lightweight, efficient neural network)
- Trained on 14 wildlife classes: Bald Eagle, Barred Owl, Black Bear, Cedar Waxwing, Cooper Hawk, Coyote, Great Blue Heron, Hummingbird, Osprey, Red-Tailed Hawk, Ring-Necked Pheasant, Sandhill Crane, Short-Eared Owl, and White-Tailed Deer
How the model works:
- The image gets preprocessed (resized to 224x224, normalized)
- The neural network processes it through multiple layers
- It outputs probabilities for each of the 14 wildlife classes
- If the highest probability is above 50%, that tag gets added
- The service returns both the tag and confidence scores for all classes
CORS is configured to allow requests from the origin set in CORS_ALLOWED_URL in your .env file. To change the allowed origin, update that environment variable.
See LICENSE file for details.
