RadioAstronomy.io

Computational astronomy research investigating galaxy evolution, cosmic large-scale structure, and quasar physics using DESI and other modern spectroscopic surveys

🔭 Mission

This organization produces research outputs in astronomy and data science, building analysis-ready datasets (ARDs) from large public sources. The methodology was validated through the Steam Dataset 2025, a multi-modal gaming analytics ARD with strong engagement and downloads on both Kaggle and Zenodo, and is now being applied to DESI Data Release 1 (DR1) spectroscopic data.

Current work spans galaxy evolution in different cosmic environments, AGN feedback mechanisms, and ML-driven spectral analysis. The research runs on purpose-built infrastructure designed for reproducibility at scale. Building and operating that infrastructure also serves as a skill multiplier across systems engineering, DevOps, security, and machine learning.

📦 Repositories

Repository	Domain	Description	Status
proxmox-astronomy-lab	Infrastructure	Platform documentation, VM inventory, network architecture	Production
desi-cosmic-void-galaxies	Research	ARD factory + environmental quenching in cosmic voids	Active
desi-quasar-outflows	Research	AGN outflow spectral fitting and Cloudy modeling	Planned
desi-qso-anomaly-detection	Research	ML anomaly detection for quasar spectra	Planned
rbh1-validation-reanalysis	Research	Independent reanalysis of RBH-1 hypervelocity SMBH candidate	Active
year-of-code-2026	Development	2026 project sandbox: AI, ML, agentic coding, cloud infrastructure	Active
.github	Meta	Organization profile and templates	—

🔬 Active Repositories

Proxmox Astronomy Lab

The infrastructure foundation for all research workloads. Documents the 7-node Proxmox cluster, VM inventory, network architecture, and automation patterns. This is the platform that enables reproducible, scalable research across all projects.

DESI Cosmic Void Galaxies

Analyzing galaxy populations within cosmic voids using DESI Data Release 1 to investigate environmental quenching mechanisms. This project serves as the Analysis-Ready Dataset (ARD) factory for the organization, joining 9 Value-Added Catalogs into enriched data products that feed downstream research.

DESI Quasar Outflows

Investigating AGN-driven outflows through semi-automated spectral fitting combined with Cloudy photoionization modeling. Developing automated pipelines to identify and characterize outflows in massive spectroscopic datasets.
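As a small illustration of the kind of quantity an outflow pipeline might derive, the sketch below converts a quasar's systemic redshift and an absorber's redshift into a line-of-sight velocity using the standard relativistic Doppler relation. The function name and the example redshifts are hypothetical, and nothing here is taken from the project's actual fitting code.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def outflow_velocity_km_s(z_systemic: float, z_absorber: float) -> float:
    """Line-of-sight velocity of an absorber relative to the quasar rest frame.

    Positive values mean the gas moves toward the observer (an outflow).
    Uses the relativistic Doppler relation with R = (1 + z_sys) / (1 + z_abs),
    so that beta = (R^2 - 1) / (R^2 + 1).
    """
    ratio_sq = ((1.0 + z_systemic) / (1.0 + z_absorber)) ** 2
    beta = (ratio_sq - 1.0) / (ratio_sq + 1.0)
    return beta * C_KM_S


# Hypothetical example: quasar at z = 2.0, absorption trough at z = 1.9.
v_example = outflow_velocity_km_s(z_systemic=2.0, z_absorber=1.9)
```

An absorber at the systemic redshift gives zero velocity, and an absorber at lower redshift than the quasar gives a positive (outflowing) velocity.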

DESI Anomalous Quasar Detection

ML-based anomaly detection across millions of quasar spectra. Implementing 1D convolutional variational autoencoders on Ray clusters to identify statistically unusual objects that may represent new physics or rare phenomena.
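To make the idea of "statistically unusual" concrete, the sketch below computes a per-spectrum anomaly score from the two terms of a variational autoencoder objective: a reconstruction error and a KL divergence to a standard normal prior. The scoring rule, the `beta` weight, and all function names are assumptions for illustration. The project's actual model and scoring are not described here, and a real implementation would use a deep learning framework rather than plain Python lists.

```python
import math
from typing import Sequence


def kl_to_standard_normal(mu: Sequence[float], logvar: Sequence[float]) -> float:
    """KL divergence from a diagonal Gaussian N(mu, exp(logvar)) to N(0, I)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


def reconstruction_mse(flux: Sequence[float], recon: Sequence[float]) -> float:
    """Mean squared error between an input spectrum and its reconstruction."""
    return sum((f - r) ** 2 for f, r in zip(flux, recon)) / len(flux)


def anomaly_score(
    flux: Sequence[float],
    recon: Sequence[float],
    mu: Sequence[float],
    logvar: Sequence[float],
    beta: float = 1.0,
) -> float:
    """Hypothetical score: higher means the model explains the spectrum less well."""
    return reconstruction_mse(flux, recon) + beta * kl_to_standard_normal(mu, logvar)


# Toy values standing in for encoder and decoder outputs (not real DESI data).
typical = anomaly_score([1.0, 1.1, 0.9], [1.0, 1.0, 1.0], mu=[0.0], logvar=[0.0])
unusual = anomaly_score([1.0, 3.0, 0.9], [1.0, 1.0, 1.0], mu=[2.0], logvar=[0.0])
```

Under this assumed rule, spectra would be ranked by score and the highest-scoring objects passed on for inspection.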

RBH-1 Validation Reanalysis

Independent validation and reanalysis of the RBH-1 hypervelocity SMBH candidate (van Dokkum et al. 2025) using Bayesian inference and GPU-accelerated computing.

Year of Code 2026

2026 project sandbox covering AI, ML, agentic coding, RAG systems, cloud infrastructure, and the occasional side project. A space for experimentation and skill development across the full technology stack.

📊 Data Assets

Our research consumes DESI Data Release 1 Value-Added Catalogs, materialized through PostgreSQL and distributed as Parquet files.

DESI DR1 Value-Added Catalogs

VAC	Purpose	Scale
FastSpecFit	Stellar continuum modeling, emission line fluxes	6.4M galaxies
PROVABGS	Bayesian SED fitting, stellar mass, SFH	BGS sample
DESIVAST	Void classifications (4 algorithms)	~10.7K voids
Gfinder	Group catalog, halo mass estimates	Group members
AGN/QSO	Systemic redshifts, BAL flags, spectral classification	1.4M QSOs
CIV Absorber	Intervening CIV absorption systems	Absorber catalog
MgII Absorber	Intervening MgII absorption systems	Absorber catalog
QMassIron	Black hole masses, bolometric luminosity	QSO subset
Stellar Mass/EmLine	CIGALE stellar masses, emission line properties	Full sample

Data Pipeline

PostgreSQL serves as the materialization engine where VAC joins and derived computations occur. Final ARD products are exported to Parquet for distribution and analysis. The pipeline currently manages ~32 GB of catalog data in PostgreSQL and ~108 GB of spectral tiles in Parquet format.
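The join-then-materialize step described above can be sketched as follows. This is a minimal illustration, not the project's pipeline: an in-memory SQLite database stands in for PostgreSQL so the example runs anywhere, and the table names, column names, and values are all hypothetical. The one grounded detail is the join key, since DESI catalogs identify objects by `TARGETID`.

```python
import sqlite3

# In-memory SQLite stands in for the PostgreSQL materialization engine.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Toy stand-ins for three catalogs (hypothetical columns and values).
    CREATE TABLE fastspecfit (targetid INTEGER PRIMARY KEY, halpha_flux REAL);
    CREATE TABLE provabgs (targetid INTEGER PRIMARY KEY, log_mstar REAL);
    CREATE TABLE void_membership (targetid INTEGER PRIMARY KEY, in_void INTEGER);

    INSERT INTO fastspecfit VALUES (1, 12.5), (2, 3.1), (3, 0.4);
    INSERT INTO provabgs VALUES (1, 10.2), (2, 9.7);
    INSERT INTO void_membership VALUES (1, 1), (3, 0);

    -- Materialize the joined product once. LEFT JOINs keep galaxies that are
    -- missing from an individual catalog; their columns become NULL.
    CREATE TABLE ard_galaxies AS
    SELECT f.targetid, f.halpha_flux, p.log_mstar, v.in_void
    FROM fastspecfit AS f
    LEFT JOIN provabgs AS p ON p.targetid = f.targetid
    LEFT JOIN void_membership AS v ON v.targetid = f.targetid;
    """
)

rows = conn.execute(
    "SELECT targetid, halpha_flux, log_mstar, in_void "
    "FROM ard_galaxies ORDER BY targetid"
).fetchall()
```

In a real pipeline the materialized table would then be written out to Parquet with a library such as pyarrow, a step omitted here to keep the sketch dependency-free.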

🏗️ Platform Architecture

Production research platform running on a 7-node Proxmox cluster built from small form factor enterprise workstations. The cluster provides dedicated database servers, GPU compute, and Kubernetes orchestration for containerized workloads.

Cluster Specifications

Resource	Value
Nodes	7
Total Threads	144
Total RAM	704 GB
Total NVMe	26 TB
Network Fabric	10G LACP per node
GPU	RTX A4000 16GB

Node Inventory

Node	CPU	Threads	RAM	Role
node01	i9-12900H	20	96 GB	Compute (K8s)
node02	i5-12600H	16	96 GB	Light compute + 6TB storage
node03	i9-12900H	20	96 GB	Compute (K8s)
node04	i9-12900H	20	96 GB	Compute (K8s)
node05	i5-12600H	16	96 GB	Light compute + 6TB storage
node06	i9-13900H	20	96 GB	Heavy compute (databases)
node07	AMD 5950X	32	128 GB	GPU compute
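The cluster totals can be cross-checked against the node inventory with a few lines of arithmetic. The sketch below copies the per-node CPU counts and RAM figures from the table above. The variable names are illustrative only.

```python
# (node, logical CPU count, RAM in GB) as listed in the node inventory.
NODES = [
    ("node01", 20, 96),
    ("node02", 16, 96),
    ("node03", 20, 96),
    ("node04", 20, 96),
    ("node05", 16, 96),
    ("node06", 20, 96),
    ("node07", 32, 128),
]

total_cpus = sum(cpus for _, cpus, _ in NODES)
total_ram_gb = sum(ram for _, _, ram in NODES)
```

Both sums agree with the cluster specifications: 144 for the CPU total and 704 GB of RAM across 7 nodes.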

VM Inventory

Research workloads run on dedicated VMs with role-specific resource allocation.

VM	IP	vCPU	RAM	Purpose
radio-k8s01	10.25.20.4	12	48G	Kubernetes primary node
radio-k8s02	—	12	48G	Kubernetes worker
radio-k8s03	—	12	48G	Kubernetes worker
radio-gpu01	10.25.20.10	12	48G	GPU worker (A4000) + K8s
radio-pgsql01	10.25.20.8	8	32G	Research PostgreSQL (pgvector, PostGIS)
radio-pgsql02	10.25.20.16	4	16G	Application PostgreSQL
radio-neo4j01	10.25.20.21	6	24G	Graph database
radio-fs02	10.25.20.15	4	6G	SMB file server (spectral data)
radio-agents01	10.25.20.20	8	32G	AI agents, monitoring stack
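For a rough sense of headroom, the sketch below sums the vCPU and RAM allocations of the VMs listed above and compares them with the cluster totals (144 and 704 GB). It assumes the table is the complete set of VMs, which the source does not state, and it ignores hypervisor overhead and CPU overcommit.

```python
# VM name -> (vCPU, RAM in GB), copied from the VM inventory.
VMS = {
    "radio-k8s01": (12, 48),
    "radio-k8s02": (12, 48),
    "radio-k8s03": (12, 48),
    "radio-gpu01": (12, 48),
    "radio-pgsql01": (8, 32),
    "radio-pgsql02": (4, 16),
    "radio-neo4j01": (6, 24),
    "radio-fs02": (4, 6),
    "radio-agents01": (8, 32),
}

# Cluster totals from the cluster specifications.
CLUSTER_CPUS = 144
CLUSTER_RAM_GB = 704

allocated_vcpu = sum(vcpu for vcpu, _ in VMS.values())
allocated_ram_gb = sum(ram for _, ram in VMS.values())

vcpu_fraction = allocated_vcpu / CLUSTER_CPUS
ram_fraction = allocated_ram_gb / CLUSTER_RAM_GB
```

Under these assumptions the listed VMs account for 78 vCPUs and 302 GB of RAM, a little over half of the cluster's CPU capacity and under half of its memory.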

Platform Capabilities

Hybrid Kubernetes + VM Architecture: RKE2 orchestration with strategic static VMs for databases and persistent services
Enterprise Security Baseline: CIS Controls implementation with research workflow accommodations
Secure Remote Access: Entra ID hybrid identity with Cloudflare ZTNA
Open Source Toolchain: GitOps automation, container orchestration, scientific computing workflows

🤝 OSS Program Support

This organization benefits from open source programs that provide tooling to qualifying public repositories.

Active Programs

Program	Provides	Use Case
CodeRabbit	AI code review (Pro tier)	PR review, CLI integration with agentic coding tools
Atlassian	Jira, Confluence (Standard)	Project tracking, documentation

Available for Future Use

Program	Provides	Planned Use
Snyk	Security scanning	Dependency vulnerability detection
SonarCloud	Code quality	Static analysis
Sentry	Error tracking	Runtime monitoring
Datadog	Observability	Metrics, logs, APM

🌟 Open Science Philosophy

We practice open science and open methodology — our version of "showing your work":

Research methodologies are fully documented and repeatable
Infrastructure configurations are version-controlled and automated
Scripts and pipelines are published so others can learn, adapt, or improve them
Learning processes are captured and shared for community benefit

Our hope is that these materials help someone facing similar challenges, or inspire collaboration that helps us. All projects operate under open source licenses (primarily MIT) so that others can freely reuse and reproduce the work.

📚 Documentation

Documentation Hub: docs.radioastronomy.io
GitHub Discussions: Technical discussions and collaboration
Issue Tracking: Project-specific development milestones

📄 License

Projects in this organization are licensed under MIT unless otherwise specified.

Computational astronomy research through open data, reproducible workflows, and enterprise infrastructure
