This directory is a reference recipe for training SpatialTranscriptFormer on the HEST-1k benchmark dataset.
While the core SpatialTranscriptFormer framework is dataset-agnostic, this recipe provides a complete, out-of-the-box pipeline for reproducing our benchmarks, including data downloading, preprocessing, and specialized dataloaders.

- `dataset.py`: Contains `HEST_Dataset` and `HEST_FeatureDataset`, which subclass `SpatialDataset` to handle the specific `.h5ad` structure and metadata conventions of the HEST dataset.
- `io.py`: Utilities for reading spatial graphs, coordinates, and `.h5ad` matrices.
- `utils.py`: HEST-specific dataset setup routines, splitting logic, and vocabulary loading.
- `download.py`: Logic for fetching subsets of the gated HEST dataset from Hugging Face.

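To make the idea of fetching a subset concrete, here is a minimal sketch of one way to download only selected samples with `huggingface_hub.snapshot_download` and its `allow_patterns` filter. It is not the code in `download.py`: the helper names (`build_allow_patterns`, `select_files`, `download_subset`), the default `local_dir`, and the repository ID `MahmoodLab/hest` are assumptions for illustration. Because the dataset is gated, the download call only works after access has been granted on Hugging Face and you have authenticated locally.

```python
from fnmatch import fnmatch


def build_allow_patterns(sample_ids):
    """Return glob patterns that select files belonging to the given sample IDs."""
    # "[_.]" requires "_" or "." right after the ID, so "TENX9" does not also match "TENX95".
    return [f"*{sid}[_.]*" for sid in sample_ids]


def select_files(filenames, patterns):
    """Preview locally which filenames would pass the glob patterns."""
    return [name for name in filenames if any(fnmatch(name, p) for p in patterns)]


def download_subset(sample_ids, local_dir="hest_data", repo_id="MahmoodLab/hest"):
    """Download only the files for `sample_ids` (hypothetical helper, assumed repo ID)."""
    # Imported lazily so the pattern helpers work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        repo_type="dataset",
        local_dir=local_dir,
        allow_patterns=build_allow_patterns(sample_ids),
    )
```

`select_files` lets you check the patterns against a list of filenames before starting a large download. The separator check in the pattern is a design choice that avoids pulling in samples whose IDs merely share a prefix.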
For complete CLI usage and training preset commands, refer to the main README.md and the Training Guide.