This work describes an approach to building a hybrid dataset for research on how validators are selected and how they behave in Proof-of-Stake (PoS) blockchain networks. Public data sources offer many validator statistics, but they lack clear behavioral labels, temporal organization, and examples of adversarial situations, all of which are critical for meaningful analysis and for developing reliable algorithms. To address these gaps, we designed a four-step process. First, we gather real-world data from PoS networks such as Ethereum 2.0, Cosmos, and Polkadot. Second, we generate synthetic data through simulations, labeling a range of cooperative and adversarial behaviors. Third, we enrich the dataset with engineered features, behavioral annotations, and interpretability scores such as trust or reliability. Finally, we organize the data to support a wide range of research, including supervised learning and reinforcement learning. By combining real and simulated data with clear documentation and organization, the dataset provides a foundation for studying more trustworthy ways to select validators in decentralized systems. Our overarching goal is a resource that supports transparent, secure, and fair decision-making in blockchain systems.
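The four-step process described above can be illustrated with a small sketch. It is not the dataset's actual schema or pipeline: the `ValidatorRecord` fields, the simulated behavior ranges, the 30% adversarial rate, and the trust formula are all assumptions chosen for illustration, and the "real" record is a hand-written placeholder rather than collected data.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ValidatorRecord:
    """Hypothetical record layout; field names are illustrative only."""
    validator_id: str
    network: str
    epoch: int
    uptime: float            # fraction of duties performed, in [0, 1]
    missed_blocks: int
    source: str              # "real" or "synthetic"
    label: Optional[str] = None        # behavioral label, if known
    trust_score: Optional[float] = None


def simulate_records(n: int, seed: int = 0) -> List[ValidatorRecord]:
    """Step 2 (sketch): generate labeled synthetic validators."""
    rng = random.Random(seed)
    records = []
    for i in range(n):
        adversarial = rng.random() < 0.3  # assumed adversarial rate
        if adversarial:
            uptime, missed = rng.uniform(0.2, 0.7), rng.randint(5, 20)
        else:
            uptime, missed = rng.uniform(0.9, 1.0), rng.randint(0, 2)
        records.append(ValidatorRecord(
            validator_id=f"sim-{i}", network="simulated", epoch=i,
            uptime=uptime, missed_blocks=missed, source="synthetic",
            label="adversarial" if adversarial else "cooperative",
        ))
    return records


def add_trust_score(record: ValidatorRecord, max_missed: int = 20) -> ValidatorRecord:
    """Step 3 (sketch): a toy trust score, uptime discounted by missed blocks."""
    penalty = min(record.missed_blocks, max_missed) / max_missed
    record.trust_score = round(record.uptime * (1.0 - penalty), 4)
    return record


def build_hybrid(real: List[ValidatorRecord],
                 synthetic: List[ValidatorRecord]) -> List[ValidatorRecord]:
    """Step 4 (sketch): merge both sources and order them by time."""
    combined = [add_trust_score(r) for r in real + synthetic]
    return sorted(combined, key=lambda r: (r.epoch, r.validator_id))


# Step 1 is stood in for by a single placeholder record (not real data).
real = [ValidatorRecord("val-001", "ethereum", 0, 0.99, 1, "real")]
dataset = build_hybrid(real, simulate_records(5))
```

Keeping a `source` field on every record lets downstream experiments train on synthetic labels while evaluating on unlabeled real records, or filter either source out entirely.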
byzantron-research/aibyz-dataset