This package demonstrates a data-processing workflow built from Bash scripts and Python conversion scripts. It automates the pre- and post-processing of VASP/Quantum ESPRESSO data and converts the results into machine learning interatomic potential (MLIP) training formats (extxyz or npz).
AtomProNet
│
├── Data collection from Materials Project database
│   │
│   ├── Atomic energy, position, lattice parameters
│   └── Supercell formation
│
├── Data generation using DFT simulation (VASP/Quantum ESPRESSO)
│   │
│   ├── Batch job preparation
│   ├── Batch job submission
│   └── Batch data collection
│
├── Pre-processing for Neural Network (Post-processing of DFT simulation)
│   │
│   └── DFT folders
│       │
│       ├── energy
│       ├── forces
│       ├── pressure
│       ├── lattice parameters
│       │
│       └── extxyz/npz format
│
└── Post-processing
    ├── Machine Learning Interatomic Potential (MLIP)
    │   │
    │   ├── Parity plots
    │   └── Cumulative distributions
    │
    └── Classical Molecular Dynamics (LAMMPS)
        │
        └── Computational Performance Assessment
            ├── Simulation cell size
            └── CPU allocation
This guide provides detailed instructions on how to install and use the AtomProNet package.
- Python 3.6 or later
- Pip (Python package manager)
- Bash Shell (e.g., Git Bash, Cygwin, or WSL on Windows) to execute .sh scripts.
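The prerequisites above can be checked before installing. The snippet below is a minimal sketch, not part of AtomProNet; the function name `check_prerequisites` is hypothetical, and it only tests that the interpreter is recent enough, that pip is importable, and that a `bash` executable is on the PATH.

```python
import importlib.util
import shutil
import sys


def check_prerequisites(min_python=(3, 6)):
    """Return a dict mapping each prerequisite name to True/False."""
    return {
        # The running interpreter must be at least Python 3.6.
        "python": sys.version_info >= min_python,
        # pip counts as available if it can be imported by this interpreter.
        "pip": importlib.util.find_spec("pip") is not None,
        # bash must be discoverable on the PATH to run the .sh scripts.
        "bash": shutil.which("bash") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```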
Install Using Git:
- Open a command prompt or terminal.
- Navigate to the directory where you want to download the package.
- Clone the repository and install the package by running the commands:
  git clone https://github.com/MusannaGalib/AtomProNet.git
  cd AtomProNet
  pip install .
Install Using PyPI:
- AtomProNet can also be installed from PyPI:
  pip install AtomProNet
This command installs the package along with its dependencies.
Example datasets are provided in the 'example_dataset' folder. To try them out, run the Python wrapper script with the following commands:
cd AtomProNet
python3 process_and_run_script.py
- Bash Scripts (.sh files):
  - Take a user-provided file path and prepare VASP and Quantum ESPRESSO job submissions
  - Take a user-provided file path and loop over all VASP and Quantum ESPRESSO simulation folders
  - Collect all the required information (energy, force, atomic positions, pressure in eV, lattice parameters)
- Python Converter (.py files):
  - Processes the files generated by the Bash scripts.
  - Outputs the converted npz and extxyz files.
  - Post-processes MLIP data to get parity plots and cumulative distributions.
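To make the two output formats concrete, the sketch below writes one configuration to an extended XYZ file and to an npz archive using only NumPy. It is an illustration under stated assumptions, not AtomProNet's converter: the function names and the npz key names (`symbols`, `positions`, `forces`, `lattice`, `energy`) are hypothetical, and the package may store its data under different keys.

```python
import numpy as np


def write_extxyz_frame(path, symbols, positions, forces, lattice, energy):
    """Append one configuration to an extended XYZ file."""
    positions = np.asarray(positions, dtype=float)
    forces = np.asarray(forces, dtype=float)
    # The three lattice vectors are written row by row as nine numbers.
    cell = np.asarray(lattice, dtype=float).ravel()
    lattice_str = " ".join(f"{x:.8f}" for x in cell)
    header = (
        f'Lattice="{lattice_str}" '
        "Properties=species:S:1:pos:R:3:forces:R:3 "
        f"energy={energy:.8f} "
        'pbc="T T T"'
    )
    with open(path, "a") as fh:
        # Line 1: atom count. Line 2: per-frame metadata. Then one line per atom.
        fh.write(f"{len(symbols)}\n{header}\n")
        for sym, pos, frc in zip(symbols, positions, forces):
            values = " ".join(f"{v:.8f}" for v in (*pos, *frc))
            fh.write(f"{sym} {values}\n")


def write_npz(path, symbols, positions, forces, lattice, energy):
    """Store the same configuration as named arrays (key names are assumed)."""
    np.savez(
        path,
        symbols=np.array(symbols),
        positions=np.asarray(positions, dtype=float),
        forces=np.asarray(forces, dtype=float),
        lattice=np.asarray(lattice, dtype=float),
        energy=np.asarray(energy, dtype=float),
    )
```

Calling `write_extxyz_frame` repeatedly on the same path appends further frames, which is how multi-configuration extxyz training files are usually laid out.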
To use this package, use the following options:
Choose an option:
1. Data from Materials Project
2. Pre-processing for DFT simulation
3. Pre-processing for Neural Network
4. Post-processing
Option 1
Enter your choice (1/2/3/4 or 'exit'): 1
Enter your Materials Project API key (press Enter to use default):
Enter the material ID (e.g., mp-1234), compound formula (e.g., Al2O3), or elements (e.g., Li, O, Mn): Al2O3
Do you want to create supercells for all structures? (yes/no): yes
Enter the supercell size (e.g., 2 2 2): 2 3 4
Do you want to download energy+lattice data for the materials? (yes/no): yes
Option 2
Enter your choice (1/2/3/4 or 'exit'): 2
Options:
1: VASP
    Enter your choice: 1
    VASP Options:
    1: Prepare VASP job submission folders
        1. Enter the full path to the folder containing multiple POSCAR files
        2. Do you want to strain hydrostatically one POSCAR structure
            Do you want to modify the EXX range in the script? (yes/no): yes
            Enter the new range for EXX:
            Start (e.g., -0.05):
            Step size (e.g., 0.01):
            End (e.g., 0.05):
        3. Do you want to strain volumetrically one POSCAR structure
            Do you want to modify the EXX, EYY, and EZZ ranges in the script? (yes/no): yes
            Enter the new range for EXX, EYY, and EZZ:
            Start (e.g., -0.05):
            Step size (e.g., 0.01):
            End (e.g., 0.05):
    2: VASP job submission
    3: Post-processing of VASP jobs
    4: Convergence check of VASP jobs
    q: Quit
2: Quantum ESPRESSO
    Enter your choice: 2
    Quantum ESPRESSO Options:
    1: Prepare Quantum ESPRESSO job submission folders
    2: Quantum ESPRESSO job submission
    3: Post-processing of Quantum ESPRESSO jobs
    q: Quit
q: Quit
Instruction for preparing VASP jobs:
- "INCAR", "KPOINTS", "vasp_jobsub.sh" files must be outside of the folder containing all the POSCAR files
- POTCAR files must be provided as POTCAR_$atomsymbol (e.g. POTCAR_Al, POTCAR_O)
Instruction for preparing Quantum ESPRESSO jobs:
- The code will prepare input_template and qe_jobsub.sh one level up from the provided POSCAR files
- Update the input_template and qe_jobsub.sh as needed
- Pseudopotential files must be provided as $atomsymbol_*.UPF (e.g. li_pbe_v1.4.uspp.F.UPF, O.pbe-n-kjpaw_psl.0.1.UPF)
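Because the POTCAR files must follow the POTCAR_$atomsymbol naming rule, it can help to verify that every element in a POSCAR has a matching file before preparing jobs. The sketch below is a hypothetical helper, not part of AtomProNet. It assumes a VASP 5 style POSCAR whose sixth line lists the element symbols, and that all POTCAR_* files sit in one directory.

```python
from pathlib import Path


def poscar_symbols(poscar_path):
    """Read the element symbols from line 6 of a VASP 5 style POSCAR."""
    lines = Path(poscar_path).read_text().splitlines()
    symbols = lines[5].split()
    # Older POSCAR files put atom counts on this line instead of symbols.
    if not symbols or not all(s.isalpha() for s in symbols):
        raise ValueError(f"{poscar_path}: line 6 does not list element symbols")
    return symbols


def missing_potcars(poscar_path, potcar_dir):
    """List the POTCAR_<symbol> files that the POSCAR needs but are absent."""
    potcar_dir = Path(potcar_dir)
    return [
        f"POTCAR_{sym}"
        for sym in poscar_symbols(poscar_path)
        if not (potcar_dir / f"POTCAR_{sym}").is_file()
    ]
```

An empty list from `missing_potcars` means every species has a correctly named POTCAR file in the given directory.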
Option 3
Enter your choice (1/2/3/4 or 'exit'): 3
Do you want to run the first step (execute post-processing script)? (yes/no): yes
Select the system for post-processing:
1. VASP
    Enter your choice (1/2): 1
    Select the extraction type for VASP:
    1. Extract ionic last step (Self-Consistent simulations)
        Do you want to split the Data files? (yes/no):
    2. Extract all ionic steps (Ab-initio MD)
        Do you want to split the Data files? (yes/no):
2. Quantum ESPRESSO
    Do you want to split the Data files? (yes/no):
Do you want to split the dataset into train, test, and validation sets? (yes/no): yes
Option 4
Enter your choice (1/2/3/4 or 'exit'): 4
Post-Processing Options:
1. Post-Processing of MLIP
2. Post-Processing of LAMMPS
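The MLIP post-processing produces parity plots and cumulative distributions. The numbers behind such plots can be sketched with NumPy alone, as below. This is an illustrative sketch with hypothetical function names, not AtomProNet's plotting code; it assumes reference (DFT) and predicted (MLIP) values are available as equally sized arrays.

```python
import numpy as np


def parity_metrics(reference, predicted):
    """Return MAE and RMSE between reference and predicted values."""
    ref = np.asarray(reference, dtype=float).ravel()
    pred = np.asarray(predicted, dtype=float).ravel()
    err = pred - ref
    return {
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
    }


def cumulative_error_distribution(reference, predicted):
    """Return sorted absolute errors and the fraction of samples at or below each."""
    ref = np.asarray(reference, dtype=float).ravel()
    pred = np.asarray(predicted, dtype=float).ravel()
    abs_err = np.sort(np.abs(pred - ref))
    fraction = np.arange(1, abs_err.size + 1) / abs_err.size
    return abs_err, fraction
```

Plotting `predicted` against `reference` gives the parity plot, and plotting `fraction` against `abs_err` gives the cumulative distribution of errors.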
Hydrostatically/Volumetrically strain a structure:
INCAR, KPOINTS, POTCAR, POSCAR, vasp_job.sh must be in the hydrostatic_strain.sh/volumetric_strain.sh folder
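The start, step, and end values requested by the strain prompts define a sweep of strain values. The sketch below shows one common convention, in which a hydrostatic strain scales all three lattice vectors by the same factor. It is an assumption-laden illustration with hypothetical function names; how hydrostatic_strain.sh and volumetric_strain.sh actually apply EXX, EYY, and EZZ is not shown in this document.

```python
import numpy as np


def strain_values(start, step, end):
    """Return evenly spaced strains from start to end (inclusive)."""
    # Count the steps by rounding to avoid floating-point endpoint surprises.
    n_steps = int(round((end - start) / step)) + 1
    return np.linspace(start, start + (n_steps - 1) * step, n_steps)


def apply_hydrostatic_strain(lattice, strain):
    """Scale every lattice vector by (1 + strain); rows are lattice vectors."""
    return (1.0 + strain) * np.asarray(lattice, dtype=float)


if __name__ == "__main__":
    cell = 4.0 * np.eye(3)  # hypothetical cubic cell, 4 Angstrom edge
    for eps in strain_values(-0.05, 0.01, 0.05):
        volume = abs(np.linalg.det(apply_hydrostatic_strain(cell, eps)))
        print(f"strain {eps:+.2f}: volume {volume:.3f}")
```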
Max number of job submission:
job_submission.sh
└── max_jobs=${1:-999} (limits submission to 999 jobs; change it based on your server)
2: VASP job submission:
- last_job.txt keeps track of how many jobs have been submitted. When rerunning 2: VASP job submission, it will use last_job.txt to continue submitting the remaining jobs.
- job_submission.log keeps track of how many jobs failed, so that they can be resubmitted later.
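The idea of resuming from last_job.txt under a max_jobs limit can be sketched as follows. This is not the logic of job_submission.sh, which is a Bash script whose internals are not shown here. The function names are hypothetical, and the sketch assumes that last_job.txt holds a single integer count of jobs already submitted and that max_jobs caps each run.

```python
from pathlib import Path


def next_batch(job_dirs, state_file, max_jobs=999):
    """Return (already_submitted, folders to submit in this run)."""
    state = Path(state_file)
    # A missing state file means nothing has been submitted yet.
    already = int(state.read_text().strip()) if state.is_file() else 0
    return already, job_dirs[already:already + max_jobs]


def record_progress(state_file, already, n_submitted):
    """Store the updated total so that a rerun continues where this one stopped."""
    Path(state_file).write_text(f"{already + n_submitted}\n")
```

A driver would call `next_batch`, submit each returned folder with the scheduler's own command, and then call `record_progress` with the number of jobs that were actually submitted.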
This software is developed by Musanna Galib.
If you use this software in your research, please cite the following paper:
BibTeX entry:
@misc{galib2025atompronetdataflowmachine,
title={AtomProNet: Data flow to and from machine learning interatomic potentials in materials science},
author={Musanna Galib and Mewael Isiet and Mauricio Ponga},
year={2025},
eprint={2501.14039},
archivePrefix={arXiv},
url={https://doi.org/10.48550/arXiv.2501.14039},
}
If you have questions, please don't hesitate to reach out to galibubc[at]student[dot]ubc[dot]ca
If you find a bug or have a proposal for a feature, please post it in the Issues. If you have a question, topic, or issue that isn't obviously one of those, try our GitHub Discussions.
If your post is related to the framework/package, please post in the issues/discussion on that repository.
