Skip to content

C++ force simulator program for atoms in a 3-D system.

License

Notifications You must be signed in to change notification settings

farismuhammad17/DryLab

C++ Build Status License

"In reality, there are only atoms and the void." — Democritus

DryLab++ is built on the belief that the vast complexity of chemistry can be broken down into simpler forces. While similar programs exist, most use static look-up tables for everything - VSEPR structures, bond angles, etc. We strive to build a system that is accurate, optimised, and computationally honest. As the sole developer for this project, I envision that I will only be finished the day I have learned all of chemistry itself; so, I can confidently say that this project will receive updates for the next 70 years.


Methane (CH4)

Water (H2O)

Table of contents

Roadmap

The following is a set of implementations that I have yet to implement.

Features

  1. Bond breaking: As of the current version, if you try: $H_2 + O_2 \rightarrow H_2O$, it won't work, and will form hydrogen peroxide. This problem arises because Oxygen does not realise that breaking its bond and forming a new bond with Hydrogen will maintain lesser energies, thus, as a result, Oxygen never splits to form water.

  2. Lennard-Jones (LJ) Potential: Currently, we just produce a repulsion force, but this will make it more accurate, if I can figure out how to put it without lagging everything out.

  3. Constant Calibration: Replacing the magic numbers with mathematically derived constants. This will resolve current balance issues where certain forces are disproportionately strong or weak, ensuring a more physically accurate equilibrium.

Optimizations

  1. Barnes-Hut algorithm: Let's us use way more particles, though the structure may have a conflict with future GPU implementation.

  2. Multithreading

  3. GPU Usage: As mentioned, may conflict with the Barned-Hut algorithm, but will make the code much more efficient with doing all the calculations.

Getting Started

Releases are available (refer installation here), and can be directly used, but if you wish to change certain constants or functionalities of the program, you would need to compile the program yourself. If you already have it set up, and want to just know how to use it in python, skip here.

1. Clone the repository

Navigate to some folder on your terminal, and run:

git clone https://github.com/maktabat/DryLab.git

2. Install dependencies

The Makefile comes with functions that help make this easier. Simply run:

make get_deps

And the dependancies, specifically pybind11 and the json.hpp from the nlohmann json library, if they aren't already in the project folder.

If there is some error, that pip or the C++ compiler isn't recognized, open up the Makefile, and manually input in your compiler or pip path. For example, if you are using g++, and your pip path is pip3, then:

CXX = g++
PIP = pip3

Still, if the file fails, or if you are curious what is being run, the Makefile only runs the following three lines:

pip install pybind11 --target . --break-system-packages

mkdir -p libs

curl -L https://github.com/nlohmann/json/releases/latest/download/json.hpp -o libs/json.hpp

Note that these files only get downloaded onto the project folder. This means that if the project folder is deleted, these downloaded files go with it. If you want to get rid of these downloads, the Makefile has the specific function for it:

make del_deps

3. Compilation

Now that all the dependancies are there, simply compile using:

make

If you make changes to the C++ files, you have to run this again, and only your changes will be compiled. If you want to delete the compiled files, run:

make clean

4. Python

Finally, the programming part - create a Python file, say ch4.py in the project folder. Import the module, and never forget to load the elements.json file. If you do not load it, the program will not throw any error, but will only return Nan and Inf for every value.

from drylab import *

ElemsDB.load('db/elements.json')

Atoms

Now you can create your atoms. The syntax is as follows:

Atom(
    atomic_number: int,
    position: Vector3,
    velocity: Vector3,
    charge: float = 0,
    velocityloss_pertick: float = 0
)

By default, both charge and velocityloss_pertick is set to $0$. Now, for example, if you were creating a single methane molecule, these would be your atoms:

vel_damp = 1e-6 # Amount of velocity lost per tick

d = 73 + 31 # Sum of their covalent radii, the distance the atoms want to stay at.

c  = Atom(
    6,
    Vector3(0, 0, 0),
    Vector3(0, 0, 0),
    velocityloss_pertick=vel_damp
)

h1 = Atom(
    1,
    Vector3(-d, 0, 10),
    Vector3(0, 0, 0),
    velocityloss_pertick=vel_damp
)

h2 = Atom(1,
    Vector3(d, 0, 10),
    Vector3(0, 0, 0),
    velocityloss_pertick=vel_damp
)

h3 = Atom(1,
    Vector3(0, d, -10),
    Vector3(0, 0, 0),
    velocityloss_pertick=vel_damp
)

h4 = Atom(1,
    Vector3(0, -d, -10),
    Vector3(0, 0, 0),
    velocityloss_pertick=vel_damp
)

Notice that we let the charges be $0$, this is not a problem, as later, when we bond these atoms, the charges get transferred properly. For now, they can be neutral atoms.

System

Then we would have to define the system based on this syntax:

System(
    atoms: list[Atom],
    dimensionX: float,
    dimensionY: float,
    dimensionZ: float,
    cellSize: float = -1
)

Dimension refers to the length of the universe in that specific direction. The cell size is the side length of the cubes of the grid that the atoms use to check for the force calculations. If another atom is too far of a cube from the current atom, it skips the calculation, due to forces being too weak.

If the value for the cell size is not given, it takes $-1$, which the C++ program interprets and calcualtes an appropriate cell size for, given the size of the universe and the number of intiial atoms. The exact functionality of this is mentioned later.

In our case, our system is as follows, and the system is also a cube in our example case here, of side length $1000$

system = System([c, h1, h2, h3, h4], 1000, 1000, 1000)

Now, in Methane, each hydrogen is bonded to the carbon, which is implemented as follows. This is also the point where the C++ program moves around the charges for the atoms.

system.makeBond(c, h1)
system.makeBond(c, h2)
system.makeBond(c, h3)
system.makeBond(c, h4)

Running the program

This program does not come with any feature to see the actual simulation. Later, we will discuss how to render the particles. For now, to run the system $dt$ moments in time. Simply run:

system.run()

Note that if you were to use this function in Python through pybind11, it would be very slow, because it has to actively keep switching from Python to C++. As a solution, a seperate function exists for this purpose, that runs run $N$ times.

system.run_steps(N)

Getting particle positions

positions = system.getPositions()

The program does not output a clean list of position vectors, rather, it returns a flat list of all the positions bunched together. For example, if you had two particle at $(X, Y, Z)$ and $(A, B, C)$, you would get:

[X, Y, Z, A, B, C]

It merely appends the value on, and this is for speed reasons, so it is upto the Python developer to figure out how to parse that.

It is similar for getting the position of bonds, which is useful when we need to render these later.

bondCoords = system.getBondCoords()

Lone pairs also exist for atoms that have two or more bonds, so you can get that atom's lone pair positions too.

# Assuming 'atom' is some earlier defined Atom

lonePairCoords = atom.getLPPositions()
numberOfLonePairs = atom.numLonePairs

Rendering

We can use anything for this: we have a set of coordinate to show, so you can even use matplotlib for this, but, for speed and just looking better, we'll use Pygame. Assuming you already have pygame (if not, refer the pygame documentation), the following will be the code to render the Methane molecule system from earlier.

from drylab import *

import math
import pygame

ElementDB.load("db/elements.json")

### --- The atoms defining, system, etc. --- ###

pygame.init()
screen = pygame.display.set_mode((800, 600))

# Or if you want it to be full screen:-
# screen = pygame.display.set_mode((0, 0), pygame.FULLSCREEN)

width, height = pygame.display.get_window_size()
clock = pygame.time.Clock()

Before we continue, pygame does not come with a system to render 3D particles, or anything 3D, so we have to make a function that takes a 3D position, and projects it to 2D. Obviously, you don't need to understand every component of it, just that it returns x and y (the 2D projected positions), and a factor which is just some multiplier you can use to get a radius to show, so that closer particles look bigger, and farther appear smaller.

def project(pos_x, pos_y, pos_z, zoom, width, height, ax, ay):
    # 1. Rotate around Y-axis (Left/Right)
    # x' = x*cos(ay) - z*sin(ay)
    # z' = x*sin(ay) + z*cos(ay)
    rad_y = ay
    tx = pos_x * math.cos(rad_y) - pos_z * math.sin(rad_y)
    tz = pos_x * math.sin(rad_y) + pos_z * math.cos(rad_y)

    # 2. Rotate around X-axis (Up/Down)
    # y' = y*cos(ax) - z'*sin(ax)
    # z'' = y*sin(ax) + z'*cos(ax)
    rad_x = ax
    ty = pos_y * math.cos(rad_x) - tz * math.sin(rad_x)
    tz = ty * math.sin(rad_x) + tz * math.cos(rad_x)

    view_distance = 800 # Increased for better 3D depth
    factor = view_distance / (view_distance + tz)

    x = int(tx * zoom * factor + width // 2)
    y = int(ty * zoom * factor + height // 2)

    return x, y, factor

We will now define some variables just for the pygame renderer, for all the 3D projection.

zoom = 0.5

running = True
draw_bonds = True # Set False if you don't want to see the bonds

angle_x = 0.0
angle_y = 0.0

rot_speed = 0.05
zoom_speed = 0.01

Following that, we implement some controls. Here, I have put WASD movement for rotating around the center, and E and Q to zoom in and out respectively.

while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False

    keys = pygame.key.get_pressed()

    if keys[pygame.K_a]: angle_y -= rot_speed
    if keys[pygame.K_d]: angle_y += rot_speed
    if keys[pygame.K_w]: angle_x -= rot_speed
    if keys[pygame.K_s]: angle_x += rot_speed
    if keys[pygame.K_e]: zoom    += zoom_speed
    if keys[pygame.K_q]: zoom    -= zoom_speed

Now we can just set up the screen to display everything on:

while running:
    ### --- Controls --- ###

    # Create black screen
    screen.fill((0, 0, 0))

    # 10000, so that the universe doesn't take years to do something
    system.run_steps(10000)

    pygame.display.set_caption(f"FPS: {clock.get_fps():.2f}")

To display the atoms, we get their positions, then display them based on that flat list we get.

while runnning:
    ### --- Controls, screen --- ###

    coords = system.getPositions()

    for i in range(0, len(coords), 3): # Step by 3!
        px, py, pz = coords[i], coords[i+1], coords[i+2]
        x, y, f = project(px, py, pz, zoom, width, height, angle_x, angle_y)

        # Draws a white circle in the projected position
        pygame.draw.circle(screen, (255, 255, 255), (x, y), int(5 * f))

Now, Methane does not have any lone pairs, but if you have something that does, you can display them too.

while running:
    ### --- Earlier stuff --- ####
    
    # Get the lone pair positions of carbon (c)
    lone_pairs = c.getLPPositions()

    for i in range(c.numLonePairs):
        ex, ey, ez = lone_pairs[i*3], lone_pairs[i*3+1], lone_pairs[i*3+2]
        x, y, f = project(ex, ey, ez, zoom, width, height, angle_x, angle_y)

        # Draws a red, slightly smaller circle
        pygame.draw.circle(screen, (255, 0, 0), (x, y), int(3 * f))

If you also wanted to see the bonds, you can continue adding on to the loop. This following chunk only executes if draw_bonds from earlier in the python file was set to True.

while running:
    ### --- Earlier stuff --- ####

    if draw_bonds:
        bond_coords = system.getBondCoords()

        for start, end in bond_coords:
            x1, y1, _ = project(start.x, start.y, start.z, zoom, width, height, angle_x, angle_y)
            x2, y2, _ = project(end.x, end.y, end.z, zoom, width, height, angle_x, angle_y)

            # Draws a grey line between the two atoms
            pygame.draw.aaline(screen, (150, 150, 150), (x1, y1), (x2, y2))

Finally, we can finish off the loop.

while running:
    ### --- Earlier stuff --- ####

    pygame.display.flip()
    clock.tick(60)

pygame.quit()

When you run this file, you should see a black screen that, eventually has the molecule form the tetrahedral structure of Methane. For reference, I have left this example of Methane, as well as for water, in the tests folder (note that if you run these after compiling yourself, you might have to move these files to the project folder, or change the import path appropriately).

Installation

Note

The following steps only download a compiled version of the code, and, thus, doesn't require the C++ compiler. If you wish to change the actual program in any way, you must download the source files and compile it. The steps are detailed earlier.

  1. Open the latest releases, and under assets, pick the appropriate .whl file. The number after cp represents the Python version you are using. For example: cp312 is for Python 3.12.x. The file is also named with the appropriate operating system, and even the CPU architecture, like arm64, x86, i686, etc. Depending on what you have in your system, select and download one. Remember this file for later.

  2. Go to the folder where you downloaded the .whl file in, and open the terminal.

  3. It is recommended for safety to create a virtual environment.

Operating system Command
Windows python -m venv drylab
MacOS / Linux python3 -m venv drylab
  1. Activate the virtual environment.
Operating system Command
Windows .\[path to the drylab folder]\Scripts\activate
MacOS / Linux source [path to the drylab folder]/bin/activate
  1. Now you must pip install that .whl file you selected earlier. It is the same way as normal, except that you mention the whole name of the .whl file.

  2. Now you can create a python file wherever and do whatever, but remember that you must have the virtual environment activated whenever you execute your programs.

Data source

The Mendeleev project provided the main chunk of data. Although we can use the library they have provided, we preferred to work with the raw json files, fortunately provided by them here. I chose to store them at a folder named db, under the name elements-o.json.

We do not need all that data though, so we decided to remove all the values we don't need, which we decided to do using some crude, simple python code.

import json

with open('db/elements-o.json') as file:
    data = json.load(file)

needed = ['atomic_number', 'atomic_weight', 'covalent_radius_cordero', 'en_pauling', 'vdw_radius']

for i, elem in enumerate(data):
    # List of only needed values
    cleaned={}

    for key in elem:
        if key in needed:
            if not elem[key]:
                elem[key] = approxValue(i, elem, key)

            cleaned[key] = elem[key]

        elif key == 'electronic_configuration':
            cleaned["valency"] = calculate_valency(elem['atomic_number'], elem['electronic_configuration'])
    output.append(cleaned)

with open('db/elements.json', 'w') as file:
    json.dump(output, file, indent=4)

Some of the values provided were null, so to interpret those values, we create a function approxValue, which uses formulas that we implement in the following sections.

Van der waal radius

For elements heavier than Nobelium (Z=102), exprimentally found data is unavaiable, and thus we had to use crude approximations to compute it. Using the following approximation:

$R_{vdw}(Z) = 2 \cdot R_{vdw}(Z-32) - R_{vdw}(Z-64)$

In the python program, this is the first part of that earlier function:

def approxValue(i, elem, key):
    if key == 'vdw_radius':
        elem_32 = data[i-32]
        elem_64 = data[i-64]

        return 2 * elem_32['vdw_radius'] - elem_64['vdw_radius']

This is just a Period-based Repetition, and $32$ and $64$ come from the fact that moving back that many elements puts you on the same column, but a row above (for 32) and two rows up (for 64). This approximation yields the following graph, where the blue dots are the data we got from the Mendeleev database, and the red are the ones computed by the formula.

Van der waal radius

Covalent radius

With the exact same logic we used for Van Der Waals radius, we approximate the covalent radius as follows:

$R_{c}(Z) = 2 \cdot R_{c}(Z-32) - R_{c}(Z-64)$

The implementation of this function is similar:

# --- approxValue ... ---
    elif key == 'covalent_radius_cordero':
        elem_32 = data[i-32]
        elem_64 = data[i-64]

        return 2 * elem_32['covalent_radius_cordero'] - elem_64['covalent_radius_cordero']

This would yield the following graph:

Van der waal radius

Electronegativity

Unfortunately, the earlier formula doesn't work for electronegativity because it often stops decreasing and starts increasing again, therefore the earlier idea fails entirely. Fortunately, this means we only look at the element right above and take its electronegativity instead. It is just an approximation, and it's good enough until someone finally measures the value for these elements.

# --- approxValue ... ---
    elif key == 'en_pauling':
        elem_32 = data[i-32]

        return elem_32['en_pauling']

This yields this graph:

Van der waal radius

There will still be null values for electronegativity, only for the noble gases though, who apparently have no electronegativity, and is just commonly set to $0$. This makes the whole function:

def approxValue(i, elem, key):
    if key == 'vdw_radius':
        elem_32 = data[i-32]
        elem_64 = data[i-64]

        return 2 * elem_32['vdw_radius'] - elem_64['vdw_radius']

    elif key == 'covalent_radius_cordero':
        elem_32 = data[i-32]
        elem_64 = data[i-64]

        return 2 * elem_32['covalent_radius_cordero'] - elem_64['covalent_radius_cordero']

    elif key == 'en_pauling':
        if elem['atomic_number'] in (2, 10, 18, 36, 54, 86): return 0

        elem_32 = data[i-32]
        return elem_32['en_pauling']

Valency

This will hold how many valence electrons the atom of a given element will be. It is computeed by this function that uses the electron configuration provided to count the value.

def calculate_valency(atomic_number, config_string):
    if atomic_number in [2, 10, 18, 36, 54, 86]: return 0

    # Matches patterns like '1s2', '2p6', '3d10', '4s2'
    matches = re.findall(r'(\d+)([spdf])(\d*)', config_string)

    if not matches: return 1 # Fallback for Hydrogen if config is just '1s'

    max_shell = max(int(m[0]) for m in matches)

    valence_count = 0
    for shell, orbital, count in matches:
        if int(shell) == max_shell:
            num = int(count) if count else 1
            valence_count += num

    return valence_count

Logic

The provided program written in C++ does not have any option to visualise the particles or actually see anything. It is merely a calculator that computes the positions of every atom in a given system using real world forces.

ElemsDB

In order to access the values we have stored in the json file, the ElemsDB file contains a function, load, that reads a given json file and parses the data using the nlohmann json library. Without this, the program would have to read the json file each time a new atom is instantiated. Therefore, for speed reasons, we have the data parsed and C++ ready beforehand.

Grid

In real life, every atom exerts a certain force on every other atom. This means there would be $N \cdot (N-1)$ calculations for a system with $N$ atoms, making the whole system $O(N^2)$. Fortunately, forces between two atoms weaken as the distance between them increases, therefore, become negligible.

Instead of computing such weak forces, we only compute the forces if the atoms are really close to each other. We do this using a grid system, of cubes of a certain side length. This side length can be specified when the system is instantiated, or is automatically computed given the number of initial particles by the following derived formula:

$\sqrt[3]{k \cdot \frac{D_xD_yD_z}{N}}$

$D$ is the size of the system in that direction, and $N$ is the number of initial particles. $k$ is an arbitrary value, that need only be greater than twice the largest van der waals radius. In the program, if this value falls below 10, it gets set to 10.

float volume = dimensionX * dimensionY * dimensionZ;
float n = static_cast<float>(this->atoms.size());

this->cellSize = std::max(10.0f, std::cbrt(volume / n) * cellSizeMultiplier);

Atom

The core component of the program, the atom class represents, as the name suggests, an atom. There are some extra variables and structures defined in the atom.h file:

constexpr double dt = 1e-6;

struct Bond {
    Atom* partner;
    int order;
};

struct LonePair {
	Vector3 relativePos;
};

The rest of the atom class is trivial, and thus we shall not discuss them here.

System

Every tick, the system moves forward by dt time, which is defined in the atom.h file. The program is informed to do this in a run function.

The following is a breakdown of the run function.

Upgrade Bond Order

Atoms with valence electrons but nothing to bond to are hungry for more bonds. For example: Ethene ($C_2H_2$), if merely instantiated in the code, would form only a single bond between the two carbon atoms. This leaves a single valence electron for both carbon atoms, that get satisfied if they make a double bond.

This function does exactly that - check if any atom with a bond has a valence electron, and is also bonded to something else that also has a valence electron. If that is true, it would increase the bond order.

Valence Forces

Molecular bonds have a bond angle because of the valence forces exerted by each atom and lone pair in a molecule. For efficiency reasons, we simulate pseudo lone pairs only for atoms that have 2 or more bonds, because they are the only ones that matter for the force calculations.

This function computes the lone pair and bond pair repulsion forces using Coulomb's law. Note that magic numbers within the code: 1e-4 and 5e5 were put in place of other forces that act other than Coulomb's law. These forces are unfortunately too compilcated and slow to compute, therefore are replaced with these magic numbers.

Check Bond Breaks

This function just checks if two atoms are moving too fast away from each other, or if they are too far from each other, that their bond should be broken. It judges distance based on the sum of the covalent radii of the two atoms, but a multiplier, bond_break_factor, which, if exceeded, deletes the bond.

Bond Forces

Applies a spring force, where the mean position is the sum of the two particles' covalent radii. The higher the bond order, the closer that mean position is, by multiplying the desired distance by some value, defined as follows:

constexpr float bondDistanceMultiplier[3] = {
    1.0f,   // For single bonds
    0.85f,  // Double bonds
    0.75f   // Triple bonds
};

Remember that any of these values can be changed. The program then computes a spring force, where the spring constant is stiffness, then adds that force to the total force of both atom.

Coulomb's force

Coulomb's law is applied between every two atoms close enough for it to be meaningful. Note that this program only computes the repulsive force, and adds that to the two atoms. If two atoms were to attract each other by Coulomb's force, this would fail because of the std::abs present in the force calculation.

The magic number 1e6 was to get the atoms in a molecule to repel each other enough to not take years to see a molecule reach its stable structure.

Collision force

Atoms in the universe don't really collide, rather this is a really powerful force exerted by the overlapping of van der waals radii, and works similar to bond forces, except this is pure repulsion, and much more stronger.

This force is disabled for bonded atoms.

Check New Bonds

At the end, we finally check if any new bonds can be formed. If the covalent radii of two atoms overlap, it decrees a bond between the two, through a makeBond function.

The makeBond function first creates the bond in the bond vector and bond set (former is useful for iterating, latter for look-ups) for both atoms involved. Then, it transfers the charge, difference in electronegativity multiplied by the defined polarizability.

float change = (p1->electronegativity - p2->electronegativity) * polarizability;