diff --git a/README.md b/README.md
index b0f6ed3..c986033 100755
--- a/README.md
+++ b/README.md
@@ -1,44 +1,61 @@
-# [project name]
-
-## Project Goal
-:point_right: Replace this text with a pithy description of the goal of the data analysis project.
-
-
-## Background & Design
-:point_right: Replace this text with a slightly longer (but still only max 250 words) description of the project background and its design. 
-
-
-## Roadmap
-:point_right: Insert graphic of planned project releases. [This site](https://app.diagrams.net/) allows you to build a graphic that is connected to GitHub, so that changes you make are treated as GitHub commits.
-
-
-## Contents
-:point_right: This section should introduce the content that is published on `main`. Example:
-
-
-#### 2021-07-01 SPR Poster
-This is a description of the poster. There is a matching folder in the repo.
-
-:point_right: When you first set up this readme file, you might replace everything under "Contents" here with something like "Watch for our first release!"
-
-
-## Work in Development
-This `main` branch contains completed releases for this project. For all work-in-progress, please switch over to the `dev` branches.
-
-:point_right: Keep the "Work in Development" text above as the default NDCLab "blurb" to help visitors navigate the repo. But delete this note before publishing the readme file.
-
-
-## Contributors
-| Role | Name |
-| ---  | ---  |
-| add role | insert team member(s) |
-| add role | insert team member(s) |
-| add role | insert team member(s) |
-
-Learn more about us [here](https://www.ndclab.com/people).
-
-
-## Contributing
-If you are interested in contributing, please read our [CONTRIBUTING.md](CONTRIBUTING.md) file.
-
-:point_right: Keep the "Contributing" text above as the default NDCLab "blurb" to help visitors navigate the repo. But delete this note before publishing the readme file.
+# Thrive: Theta DDM Analysis
+
+## Project Goal
+This repository contains the data analysis pipeline for the "Thrive" project. The goal is to investigate the neural mechanisms underlying social influence on decision-making using the Shrinking Spotlight Protocol (SSP) Drift Diffusion Model (DDM) and EEG time-frequency analysis. We aim to understand how social observation affects cognitive control processes, specifically examining ERP components (ERN, Pe) and theta band oscillations.
+
+## Background & Design
+The study employs a Flanker task with social (Observed) and non-social (Alone) conditions.
+Key analyses include:
+1.  **Behavioral Analysis**: Analyzing accuracy and reaction times (RT) to model decision-making processes.
+2.  **Computational Modeling**: Fitting the SSP-DDM to behavioral data to estimate parameters like boundary separation, non-decision time, and attentional focus.
+3.  **EEG Analysis**:
+    *   **Preprocessing**: Automated pipeline (MADE) including filtering, artifact rejection (FASTER/ADJUST), and ICA.
+    *   **ERP Analysis**: Computing Event-Related Potentials (ERN, Pe) for error and correct trials.
+    *   **Time-Frequency Analysis**: Examining power and phase synchrony (ITPS, ICPS) in the theta and delta bands.
+
+## Directory Structure
+
+```yml
+project-name
+├── code
+│   ├── behavior            # Scripts for behavioral data analysis and cleaning
+│   ├── ddm                 # Scripts for SSP-DDM model fitting (R, C++, Python)
+│   ├── figures             # Scripts to generate plots (ERPs, Topographies)
+│   ├── matlab              # Helper MATLAB functions
+│   ├── postprocessing      # Scripts for aggregating EEG metrics (ERP, TF, ICPS)
+│   ├── preprocessing-eeg   # EEG preprocessing pipeline (MADE)
+│   └── statistics          # R scripts for statistical analysis (LMM)
+├── derivatives             # Generated data (preprocessed EEG, summary CSVs)
+├── sourcedata              # Raw input data (checked)
+└── results                 # Figures and statistical outputs
+```
+
+## Setup & Usage
+
+### Prerequisites
+*   **Python**: `pandas`, `numpy`, `scipy`, `mne`, `matplotlib`, `h5py`
+*   **R**: `DEoptim`, `Rcpp`, `dplyr`, `flextable`
+*   **MATLAB**: EEGLAB with plugins (FASTER, ADJUST, firfilt)
+*   **C++**: Compiler compatible with Rcpp
+
+### Workflow
+1.  **Behavioral Processing**:
+    *   Run `code/behavior/behavior_analysis.py` to aggregate PsychoPy data.
+    *   Run `code/behavior/create_valid_behav.py` to filter subjects and create summary CSVs.
+
+2.  **DDM Fitting**:
+    *   Compile the C++ model: `code/ddm/simSSP_model_GB_noScale.cpp`.
+    *   Run `code/ddm/run_ddm_batch.py` to submit fitting jobs.
+    *   Aggregate results with `code/ddm/fitted.py`.
+
+3.  **EEG Preprocessing**:
+    *   Use `code/preprocessing-eeg/run_MADE_batch.py` to submit preprocessing jobs (calls `MADE_pipeline.m`).
+    *   Verify outputs with `code/preprocessing-eeg/check_preprocessed_files.py`.
+
+4.  **Post-Processing & Statistics**:
+    *   Compute ERP means: `code/postprocessing/compute_erp_means.py`.
+    *   Compute TF metrics: `code/postprocessing/compute_means_TF.py`.
+    *   Run statistical models using R scripts in `code/statistics/`.
+
+## Contributors
+*   NDCLab Team
diff --git a/code/behavior/behavior_analysis.py b/code/behavior/behavior_analysis.py
index 472116e..fcc202d 100755
--- a/code/behavior/behavior_analysis.py
+++ b/code/behavior/behavior_analysis.py
@@ -7,9 +7,41 @@
 import time
 import datetime
 
+"""
+Behavioral analysis script for processing PsychoPy data.
+
+This script reads behavioral data from PsychoPy CSV files, performs data cleaning,
+calculates various behavioral metrics (accuracy, reaction times, error rates),
+and generates summary CSV files for further analysis. It processes data for both
+social and non-social conditions.
+
+Usage:
+    python behavior_analysis.py <session_id>
+
+Arguments:
+    session_id (str): The session identifier (e.g., 's1_r1').
+
+Outputs:
+    - Summary CSV file containing aggregated metrics for each subject.
+    - Individual trial data CSV files for each subject.
+    - A combined CSV file containing trial data for all subjects.
+    - A log file recording the processing details.
+"""
+
 pd.options.mode.chained_assignment = None
 
 def convert_to_list_rt(series):
+    """
+    Converts a pandas Series of reaction times (RT) to a list of floats.
+
+    Handles string representations of lists and single values, as well as actual lists.
+
+    Args:
+        series (pd.Series): A pandas Series containing RT data, potentially as strings or lists.
+
+    Returns:
+        list: A list of float RT values, with np.nan for missing or invalid entries.
+    """
     float_list = []
     for value in series:
         if isinstance(value, str):
@@ -24,6 +56,17 @@ def convert_to_list_rt(series):
     return float_list
 
 def convert_to_list_resp(series):
+    """
+    Converts a pandas Series of response keys to a list of response values.
+
+    Parses string representations of lists to extract integer response keys.
+
+    Args:
+        series (pd.Series): A pandas Series containing response data.
+
+    Returns:
+        list: A list of lists containing integer response keys, or np.nan for missing entries.
+    """
     resp_list = []
     for value in series:
         if isinstance(value, str):
@@ -35,6 +78,15 @@ def convert_to_list_resp(series):
 
 
 def sort_csvs_by_date_pd(csv_paths):
+    """
+    Sorts a list of CSV file paths based on the date in the 'date' column of the first row.
+
+    Args:
+        csv_paths (list): A list of file paths to CSV files.
+
+    Returns:
+        list: The list of file paths sorted by date.
+    """
     date_format = "%Y-%m-%d_%Hh%M.%S.%f"
 
     def get_first_date(path):
diff --git a/code/behavior/check_status.py b/code/behavior/check_status.py
index d27264e..aa91d4b 100755
--- a/code/behavior/check_status.py
+++ b/code/behavior/check_status.py
@@ -3,6 +3,20 @@
 import re
 from pathlib import Path
 
+"""
+Script to check the processing status of behavioral data for a given session.
+
+This script scans the source directory for raw PsychoPy files, checks for
+deviation files, identifies successfully processed subjects, and reports
+which subjects are pending processing or have deviations.
+
+Usage:
+    python check_status.py <session_id>
+
+Arguments:
+    session_id (str): The session identifier (e.g., 's1_r1').
+"""
+
 def get_args():
     """Parses command line arguments."""
     parser = argparse.ArgumentParser(description="Check behavior processing status.")
diff --git a/code/behavior/check_subject_csv.py b/code/behavior/check_subject_csv.py
index ebbecb8..c01b760 100755
--- a/code/behavior/check_subject_csv.py
+++ b/code/behavior/check_subject_csv.py
@@ -2,6 +2,17 @@
 import re
 import glob
 
+"""
+Script to verify the integrity of subject CSV files in the source directory.
+
+It checks if each subject has the expected number of files (3) and detects
+any deviation files. It asserts that the main data CSV file exists and does
+not have 'deviation' in its filename.
+
+Usage:
+    python check_subject_csv.py
+"""
+
 input_dataset_path = "/home/data/NDClab/datasets/thrive-dataset/"
 data_path = "sourcedata/checked/"
 sub_path = "s1_r1/psychopy/"
diff --git a/code/behavior/create_valid_behav.py b/code/behavior/create_valid_behav.py
index 910ef2b..2a86501 100755
--- a/code/behavior/create_valid_behav.py
+++ b/code/behavior/create_valid_behav.py
@@ -4,12 +4,35 @@
 from glob import glob
 import os
 
+"""
+Script to filter and clean behavioral data, creating valid datasets for analysis.
+
+This script loads the behavioral summary data, removes excluded subjects and
+outliers based on reaction time and accuracy criteria, and saves separate CSV
+files for social and non-social conditions containing only valid data.
+
+Usage:
+    python create_valid_behav.py <session_id>
+
+Arguments:
+    session_id (str): The session identifier (e.g., 's1_r1').
+"""
+
 
 def replace_outliers_with_nan_cols(df, columns_to_check, sd_thresh=3):
     """
-    Replaces outliers with NaN.
-    Raises KeyError if a column is missing.
-    Returns the modified dataframe if successful.
+    Replaces outliers in specified columns with NaN based on standard deviation threshold.
+
+    Args:
+        df (pd.DataFrame): The input pandas DataFrame.
+        columns_to_check (list): A list of column names to check for outliers.
+        sd_thresh (float, optional): The number of standard deviations to use as the threshold. Defaults to 3.
+
+    Returns:
+        pd.DataFrame: The DataFrame with outliers replaced by NaN.
+
+    Raises:
+        KeyError: If a column in columns_to_check is missing from the DataFrame.
     """
     # 1. Validation: Check all columns exist first
     # If this fails, the error is raised and nothing is returned.
diff --git a/code/ddm/SSP_DDM_fitting.R b/code/ddm/SSP_DDM_fitting.R
index 85b0602..4e3c653 100755
--- a/code/ddm/SSP_DDM_fitting.R
+++ b/code/ddm/SSP_DDM_fitting.R
@@ -1,3 +1,22 @@
+#' SSP-DDM Model Fitting Script
+#'
+#' This script loads human behavioral data and fits the Shrinking Spotlight Protocol (SSP)
+#' Drift Diffusion Model (DDM) parameters to the data using Differential Evolution optimization.
+#' It relies on an Rcpp function for model simulation.
+#'
+#' @description
+#' The script defines a fitting function that calls the C++ simulation model,
+#' compares simulated data with human data using Chi-Square statistics, and optimizes
+#' parameters (a, ter, p, rd, sda) to minimize the difference.
+#' It processes subjects in batches as defined by command line arguments.
+#'
+#' @usage
+#' Rscript SSP_DDM_fitting.R <startIdx> <endIdx> <session>
+#'
+#' @param startIdx The starting index of subjects to process.
+#' @param endIdx The ending index of subjects to process.
+#' @param session The session identifier (e.g., 's1_r1').
+
 ## This script loads in human data and then proceeds through the fitting Script to fit parameters to the data. 
 ## The first section of code is a function that call the rcpp code (separate file), which contains the actual model,
 ## simulates data based on a set of parameters, and outputs a fit stat. The second section of code is the "main script"
@@ -16,6 +35,24 @@ endIdx <- args[2]
 session <- args[3]
 #------------------------------------------------------------------------------
 # Fit function for the SSP model
+
+#' Fit Function for the SSP Model
+#'
+#' Calculates the Chi-Square goodness-of-fit statistic between human data and
+#' simulated data generated by the SSP model for a given set of parameters.
+#'
+#' @param parms Numeric vector of length 5 containing the model parameters:
+#'   1. a (boundary separation)
+#'   2. ter (non-decision time)
+#'   3. p (perceptual strength)
+#'   4. rd (shrinking rate)
+#'   5. sda (attentional window width)
+#' @param nTrials Integer. Number of trials to simulate per condition.
+#' @param cutPoints List of length 4. RT bin boundaries for each accuracy/congruency condition.
+#' @param humanProps List of length 4. Proportions of human data in each bin.
+#' @param HumanTrialCounts Numeric vector of length 2. Total trial counts for congruent and incongruent conditions.
+#'
+#' @return A numeric value representing the Chi-Square statistic. Returns a very large number if infinite.
 fitFunctionSSP <- function(
     parms, # generated by DEoptim within specified boundaries when DEoptim is called
     nTrials, # how many trials to simulate per condition 
diff --git a/code/ddm/bulk_scancel.py b/code/ddm/bulk_scancel.py
index 85b9599..14257bd 100755
--- a/code/ddm/bulk_scancel.py
+++ b/code/ddm/bulk_scancel.py
@@ -1,5 +1,15 @@
 import subprocess
 
+"""
+Script to bulk cancel SLURM jobs within a specified range of job IDs.
+
+This script generates a list of job IDs from a start ID to an end ID and
+executes the 'scancel' command for all of them.
+
+Usage:
+    python bulk_scancel.py
+"""
+
 start_id = 2648371
 end_id = 2648451
 
diff --git a/code/ddm/fitted.py b/code/ddm/fitted.py
index 12c5d66..b60a27b 100755
--- a/code/ddm/fitted.py
+++ b/code/ddm/fitted.py
@@ -3,6 +3,19 @@
 from datetime import datetime
 import sys
 
+"""
+Script to aggregate DDM fitting results into a single CSV file.
+
+This script searches for output files from DDM fitting, concatenates them,
+and saves a summary file listing the subjects that have been successfully fitted.
+
+Usage:
+    python fitted.py <session_id>
+
+Arguments:
+    session_id (str): The session identifier.
+"""
+
 session = sys.argv[1]
 current_datetime = datetime.now()
 formatted_date = current_datetime.strftime("%Y_%m_%d_%H_%M_%S")
diff --git a/code/ddm/run_ddm_batch.py b/code/ddm/run_ddm_batch.py
index 647d6dd..0126b46 100755
--- a/code/ddm/run_ddm_batch.py
+++ b/code/ddm/run_ddm_batch.py
@@ -6,6 +6,19 @@
 from pathlib import Path
 import pandas as pd
 
+"""
+Script to submit DDM fitting jobs to the SLURM scheduler.
+
+This script identifies subjects that have not yet been fitted for the DDM model
+and submits batch jobs for them. It allows for submitting jobs in batches.
+
+Usage:
+    python run_ddm_batch.py <session_id>
+
+Arguments:
+    session_id (str): The session identifier.
+"""
+
 session = sys.argv[1]
 #session = "s2_r1"
 data_dir = "/home/data/NDClab/analyses/thrive-theta-ddm/"
diff --git a/code/ddm/simSSP_model_GB_noScale.cpp b/code/ddm/simSSP_model_GB_noScale.cpp
index 2be3455..d108a0b 100755
--- a/code/ddm/simSSP_model_GB_noScale.cpp
+++ b/code/ddm/simSSP_model_GB_noScale.cpp
@@ -1,7 +1,39 @@
+/**
+ * @file simSSP_model_GB_noScale.cpp
+ * @brief Implementation of the Shrinking Spotlight Protocol (SSP) Drift Diffusion Model (DDM) simulation.
+ *
+ * This file contains the Rcpp function for simulating trial-level data for the SSP-DDM model.
+ * It simulates the decision-making process by modeling drift diffusion with a shrinking attentional spotlight.
+ */
+
 #include <Rcpp.h>
 #include <Rcpp/Rmath.h>
 using namespace Rcpp;
 
+/**
+ * @brief Simulates trial-level results for the SSP-DDM model.
+ *
+ * This function simulates the response times and accuracy for a set of trials
+ * based on the provided model parameters. It uses a diffusion process where
+ * the drift rate is modulated by a shrinking attentional spotlight.
+ *
+ * @param parms Numeric vector of length 5 containing the SSP parameters:
+ *              - parms[0]: A (boundary separation)
+ *              - parms[1]: tEr (non-decision time)
+ *              - parms[2]: p (perceptual strength)
+ *              - parms[3]: rd (shrinking rate)
+ *              - parms[4]: sda (initial attentional window width)
+ * @param trialType Integer indicating the trial type:
+ *                  - 1: Congruent
+ *                  - 2: Incongruent
+ * @param nTrials Integer. The number of trials to simulate.
+ * @param dt Double. The time step size for the diffusion process simulation.
+ * @param vari Double. The variance of the drift rate for each step.
+ *
+ * @return NumericMatrix A matrix with `nTrials` rows and 2 columns:
+ *         - Column 0: Simulated Reaction Time (RT)
+ *         - Column 1: Simulated Accuracy (1 for correct, 0 for error)
+ */
 // [[Rcpp::export]]
 NumericMatrix simSSP_model_GBnoScale(NumericVector parms, int trialType, int nTrials,
 double dt, double vari) {
diff --git a/code/figures/plot_erp.m b/code/figures/plot_erp.m
index 0d1ed8a..9f4266e 100755
--- a/code/figures/plot_erp.m
+++ b/code/figures/plot_erp.m
@@ -1,3 +1,10 @@
+% PLOT_ERP Plot Grand Average ERPs
+%
+% Usage:
+%   Run this script to plot Grand Average ERP waveforms for Social vs Non-Social
+%   and Error vs Correct conditions. It handles both standard and CSD data.
+%   It relies on precomputed .mat files containing ERP data.
+
 clear % clear matlab workspace
 clc % clear matlab command window
 
diff --git a/code/figures/plot_erp.py b/code/figures/plot_erp.py
index 86f80f4..7027c33 100755
--- a/code/figures/plot_erp.py
+++ b/code/figures/plot_erp.py
@@ -11,7 +11,32 @@
 import sys
 import os
 
+"""
+Script to plot Grand Average ERP waveforms.
+
+This script loads aggregated ERP data, applies baseline correction,
+filters subjects based on valid behavioral data, and plots the Grand Average ERPs
+for Social vs Non-Social and Error vs Correct conditions. It specifically targets
+components like ERN and Pe.
+
+Usage:
+    python plot_erp.py <session_id> <laplacian_flag>
+
+Arguments:
+    session_id (str): The session identifier.
+    laplacian_flag (int): 1 for CSD data, 0 for voltage data.
+"""
+
 def find_newest_file(path): 
+    """
+    Finds the most recently modified file matching a glob pattern.
+
+    Args:
+        path (str): Glob pattern to match files.
+
+    Returns:
+        str: The path to the newest file found.
+    """
     matching_files = glob(path)
     
     # Check if any files were found
diff --git a/code/matlab/parsave.m b/code/matlab/parsave.m
index 82f3dae..b9e6e0e 100755
--- a/code/matlab/parsave.m
+++ b/code/matlab/parsave.m
@@ -1,4 +1,17 @@
 function parsave(fname, varargin)
+% PARSAVE Save variables to a file inside a parfor loop.
+%
+% Usage:
+%   parsave(fname, 'var1', val1, 'var2', val2, ...)
+%
+% Inputs:
+%   fname    - Name of the file to save (string/char).
+%   varargin - Alternating variable names (strings) and their values.
+%
+% Description:
+%   Matlab's 'save' function cannot be called directly inside a parfor loop
+%   because it accesses the workspace. This function serves as a wrapper.
+
     % varargin contains alternating variable names and values
     % Extract names and values
     for i = 1:2:length(varargin)
diff --git a/code/postprocessing/compute_erp_means.py b/code/postprocessing/compute_erp_means.py
index 07a7cbf..7bbbaca 100755
--- a/code/postprocessing/compute_erp_means.py
+++ b/code/postprocessing/compute_erp_means.py
@@ -10,7 +10,31 @@
 import sys
 import os
 
+"""
+Script to compute mean ERP amplitudes from preprocessed EEG data.
+
+This script loads ERP data from a MAT file, applies baseline correction,
+and calculates mean amplitudes for specific time windows and electrode clusters
+(e.g., ERN, Pe) for social and non-social conditions. It outputs the results to a CSV file.
+
+Usage:
+    python compute_erp_means.py <session_id> <laplacian_flag>
+
+Arguments:
+    session_id (str): The session identifier (e.g., 's1_r1').
+    laplacian_flag (int): 1 to use CSD (Current Source Density) transformed data, 0 for standard voltage.
+"""
+
 def find_newest_file(path): 
+    """
+    Finds the most recently modified file matching a glob pattern.
+
+    Args:
+        path (str): Glob pattern to match files.
+
+    Returns:
+        str: The path to the newest file found.
+    """
     matching_files = glob(path)
     
     # Check if any files were found
diff --git a/code/postprocessing/compute_means_ICPS.py b/code/postprocessing/compute_means_ICPS.py
index 1b1a205..d01997b 100755
--- a/code/postprocessing/compute_means_ICPS.py
+++ b/code/postprocessing/compute_means_ICPS.py
@@ -11,6 +11,21 @@
 import os
 import sys
 
+"""
+Script to compute mean Inter-Channel Phase Synchrony (ICPS) values.
+
+This script loads precomputed ICPS arrays (from MAT/HDF5 files), extracts mean values
+for specific frequency bands (theta, delta), time windows (early, late), and
+electrode clusters (DLPFC, OCC, MOTOR). It merges these values with behavioral
+summary data and saves the result to a CSV file.
+
+Usage:
+    python compute_means_ICPS.py <session_id>
+
+Arguments:
+    session_id (str): The session identifier.
+"""
+
 session = sys.argv[1]
 dataset_path = "/home/data/NDClab/analyses/thrive-theta-ddm/"
 
diff --git a/code/postprocessing/compute_means_TF.py b/code/postprocessing/compute_means_TF.py
index 0305813..183bc6c 100755
--- a/code/postprocessing/compute_means_TF.py
+++ b/code/postprocessing/compute_means_TF.py
@@ -11,6 +11,20 @@
 import os
 import sys
 
+"""
+Script to compute mean Time-Frequency (TF) power and Inter-Trial Phase Synchrony (ITPS).
+
+Similar to the ICPS script, this extracts mean values for TF power and ITPS
+from precomputed arrays for defined frequency bands, time windows, and channel clusters.
+Results are merged with behavioral data and saved to CSV.
+
+Usage:
+    python compute_means_TF.py <session_id>
+
+Arguments:
+    session_id (str): The session identifier.
+"""
+
 session = sys.argv[1]
 
 dataset_path = "/home/data/NDClab/analyses/thrive-theta-ddm/"
diff --git a/code/postprocessing/create_tf_arrays.py b/code/postprocessing/create_tf_arrays.py
index 060354d..47867b2 100755
--- a/code/postprocessing/create_tf_arrays.py
+++ b/code/postprocessing/create_tf_arrays.py
@@ -8,6 +8,20 @@
 import re
 import sys
 
+"""
+Script to aggregate individual Time-Frequency (TF) data into group-level arrays.
+
+This script iterates through all subjects and conditions, loads their individual
+TF output files (HDF5/MAT), and concatenates them into large matrices (Subject x Channel x Time x Freq).
+These aggregate arrays are saved for faster group-level analysis.
+
+Usage:
+    python create_tf_arrays.py <session_id>
+
+Arguments:
+    session_id (str): The session identifier.
+"""
+
 # This code creates big arrays of n_sub * chan * times * freqs data from individual time-frequecy data arrays by concatenating individual participants' data.
 
 #session = "s2_r1"
diff --git a/code/postprocessing/run_erp_processing.py b/code/postprocessing/run_erp_processing.py
index aad4b6d..05d1d9f 100755
--- a/code/postprocessing/run_erp_processing.py
+++ b/code/postprocessing/run_erp_processing.py
@@ -1,6 +1,16 @@
 import subprocess
 import time
 
+"""
+Script to submit batch jobs for ERP processing and plotting.
+
+This script iterates over defined sessions and conditions (Laplacian vs standard)
+and submits SLURM jobs for computing ERP means and generating ERP plots.
+
+Usage:
+    python run_erp_processing.py
+"""
+
 # Define your iteration parameters
 sessions = [
 "s1_r1",
diff --git a/code/postprocessing/run_tf_processing.py b/code/postprocessing/run_tf_processing.py
index 031bcbc..fc840b8 100755
--- a/code/postprocessing/run_tf_processing.py
+++ b/code/postprocessing/run_tf_processing.py
@@ -1,6 +1,16 @@
 import subprocess
 import time
 
+"""
+Script to submit batch jobs for Time-Frequency mean computation.
+
+This script submits SLURM jobs to run `compute_means_TF.py` and `compute_means_ICPS.py`
+for defined sessions.
+
+Usage:
+    python run_tf_processing.py
+"""
+
 # Define your iteration parameters
 sessions = [
 "s1_r1",
diff --git a/code/preprocessing-eeg/MADE_pipeline.m b/code/preprocessing-eeg/MADE_pipeline.m
index 02c8ce0..fd83829 100755
--- a/code/preprocessing-eeg/MADE_pipeline.m
+++ b/code/preprocessing-eeg/MADE_pipeline.m
@@ -1,5 +1,18 @@
 % Define the MADE processing pipeline as a function
 function [] = MADE_pipeline(dataset, subjects, session)
+% MADE_PIPELINE Main processing pipeline for EEG data (MADE).
+%
+% Usage:
+%   MADE_pipeline(dataset, subjects, session)
+%
+% Inputs:
+%   dataset  - Name of the dataset folder (e.g., 'thrive-dataset').
+%   subjects - String with subject IDs separated by slashes (e.g., '3000001/3000002').
+%   session  - Session ID (e.g., 's1_r1').
+%
+% Description:
+%   Executes the full preprocessing pipeline including filtering, artifact rejection (FASTER/ADJUST),
+%   ICA, and epoching.
 
 % dataset: str like 'thrive-dataset'
 % subjects: str like '3000001/3000002/3000003/3000004'
diff --git a/code/preprocessing-eeg/check_preprocessed_files.py b/code/preprocessing-eeg/check_preprocessed_files.py
index 3a71fa8..cabc02c 100755
--- a/code/preprocessing-eeg/check_preprocessed_files.py
+++ b/code/preprocessing-eeg/check_preprocessed_files.py
@@ -2,7 +2,27 @@
 from pathlib import Path
 import sys
 
+"""
+Script to verify the existence and count of preprocessed EEG files.
+
+This script scans the derivatives directory for preprocessed EEG data (.set and .fdt files)
+and alerts if a subject has more than the expected number of files, which might indicate
+duplicates or processing issues.
+
+Usage:
+    python check_preprocessed_files.py [root_dir]
+
+Arguments:
+    root_dir (str, optional): Root directory of the dataset. Defaults to standard path.
+"""
+
 def check_eeg_files(root_dir):
+    """
+    Scans the directory structure for EEG files and reports counts per subject.
+
+    Args:
+        root_dir (str): The root directory to start the scan from.
+    """
     root_path = Path(root_dir)
     
     if not root_path.exists():
diff --git a/code/preprocessing-eeg/compare_file_sizes.py b/code/preprocessing-eeg/compare_file_sizes.py
index a910f61..1735e31 100755
--- a/code/preprocessing-eeg/compare_file_sizes.py
+++ b/code/preprocessing-eeg/compare_file_sizes.py
@@ -3,6 +3,13 @@
 from pathlib import Path
 import pandas as pd
 
+"""
+Script to compare file sizes between two directory trees.
+
+Useful for verifying data integrity after copying or moving files, specifically
+checking if processed files in the analysis directory match those in the dataset directory.
+"""
+
 def compare_subject_files(root_path_a, root_path_b, specific_subpath="**/*"):
     """
     Iterates through subject folders (sub-*) in root_path_a, finds the corresponding
@@ -14,6 +21,9 @@ def compare_subject_files(root_path_a, root_path_b, specific_subpath="**/*"):
         specific_subpath (str): Glob pattern to limit search inside subject folders.
                                 Use "**/*" for everything.
                                 Use "**/eeg/*" to only look inside eeg folders.
+
+    Returns:
+        pd.DataFrame: A DataFrame containing details of any size mismatches or missing files.
     """
     base_a = Path(root_path_a)
     base_b = Path(root_path_b)
diff --git a/code/preprocessing-eeg/count_trials.py b/code/preprocessing-eeg/count_trials.py
index 806a239..82ceb63 100755
--- a/code/preprocessing-eeg/count_trials.py
+++ b/code/preprocessing-eeg/count_trials.py
@@ -10,19 +10,33 @@
 import datetime
 import time
 
+"""
+Script to count trials in preprocessed EEG data based on various conditions.
+
+This script loads EEG data (.set files), extracts events, and counts trials
+categorized by accuracy, congruency, and observation condition (social/non-social).
+It outputs a CSV with trial counts for each subject.
+
+Usage:
+    python count_trials.py
+"""
+
 def eeg_point2lat(lat_array, epoch_array, srate, timewin=None, timeunit=1):
     """
     Convert latency in data points to latency in ms relative to the time locking.
 
-    Parameters:
-    lat_array (array-like): Latency array in data points assuming concatenated data epochs.
-    epoch_array (array-like or None): Epoch number corresponding to each latency value.
-    srate (float): Data sampling rate in Hz.
-    timewin (list or None): [min, max] time limits in 'timeunit' units. Default is None.
-    timeunit (float): Time unit in seconds. Default is 1 (seconds).
+    Args:
+        lat_array (array-like): Latency array in data points assuming concatenated data epochs.
+        epoch_array (array-like or None): Epoch number corresponding to each latency value.
+        srate (float): Data sampling rate in Hz.
+        timewin (list or None): [min, max] time limits in 'timeunit' units. Default is None.
+        timeunit (float): Time unit in seconds. Default is 1 (seconds).
 
     Returns:
-    numpy.ndarray: Converted latency values (in 'timeunit' units) for each epoch.
+        numpy.ndarray: Converted latency values (in 'timeunit' units) for each epoch.
+
+    Raises:
+        ValueError: If input arrays have inconsistent lengths or timewin is invalid.
     """
     
     lat_array = np.array(lat_array)
@@ -57,6 +71,14 @@ def eeg_point2lat(lat_array, epoch_array, srate, timewin=None, timeunit=1):
     return np.round(newlat * 1e9) / 1e9
 
 def disp_diff_arr(arr1, arr2, round = 5):
+    """
+    Displays differences between two arrays.
+
+    Args:
+        arr1 (np.ndarray): First array.
+        arr2 (np.ndarray): Second array.
+        round (int): Number of decimal places to round to before comparison.
+    """
     mask = np.round(arr1, round) != np.round(arr2, round)
     
     diff_elements_1 = arr1[mask]
diff --git a/code/preprocessing-eeg/create_mat_csd_s1.m b/code/preprocessing-eeg/create_mat_csd_s1.m
index 69c8f13..9f83d6b 100755
--- a/code/preprocessing-eeg/create_mat_csd_s1.m
+++ b/code/preprocessing-eeg/create_mat_csd_s1.m
@@ -1,3 +1,9 @@
+% CREATE_MAT_CSD_S1 Extract ERPs from CSD data
+%
+% Usage:
+%   Run this script to load preprocessed CSD data (.set files), epoch based on
+%   conditions (social/nonsocial, error/correct), and save averaged ERPs to a .mat file.
+
 %
 % Modified on 2024/04/11 to process thrive dataset flanker data
 %
diff --git a/code/preprocessing-eeg/create_mat_s1.m b/code/preprocessing-eeg/create_mat_s1.m
index 4f89a8d..3335616 100755
--- a/code/preprocessing-eeg/create_mat_s1.m
+++ b/code/preprocessing-eeg/create_mat_s1.m
@@ -1,3 +1,9 @@
+% CREATE_MAT_S1 Extract ERPs from standard EEG data
+%
+% Usage:
+%   Run this script to load preprocessed EEG data (.set files), epoch based on
+%   conditions, and save averaged ERPs to a .mat file.
+
 %
 % Modified on 2024/04/11 to process thrive dataset flanker data
 %
diff --git a/code/preprocessing-eeg/create_valid_eeg.py b/code/preprocessing-eeg/create_valid_eeg.py
index 6743027..5baf411 100755
--- a/code/preprocessing-eeg/create_valid_eeg.py
+++ b/code/preprocessing-eeg/create_valid_eeg.py
@@ -4,12 +4,35 @@
 from glob import glob
 import os
 
+"""
+Script to create valid EEG datasets by filtering subjects and trials.
+
+This script filters behavioral summary data to exclude subjects based on specific
+EEG exclusion criteria and performance thresholds. It creates cleaned CSV files
+for social and non-social conditions.
+
+Usage:
+    python create_valid_eeg.py <session_id>
+
+Arguments:
+    session_id (str): The session identifier.
+"""
+
 
 def replace_outliers_with_nan_cols(df, columns_to_check, sd_thresh=3):
     """
-    Replaces outliers with NaN.
-    Raises KeyError if a column is missing.
-    Returns the modified dataframe if successful.
+    Replaces outliers in specified columns with NaN based on standard deviation threshold.
+
+    Args:
+        df (pd.DataFrame): The input pandas DataFrame.
+        columns_to_check (list): A list of column names to check for outliers.
+        sd_thresh (float, optional): The number of standard deviations to use as the threshold. Defaults to 3.
+
+    Returns:
+        pd.DataFrame: The DataFrame with outliers replaced by NaN.
+
+    Raises:
+        KeyError: If a column in columns_to_check is missing from the DataFrame.
     """
     # 1. Validation: Check all columns exist first
     # If this fails, the error is raised and nothing is returned.
diff --git a/code/preprocessing-eeg/deviation_output.py b/code/preprocessing-eeg/deviation_output.py
index ab73821..33ddb23 100755
--- a/code/preprocessing-eeg/deviation_output.py
+++ b/code/preprocessing-eeg/deviation_output.py
@@ -4,6 +4,20 @@
 import pandas as pd
 import sys
 
+"""
+Script to aggregate deviation reports from subject directories.
+
+This script scans for 'deviation.txt' files in the EEG directories of subjects,
+reads their content, and compiles a CSV report listing deviations and the number
+of EEG files present for each subject.
+
+Usage:
+    python deviation_output.py <session_id>
+
+Arguments:
+    session_id (str): The session identifier.
+"""
+
 session = sys.argv[1]
 base_dir = "/home/data/NDClab/datasets/thrive-dataset/sourcedata/checked/"
 subjects = sorted(glob.glob(os.path.join(base_dir, "sub-3*")))
diff --git a/code/preprocessing-eeg/diff_vhdr.py b/code/preprocessing-eeg/diff_vhdr.py
index bf3b59a..220d8cb 100755
--- a/code/preprocessing-eeg/diff_vhdr.py
+++ b/code/preprocessing-eeg/diff_vhdr.py
@@ -5,7 +5,24 @@
 from rich.table import Table
 from rich.syntax import Syntax
 
+"""
+Script to compare two BrainVision header (.vhdr) files.
+
+This script displays a rich, color-coded unified diff of two text files,
+intended for comparing VHDR files.
+
+Usage:
+    python diff_vhdr.py <file1.vhdr> <file2.vhdr>
+"""
+
 def print_diff_rich(file1_path, file2_path):
+    """
+    Prints a color-coded diff of two files to the console.
+
+    Args:
+        file1_path (str): Path to the first file.
+        file2_path (str): Path to the second file.
+    """
     console = Console()
 
     try:
diff --git a/code/preprocessing-eeg/edit_event_markers_thrive.m b/code/preprocessing-eeg/edit_event_markers_thrive.m
index ddc1b13..2b2d6d1 100755
--- a/code/preprocessing-eeg/edit_event_markers_thrive.m
+++ b/code/preprocessing-eeg/edit_event_markers_thrive.m
@@ -1,4 +1,15 @@
 function EEG = edit_event_markers_thrive(EEG)
+% EDIT_EVENT_MARKERS_THRIVE Add detailed event labels for the Thrive task.
+%
+% Usage:
+%   EEG = edit_event_markers_thrive(EEG)
+%
+% Inputs:
+%   EEG - EEGLAB data structure.
+%
+% Outputs:
+%   EEG - EEGLAB data structure with updated event field containing:
+%         observation, eventType, congruency, accuracy, rt, etc.
 %
 %%%%%% Label event markers%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 %
diff --git a/code/preprocessing-eeg/erp_check.m b/code/preprocessing-eeg/erp_check.m
index 74b2b35..008c53d 100755
--- a/code/preprocessing-eeg/erp_check.m
+++ b/code/preprocessing-eeg/erp_check.m
@@ -1,3 +1,9 @@
+% ERP_CHECK Check ERPs and Count Trials
+%
+% Usage:
+%   Run this script to count trials per condition for each subject and
+%   optionally plot ERPs and topographies for quality checks.
+
 %
 % Modified on 2024/04/11 to process thrive dataset flanker data
 %
diff --git a/code/preprocessing-eeg/load_and_merge_2.m b/code/preprocessing-eeg/load_and_merge_2.m
index 219d3f7..ca348fb 100755
--- a/code/preprocessing-eeg/load_and_merge_2.m
+++ b/code/preprocessing-eeg/load_and_merge_2.m
@@ -1,4 +1,15 @@
 function [EEG] =  load_and_merge(path, force_cut_list)
+% LOAD_AND_MERGE Load and merge raw EEG files.
+%
+% Usage:
+%   EEG = load_and_merge(path, force_cut_list)
+%
+% Inputs:
+%   path           - Directory containing .vhdr/.vmrk files.
+%   force_cut_list - Cell array of conditions to forcibly remove ('social', 'nonsocial').
+%
+% Outputs:
+%   EEG            - Merged EEGLAB data structure.
 
     if nargin < 2
         force_cut_list = {};
diff --git a/code/preprocessing-eeg/make_eeg_pieces.m b/code/preprocessing-eeg/make_eeg_pieces.m
index 81798c6..f5d5242 100755
--- a/code/preprocessing-eeg/make_eeg_pieces.m
+++ b/code/preprocessing-eeg/make_eeg_pieces.m
@@ -1,4 +1,14 @@
 function [eeg_pieces] = make_eeg_pieces(EEG)
+% MAKE_EEG_PIECES Split EEG data into segments based on boundary markers.
+%
+% Usage:
+%   eeg_pieces = make_eeg_pieces(EEG)
+%
+% Inputs:
+%   EEG        - EEGLAB data structure.
+%
+% Outputs:
+%   eeg_pieces - Cell array of EEGLAB structures, each representing a segment.
 
     % Step 1: Extract boundary event latencies
     [~, boundary_indices] = pop_selectevent(EEG, 'type', 'boundary');
diff --git a/code/preprocessing-eeg/not_processed_checked.py b/code/preprocessing-eeg/not_processed_checked.py
index 4413809..d4e1d37 100755
--- a/code/preprocessing-eeg/not_processed_checked.py
+++ b/code/preprocessing-eeg/not_processed_checked.py
@@ -5,6 +5,20 @@
 from pathlib import Path
 import sys
 
+"""
+Script to identify subjects that still need preprocessing.
+
+This script compares the list of subjects in the source directory (checked data)
+against the list of subjects in the derivatives directory (preprocessed data).
+It reports which subjects are pending, and which of those have recorded deviations.
+
+Usage:
+    python not_processed_checked.py <session_id>
+
+Arguments:
+    session_id (str): The session identifier.
+"""
+
 session = sys.argv[1]
 dataset_path = "/home/data/NDClab/datasets/thrive-dataset/"
 data_path = "derivatives/preprocessed/"
diff --git a/code/preprocessing-eeg/plot_topo.m b/code/preprocessing-eeg/plot_topo.m
index 198aa12..10bfdff 100755
--- a/code/preprocessing-eeg/plot_topo.m
+++ b/code/preprocessing-eeg/plot_topo.m
@@ -1,3 +1,9 @@
+% PLOT_TOPO Plot Topographic Maps of ERPs
+%
+% Usage:
+%   Run this script to load ERP data and plot topographic maps for specific
+%   time windows (e.g., 0-100 ms for ERN).
+
 clear % clear matlab workspace
 clc % clear matlab command window
 
diff --git a/code/preprocessing-eeg/preprocess_eeg_piece.m b/code/preprocessing-eeg/preprocess_eeg_piece.m
index e8eb2a3..4d0ffdc 100755
--- a/code/preprocessing-eeg/preprocess_eeg_piece.m
+++ b/code/preprocessing-eeg/preprocess_eeg_piece.m
@@ -1,6 +1,19 @@
 % this function is run inside MADE, so all EEGLab dependencies should be already imported and initialized
 
 function [bad_channels] = preprocess_eeg_piece(eeg_piece, channel_locations, stimulus_timeoffset)
+% PREPROCESS_EEG_PIECE Preprocess a single segment of EEG data.
+%
+% Usage:
+%   bad_channels = preprocess_eeg_piece(eeg_piece, channel_locations, stimulus_timeoffset)
+%
+% Inputs:
+%   eeg_piece           - EEGLAB structure for the segment.
+%   channel_locations   - Channel locations structure/file.
+%   stimulus_timeoffset - Time offset to apply to stimulus markers (ms).
+%
+% Outputs:
+%   bad_channels        - Indices of channels identified as bad by FASTER.
+
     down_sample = 1;
     sampling_rate = 1000;
     adjust_time_offset = 1;
diff --git a/code/preprocessing-eeg/run_MADE_batch.py b/code/preprocessing-eeg/run_MADE_batch.py
index 1f84d86..70a0c42 100755
--- a/code/preprocessing-eeg/run_MADE_batch.py
+++ b/code/preprocessing-eeg/run_MADE_batch.py
@@ -7,6 +7,20 @@
 from glob import glob
 from pathlib import Path
 
+"""
+Script to manage and submit batch jobs for EEG preprocessing (MADE pipeline).
+
+This script identifies subjects that need processing (source exists but derivatives do not),
+categorizes them by deviation status, and allows the user to submit SLURM jobs
+for a list of subjects.
+
+Usage:
+    python run_MADE_batch.py <session_id>
+
+Arguments:
+    session_id (str): The session identifier.
+"""
+
 # --- Configuration ---
 if len(sys.argv) < 2:
     print("Usage: python script.py <session>")
@@ -23,7 +37,13 @@
 
 def get_subjects(glob_pattern):
     """
-    Globs files and returns a set of unique subject IDs.
+    Finds unique subject IDs from file paths matching a glob pattern.
+
+    Args:
+        glob_pattern (str): The pattern to match files.
+
+    Returns:
+        set: A set of unique subject ID strings.
     """
     subjects = set()
     files = glob(glob_pattern)
@@ -70,6 +90,14 @@ def get_subjects(glob_pattern):
 # --- 3. Output ---
 
 def print_group(title, data_set, show_ids=True):
+    """
+    Prints a summary of a group of subjects.
+
+    Args:
+        title (str): The title of the group.
+        data_set (set): A set of subject IDs.
+        show_ids (bool, optional): Whether to list the individual IDs. Defaults to True.
+    """
     sorted_list = sorted(list(data_set))
     print("")
     print(f"{title}: {len(sorted_list)}")
diff --git a/code/preprocessing-eeg/stim_cnt_check_csv.m b/code/preprocessing-eeg/stim_cnt_check_csv.m
index d951d8b..6d64970 100755
--- a/code/preprocessing-eeg/stim_cnt_check_csv.m
+++ b/code/preprocessing-eeg/stim_cnt_check_csv.m
@@ -1,3 +1,9 @@
+% STIM_CNT_CHECK_CSV Count Stimulus Markers from Raw Data
+%
+% Usage:
+%   Run this script to scan raw VHDR files (specifically those with deviations)
+%   and count stimulus markers to verify data integrity. Saves results to CSV.
+
 %clear all;
 %clc;
 
diff --git a/code/preprocessing-eeg/stim_cnt_check_processed.m b/code/preprocessing-eeg/stim_cnt_check_processed.m
index d68d16d..aa8e51f 100755
--- a/code/preprocessing-eeg/stim_cnt_check_processed.m
+++ b/code/preprocessing-eeg/stim_cnt_check_processed.m
@@ -1,3 +1,9 @@
+% STIM_CNT_CHECK_PROCESSED Count Trials in Processed Data
+%
+% Usage:
+%   Run this script to scan processed .set files and count valid trials
+%   per condition. Saves results to CSV.
+
 %clear all;
 %clc;
 
diff --git a/code/preprocessing-eeg/stim_cnt_check_processed_csd.m b/code/preprocessing-eeg/stim_cnt_check_processed_csd.m
index 3a50a8a..066c54f 100755
--- a/code/preprocessing-eeg/stim_cnt_check_processed_csd.m
+++ b/code/preprocessing-eeg/stim_cnt_check_processed_csd.m
@@ -1,4 +1,10 @@
 
+% STIM_CNT_CHECK_PROCESSED_CSD Count Trials in Processed CSD Data
+%
+% Usage:
+%   Run this script to scan processed CSD .set files and count valid trials.
+%   Saves results to CSV.
+
 %clear all;
 %clc;
 
diff --git a/code/preprocessing-eeg/stim_cnt_check_thrive.m b/code/preprocessing-eeg/stim_cnt_check_thrive.m
index b1c9d0a..196089f 100755
--- a/code/preprocessing-eeg/stim_cnt_check_thrive.m
+++ b/code/preprocessing-eeg/stim_cnt_check_thrive.m
@@ -1,4 +1,15 @@
 function [stim_count_nonsoc, stim_count_soc] = stim_cnt_check_thrive(path)
+% STIM_CNT_CHECK_THRIVE Count stimulus markers in a directory.
+%
+% Usage:
+%   [stim_count_nonsoc, stim_count_soc] = stim_cnt_check_thrive(path)
+%
+% Inputs:
+%   path              - Directory containing .vhdr files.
+%
+% Outputs:
+%   stim_count_nonsoc - Count of non-social stimulus markers.
+%   stim_count_soc    - Count of social stimulus markers.
 
     stim_events_nonsoc = {'S 41', 'S 42', 'S 43', 'S 44'};
     stim_events_soc = {'S 51', 'S 52', 'S 53', 'S 54'};
diff --git a/code/preprocessing-eeg/transform_csd_s1.m b/code/preprocessing-eeg/transform_csd_s1.m
index e9eb338..a8e4d28 100755
--- a/code/preprocessing-eeg/transform_csd_s1.m
+++ b/code/preprocessing-eeg/transform_csd_s1.m
@@ -1,4 +1,10 @@
 
+% TRANSFORM_CSD_S1 Apply CSD Transformation
+%
+% Usage:
+%   Run this script to apply Current Source Density (CSD) transformation
+%   to preprocessed EEG data.
+
 %clear % clear matlab workspace
 cluster = parcluster('local');
 %% Setting up other things
diff --git a/code/readme.md b/code/readme.md
index 2355644..4a6443e 100755
--- a/code/readme.md
+++ b/code/readme.md
@@ -1,7 +1,30 @@
 # Code
 
-### Instructions
-This folder contains the code used to create post-processed derivatives, statistics, and figures.
+## Overview
+This directory contains all source code for the project, organized by analysis stage and modality.
 
+## Subdirectories
 
-### Project Notes
+### `behavior/`
+Python scripts for parsing, cleaning, and analyzing behavioral data from the Flanker task. Includes quality control checks (`check_status.py`, `check_subject_csv.py`).
+
+### `ddm/`
+Code for the Shrinking Spotlight Protocol (SSP) Drift Diffusion Model.
+*   **R/C++**: `SSP_DDM_fitting.R` and `simSSP_model_GB_noScale.cpp` implement the model fitting and simulation.
+*   **Python**: Scripts for batch submission and result aggregation.
+
+### `preprocessing-eeg/`
+The MADE (Maryland Analysis of Developmental EEG) preprocessing pipeline.
+*   **MATLAB**: Core pipeline functions (`MADE_pipeline.m`, `preprocess_eeg_piece.m`).
+*   **Python**: Batch management scripts (`run_MADE_batch.py`, `check_preprocessed_files.py`).
+
+### `postprocessing/`
+Scripts to extract measures from preprocessed EEG data.
+*   **ERPs**: `compute_erp_means.py`
+*   **Time-Frequency**: `create_tf_arrays.py`, `compute_means_TF.py`, `compute_means_ICPS.py`.
+
+### `figures/`
+Code to generate visualizations, such as Grand Average ERP plots (`plot_erp.py`, `plot_erp.m`).
+
+### `statistics/`
+R scripts for running Linear Mixed-Effects Models (LMM) and exporting results to APA-style tables.
diff --git a/code/statistics/find_newest_file.R b/code/statistics/find_newest_file.R
index 7025b87..5741696 100644
--- a/code/statistics/find_newest_file.R
+++ b/code/statistics/find_newest_file.R
@@ -1,3 +1,16 @@
+#' Find Newest File Utility
+#'
+#' Provides a function to locate the most recently modified file matching a specific pattern.
+
+#' Find the Newest File Matching a Pattern
+#'
+#' Searches for files matching the given glob pattern and returns the path
+#' to the file with the most recent modification time.
+#'
+#' @param path_pattern A string containing the glob pattern to match files (e.g., "path/to/*.csv").
+#'
+#' @return A string containing the path to the newest file, or NULL if no files are found.
+#' @export
 find_newest_file <- function(path_pattern) {
   # Sys.glob expands wildcards similarly to python's glob
   matching_files <- Sys.glob(path_pattern)
diff --git a/code/statistics/lmer_export_apa.R b/code/statistics/lmer_export_apa.R
index 8d3dfab..854c113 100644
--- a/code/statistics/lmer_export_apa.R
+++ b/code/statistics/lmer_export_apa.R
@@ -2,6 +2,21 @@ library(dplyr)
 library(tibble)
 library(flextable)
 
+#' LMER Export to APA Table
+#'
+#' Functions to format and export results from Linear Mixed-Effects Models (lmer)
+#' into APA-style Word documents.
+
+#' Export LMER Model Summary to APA-Style Word Document
+#'
+#' Extracts coefficients, confidence intervals, and p-values from an lmer model,
+#' formats them into a nice table, and saves it as a .docx file.
+#'
+#' @param model An lmer model object.
+#' @param path String. The file path where the .docx table should be saved.
+#'
+#' @return No return value, called for side effect of creating a file.
+#' @export
 lmer_export_apa <- function(model, path) {
   
   output_model <- as.data.frame(summary(model)$coefficients)
@@ -38,6 +53,14 @@ lmer_export_apa <- function(model, path) {
   
 }
 
+#' Rename Model Parameters for Publication
+#'
+#' Renames raw variable names from the model output into human-readable labels
+#' suitable for publication (e.g., "acc" -> "Accuracy"). Handles interaction terms.
+#'
+#' @param parameter_vector A character vector of parameter names to rename.
+#'
+#' @return A character vector of renamed parameters.
 rename_parameters <- function(parameter_vector) {
   
   # Define the mapping for single term renaming