13 changes: 13 additions & 0 deletions database/README.MD
@@ -1 +1,14 @@
This script generates synthetic data for users and request tables.

Comment on lines +1 to 2
Copilot AI Mar 30, 2026

This README says the script generates data only for the users/request tables, but `generate_mock_data.py` writes additional outputs (`volunteer_details`, `volunteers_assigned`, `request_comments`). Please update the description/output list so it reflects what the script actually generates.

Copilot uses AI. Check for mistakes.
How to run:

1. Install dependencies:
pip install faker pandas

2. Run script:
python generate_mock_data.py

Output:

- users.csv
- request.csv
Comment on lines +8 to +14
Copilot AI Mar 30, 2026

The run command `python generate_mock_data.py` is ambiguous because the script lives under `database/mock-data-generation/`. Please clarify the correct invocation (e.g., run from the repo root with the proper path) and optionally note that the output directory is `database/mock_db/`.

Suggested change
- 2. Run script:
- python generate_mock_data.py
- Output:
- - users.csv
- - request.csv
+ 2. Run script from the repo root:
+ python database/mock-data-generation/generate_mock_data.py
+ Output (written to database/mock_db/):
+ - database/mock_db/users.csv
+ - database/mock_db/request.csv
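
One way to make the invocation independent of the working directory is to resolve the output folder from the script's own location. The sketch below is only an illustration: `mock_db_dir` is a hypothetical helper, and it assumes the layout described in this thread (script under `database/mock-data-generation/`, CSVs under `database/mock_db/`).

```python
from pathlib import Path


def mock_db_dir(script_path: str) -> Path:
    """Return the mock_db directory that sits beside the script's folder.

    Assumed layout (taken from the review comments, not verified):
    <repo>/database/mock-data-generation/<script>.py
    <repo>/database/mock_db/
    """
    return Path(script_path).resolve().parent.parent / "mock_db"


# Inside the real script this would be called as mock_db_dir(__file__).
example = mock_db_dir("database/mock-data-generation/generate_mock_data.py")
print(example.name)         # mock_db
print(example.parent.name)  # database
```

With a helper like this, `python generate_mock_data.py` and `python database/mock-data-generation/generate_mock_data.py` would write to the same place, so the README could document either form.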

29 changes: 29 additions & 0 deletions database/mock-data-generation/fix_foreign_keys.py
@@ -0,0 +1,29 @@
import pandas as pd
import random

# Load generated CSVs
Comment on lines +1 to +4
Copilot AI Mar 30, 2026

The PR description says `fix_foreign_keys.py` was removed because it’s no longer needed, but this PR adds it back. If FK integrity is now guaranteed by `generate_mock_data.py`, this script should be deleted (or the PR description updated to explain why it’s still required).
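
A quick way to decide whether the fixer is still needed is to check the generated CSVs for orphaned foreign keys. The following is a dependency-free sketch, not code from this PR: `orphaned_values` is a hypothetical helper and the in-memory rows are made-up stand-ins for `users.csv` and `request.csv`.

```python
import csv
import io


def orphaned_values(child_rows, fk_col, parent_rows, pk_col):
    """Return FK values in child_rows that have no match in parent_rows."""
    parent_keys = {row[pk_col] for row in parent_rows}
    return [row[fk_col] for row in child_rows if row[fk_col] not in parent_keys]


# Made-up sample data; the real files live under database/mock_db/.
users = list(csv.DictReader(io.StringIO("user_id\n1\n2\n3\n")))
requests = list(csv.DictReader(io.StringIO("request_id,req_user_id\n10,1\n11,99\n")))

orphans = orphaned_values(requests, "req_user_id", users, "user_id")
print(orphans)  # ['99']
```

If a check along these lines returns an empty list for every parent/child pair, that would support deleting the script rather than re-adding it.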

users_df = pd.read_csv("../mock_db/users.csv")
request_df = pd.read_csv("../mock_db/request.csv")
comments_df = pd.read_csv("../mock_db/request_comments.csv")
Comment on lines +5 to +7
Copilot AI Mar 30, 2026

This script assumes `request.csv` has a `req_id` column (see the use of `request_df['req_id']` later), but the committed `database/mock_db/request.csv` header is `request_id,...` (no `req_id`). Running this against the repo’s `mock_db/` files will raise `KeyError`. Please align the fixer with the actual CSV schema or remove it if no longer needed.
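
If the fixer is kept, a header check before any column access would turn the `KeyError` into a clearer message. This is a minimal sketch under stated assumptions: `missing_columns` is a hypothetical helper, and the sample header is invented for illustration because the full `request.csv` header is not shown here.

```python
import csv
import io


def missing_columns(csv_text, required):
    """Return the required column names that are absent from the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return [name for name in required if name not in header]


# Invented sample: a header that uses request_id where the script expects req_id.
sample = "request_id,req_user_id\n10,1\n"
print(missing_columns(sample, ["req_id", "req_user_id"]))  # ['req_id']
```

The same check could run once per file at the top of the script, failing fast with the list of missing names.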

volunteers_df = pd.read_csv("../mock_db/volunteer_details.csv")
assigned_df = pd.read_csv("../mock_db/volunteers_assigned.csv")

# Fix request table
request_df['req_user_id'] = request_df['req_user_id'].apply(lambda x: random.choice(users_df['user_id']))
request_df.to_csv("../mock_db/request.csv", index=False)

# Fix comments table
comments_df['req_id'] = comments_df['req_id'].apply(lambda x: random.choice(request_df['req_id']))
comments_df['commenter_id'] = comments_df['commenter_id'].apply(lambda x: random.choice(users_df['user_id']))
Comment on lines +16 to +17
Copilot AI Mar 30, 2026

This script updates `comments_df['req_id']` and `comments_df['commenter_id']`, but the committed `database/mock_db/request_comments.csv` header is `comment_id,request_id,user_id,comment` (no `req_id`/`commenter_id`). This will also raise `KeyError` unless the CSV schema is updated to match.

Suggested change
- comments_df['req_id'] = comments_df['req_id'].apply(lambda x: random.choice(request_df['req_id']))
- comments_df['commenter_id'] = comments_df['commenter_id'].apply(lambda x: random.choice(users_df['user_id']))
+ comments_df['request_id'] = comments_df['request_id'].apply(lambda x: random.choice(request_df['req_id']))
+ comments_df['user_id'] = comments_df['user_id'].apply(lambda x: random.choice(users_df['user_id']))
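
Note that the suggested lines still look up `request_df['req_id']`. If the `request.csv` header is `request_id` as the earlier comment states, that lookup would presumably need the same rename. The sketch below shows the reassignment with the parent and child column names passed explicitly, so a mismatch is visible at the call site. It uses plain dictionaries rather than pandas, and `reassign_fk` and the sample rows are hypothetical.

```python
import random


def reassign_fk(child_rows, fk_col, parent_rows, pk_col, rng):
    """Overwrite fk_col in every child row with a randomly chosen parent key."""
    parent_keys = [row[pk_col] for row in parent_rows]
    if not parent_keys:
        raise ValueError(f"no parent keys found in column {pk_col!r}")
    for row in child_rows:
        row[fk_col] = rng.choice(parent_keys)
    return child_rows


rng = random.Random(0)  # seeded so repeated runs produce the same assignment
requests = [{"request_id": 10}, {"request_id": 11}]
comments = [{"comment_id": 1, "request_id": 999, "user_id": 5, "comment": "hi"}]

reassign_fk(comments, "request_id", requests, "request_id", rng)
print(comments[0]["request_id"] in {10, 11})  # True
```

Seeding the generator is a design choice for reproducible mock data; the script in this PR uses the module-level `random.choice`, which is unseeded.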

comments_df.to_csv("../mock_db/request_comments.csv", index=False)

# Fix volunteer details table
volunteers_df['user_id'] = volunteers_df['user_id'].apply(lambda x: random.choice(users_df['user_id']))
volunteers_df.to_csv("../mock_db/volunteer_details.csv", index=False)

# Fix volunteer assignments
assigned_df['request_id'] = assigned_df['request_id'].apply(lambda x: random.choice(request_df['req_id']))
assigned_df['volunteer_id'] = assigned_df['volunteer_id'].apply(lambda x: random.choice(volunteers_df['user_id']))
assigned_df.to_csv("../mock_db/volunteers_assigned.csv", index=False)

print("All foreign keys fixed successfully!")