Skip to content

Commit 91a6be8

Browse files
authored
Merge pull request #9 from wazder/copilot/add-copilot-review-folder
Add comprehensive project audit with 50+ identified issues
2 parents 0324757 + 73dc649 commit 91a6be8

12 files changed

Lines changed: 4956 additions & 0 deletions
Lines changed: 204 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,204 @@
1+
# 🔴 CRITICAL ISSUES
2+
3+
These issues must be fixed as they cause or can cause runtime errors, type checking failures, or data loss.
4+
5+
## 1. Type Annotation Error in ZuCo Dataset
6+
7+
**Severity**: 🔴 CRITICAL
8+
**File**: `src/data/zuco_dataset.py`
9+
**Line**: 174
10+
11+
### Issue
12+
```python
13+
def get_dataset_info(self) -> Dict[str, any]: # ❌ WRONG
14+
```
15+
16+
### Problem
17+
- Uses lowercase `any` instead of `Any`
18+
- `any` is a built-in function, not a type annotation
19+
- Type checkers (mypy) will fail
20+
- IDE type hints won't work properly
21+
22+
### Fix
23+
```python
24+
def get_dataset_info(self) -> Dict[str, Any]: # ✓ CORRECT
25+
```
26+
27+
### Impact
28+
- Type checking fails
29+
- CI/CD pipeline mypy checks will fail
30+
- Developer experience degraded (no proper type hints)
31+
32+
---
33+
34+
## 2. Task Directory Naming Inconsistency
35+
36+
**Severity**: 🔴 CRITICAL
37+
**Files**: Multiple scripts
38+
39+
### Issue
40+
Scripts use inconsistent task naming conventions:
41+
- Some use: `task1_SR`, `task2_NR`, `task3_TSR` (underscores)
42+
- Others use: `task1-SR`, `task2-NR`, `task3-TSR` (hyphens)
43+
44+
### Affected Files
45+
| File | Format Used |
46+
|------|-------------|
47+
| `scripts/download_zuco.py` | underscores |
48+
| `scripts/generate_synthetic_data.py` | underscores |
49+
| `scripts/train_with_real_zuco.py` | **hyphens** |
50+
| `scripts/inspect_zuco_mat.py` | **hyphens** |
51+
| `scripts/download_zuco_manual.sh` | **hyphens** |
52+
| `scripts/download_zuco_simple.sh` | mixed |
53+
54+
### Problem
55+
```python
56+
# In train_with_real_zuco.py (line ~42-45)
57+
task_dirs = ['task1-SR', 'task2-NR', 'task3-TSR'] # hyphens
58+
59+
# But download_zuco.py creates:
60+
task_dirs = ['task1_SR', 'task2_NR', 'task3_TSR'] # underscores
61+
```
62+
63+
### Impact
64+
- Runtime `FileNotFoundError` when scripts expect one format but data uses another
65+
- Training scripts won't find downloaded data
66+
- Manual intervention required to rename directories
67+
68+
### Fix
69+
Standardize on one convention (recommend underscores to match Python naming):
70+
- Update all scripts to use `task1_SR` format
71+
- Or update all scripts to use `task1-SR` format consistently
72+
73+
---
74+
75+
## 3. Duplicate Configuration Files
76+
77+
**Severity**: 🔴 CRITICAL
78+
**Files**: `setup.cfg`, `pyproject.toml`
79+
80+
### Issue
81+
Both files define overlapping tool configurations with potentially conflicting values.
82+
83+
### Conflicts Found
84+
85+
#### pytest Markers
86+
- **setup.cfg** (line 22): Defines "benchmark" marker
87+
- **pyproject.toml** (lines 17-23): Omits "benchmark" marker
88+
89+
#### Coverage Settings
90+
- **setup.cfg** (line 30): Excludes `conftest.py`
91+
- **pyproject.toml**: Doesn't exclude `conftest.py`
92+
93+
### Problem
94+
- Tools may use different configuration files
95+
- Behavior becomes unpredictable
96+
- CI/CD results may differ from local development
97+
98+
### Impact
99+
- Test results inconsistent
100+
- Coverage reports differ between environments
101+
- Difficult to debug test issues
102+
103+
### Fix
104+
Choose one configuration format (recommend `pyproject.toml` as modern standard):
105+
1. Consolidate all tool configs into `pyproject.toml`
106+
2. Remove tool configurations from `setup.cfg`
107+
3. Keep only packaging metadata in `setup.cfg` if needed, or migrate entirely to `pyproject.toml`
108+
109+
---
110+
111+
## 4. Hardcoded User-Specific Path
112+
113+
**Severity**: 🔴 CRITICAL
114+
**File**: `docs/guides/RUN_ME_FIRST.md`
115+
**Line**: 8
116+
117+
### Issue
118+
```bash
119+
cd /Users/wazder/Documents/GitHub/NEST # ← User-specific path
120+
```
121+
122+
### Problem
123+
- Hardcoded path to specific user's machine
124+
- Won't work for any other user
125+
- Copy-paste error will confuse new users
126+
127+
### Impact
128+
- New users cannot follow quick start guide
129+
- First impression of project is negative
130+
- Documentation appears unprofessional
131+
132+
### Fix
133+
```bash
134+
cd /path/to/your/NEST # or simply:
135+
# Navigate to your NEST directory
136+
cd NEST
137+
```
138+
139+
---
140+
141+
## 5. Requirements File Duplication
142+
143+
**Severity**: 🔴 CRITICAL
144+
**Files**: `requirements.txt`, `requirements-dev.txt`
145+
146+
### Issue
147+
Development dependencies duplicated across both files:
148+
149+
**Duplicated packages** (15+ packages):
150+
- pytest (and all pytest plugins)
151+
- black, isort, flake8, mypy, pylint
152+
- bandit, safety
153+
- radon, coverage
154+
- sphinx, sphinx-rtd-theme
155+
156+
### Problem
157+
```text
158+
requirements.txt (lines 30-43):
159+
pytest>=7.0.0
160+
pytest-cov>=4.0.0
161+
black>=22.0.0
162+
flake8>=4.0.0
163+
...
164+
165+
requirements-dev.txt (lines 5-31):
166+
pytest>=7.0.0 # ← DUPLICATE
167+
pytest-cov>=4.0.0 # ← DUPLICATE
168+
black>=22.0.0 # ← DUPLICATE
169+
...
170+
```
171+
172+
### Impact
173+
- Installing `requirements.txt` installs dev tools in production
174+
- Larger Docker images
175+
- Potential security surface increase
176+
- Confusion about which file to use
177+
178+
### Fix
179+
1. Remove all development/testing tools from `requirements.txt`
180+
2. Keep only runtime dependencies in `requirements.txt`:
181+
- torch, transformers, lightning
182+
- numpy, scipy, scikit-learn
183+
- pyyaml, tqdm, wandb
184+
3. Keep all dev tools in `requirements-dev.txt`
185+
4. Update documentation to show:
186+
```bash
187+
pip install -r requirements.txt # Production
188+
pip install -r requirements-dev.txt # Development
189+
```
190+
191+
---
192+
193+
## Summary
194+
195+
These 5 critical issues should be addressed immediately:
196+
197+
1. ✅ Fix type annotation (`any``Any`)
198+
2. ✅ Standardize task directory naming
199+
3. ✅ Consolidate configuration files
200+
4. ✅ Fix hardcoded user path
201+
5. ✅ Separate production and dev dependencies
202+
203+
**Estimated effort**: 2-3 hours
204+
**Risk if not fixed**: Runtime errors, failed builds, user confusion

0 commit comments

Comments
 (0)