Commit 1674c16

feat: add fuzzing infrastructure replacement plan and severity normalization
- Added detailed roadmap for replacing custom fuzzer with ffuf, including timeline, architecture, and success metrics
- Implemented severity normalization in database client to convert all values to lowercase for Go compatibility
- Added documentation for severity normalization with examples and migration instructions
- Created new unit tests to verify severity case handling
- Updated database client code to normalize severity values
1 parent e508206 commit 1674c16

7 files changed: +943 -17 lines changed

ROADMAP.md

Lines changed: 260 additions & 4 deletions
@@ -137,12 +137,268 @@ workers/tools/katana/

---

## 🎯 PLANNED: Fuzzing Infrastructure Replacement (Nov 2025)

**Status**: ⏳ PLANNED
**Priority**: HIGH - Replace prototype fuzzer with industry-standard tool
**Duration**: 2-3 weeks (16-18 hours)
**Impact**: 10x performance improvement, production-grade reliability

### Problem Statement

**Current Custom Fuzzer Issues**:
- **Code Quality**: 0% test coverage, broken interfaces, unsafe concurrency
- **Performance**: 50-100 req/sec vs 500-2000 req/sec industry standard
- **Completeness**: 3 advertised features not implemented (TODOs in code)
- **Production Readiness**: 36-60 hours needed to make production-ready
- **Maintenance**: High burden (single developer, no tests, hardcoded paths)

**Assessment**: Current fuzzer at PROTOTYPE stage, not suitable for production use.

### Solution: Hybrid Approach

**Replace 80% → Use ffuf** (Core fuzzing):
- Directory/file discovery
- Subdomain enumeration
- Virtual host detection
- Basic parameter fuzzing

**Keep 20% → Custom fuzzer** (Specialized attacks):
- HTTP Parameter Pollution (HPP) detection
- Type confusion testing
- ML-based parameter prediction
- Framework-aware fuzzing patterns

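
To make the first specialized attack concrete, here is a minimal Python sketch of how HTTP Parameter Pollution probe URLs could be generated. `hpp_variants` is a hypothetical helper invented for this illustration; it is not the project's `hpp_detector` code, and the choice of two variants (duplicate appended, duplicate prepended) is an assumption.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def hpp_variants(url: str, param: str, injected: str) -> list[str]:
    """Return probe URLs in which `param` appears twice (hypothetical helper).

    Servers differ in whether they honour the first or the last occurrence of a
    repeated parameter, so both orderings are generated.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    appended = pairs + [(param, injected)]   # duplicate placed last
    prepended = [(param, injected)] + pairs  # duplicate placed first
    return [
        urlunsplit(parts._replace(query=urlencode(query)))
        for query in (appended, prepended)
    ]
```

A detector would then request each variant and compare the responses against the baseline; that comparison logic is outside the scope of this sketch.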
### Implementation Timeline

#### Week 1: ffuf Integration (6 hours)

**Day 1-2: Core Integration** (4 hours)
- [ ] Add ffuf as git submodule: `workers/tools/ffuf`
- [ ] Install ffuf Go dependencies: `go get github.com/ffuf/ffuf`
- [ ] Create wrapper module: `internal/discovery/ffuf_integration.go`
  - Implement `DiscoveryModule` interface
  - Priority: 55 (runs after katana)
  - Handle directory, file, vhost, subdomain fuzzing

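
As an illustration of what the wrapper has to assemble for each fuzzing mode, the sketch below builds ffuf command lines for directory and vhost fuzzing. It is written in Python for brevity even though the planned wrapper is a Go module, and `build_ffuf_command` and its mode names are hypothetical. The ffuf flags used (`-u`, `-w`, `-H`, `-of`, `-o`) and the `FUZZ` keyword are standard ffuf usage.

```python
from urllib.parse import urlsplit


def build_ffuf_command(mode: str, target: str, wordlist: str, output_path: str) -> list[str]:
    """Build an ffuf argument list for one fuzzing mode (hypothetical helper)."""
    # Common part: wordlist plus JSON output written to a file.
    base = ["ffuf", "-w", wordlist, "-of", "json", "-o", output_path]
    if mode == "directory":
        # FUZZ is substituted with each wordlist entry in the URL path.
        return base + ["-u", f"{target.rstrip('/')}/FUZZ"]
    if mode == "vhost":
        # For vhost detection the URL stays fixed and the Host header is fuzzed.
        host = urlsplit(target).hostname
        return base + ["-u", target, "-H", f"Host: FUZZ.{host}"]
    raise ValueError(f"unsupported mode: {mode}")
```

The resulting list can be passed to a process runner without shell quoting; file and subdomain modes would follow the same pattern with a different `FUZZ` position.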
**Day 3: Engine Integration** (2 hours)
- [ ] Register ffuf module with discovery engine
- [ ] Add to `NewEngine()` in `internal/discovery/engine.go`
- [ ] Configure hybrid CLI + fallback mode (like subfinder pattern)
- [ ] Add graceful degradation when ffuf binary not available

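
The graceful-degradation item can be sketched as a binary lookup that selects a fallback instead of failing. This is a Python illustration only: `EngineChoice` and `choose_engine` are hypothetical names, and what the fallback actually does is not specified in this plan, so the sketch only records the decision.

```python
import shutil
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class EngineChoice:
    engine: str                 # "ffuf" when the binary was found, else "fallback"
    binary_path: Optional[str]  # absolute path to ffuf, or None


def choose_engine(
    lookup: Callable[[str], Optional[str]] = shutil.which,
) -> EngineChoice:
    """Pick the ffuf CLI when it is on PATH, otherwise degrade to the fallback."""
    path = lookup("ffuf")
    if path is None:
        # Assumption: the caller treats "fallback" as a non-fatal, degraded mode.
        return EngineChoice(engine="fallback", binary_path=None)
    return EngineChoice(engine="ffuf", binary_path=path)
```

Passing `lookup` as a parameter keeps the decision testable on machines where ffuf is not installed.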
#### Week 2: Custom Fuzzer Refactor (8-10 hours)

**Day 1-2: Fix Critical Issues** (6 hours)
- [ ] Fix interface compliance: Add `Type()` and `Validate()` methods
- [ ] Fix unsafe concurrency: Add mutex for body reading
- [ ] Fix wordlist handling: Proper error reporting, configurable paths
- [ ] Remove basic fuzzing code (now handled by ffuf)

**Day 3: Specialized Features Only** (2 hours)
- [ ] Keep only specialized attack modules:
  - `pkg/fuzzing/hpp_detector.go` - HTTP Parameter Pollution
  - `pkg/fuzzing/type_confusion.go` - Type confusion testing
  - `pkg/fuzzing/ml_predictor.go` - ML-based predictions
- [ ] Remove 80% of fuzzer code (basic operations)
- [ ] Wire specialized fuzzer to run AFTER ffuf on interesting findings

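
One way to picture the hand-off from ffuf to the specialized fuzzer is a filter over ffuf's JSON report. The Python sketch below assumes the report has a top-level `results` array whose entries carry `url` and `status` fields, which matches ffuf's JSON output format. The definition of "interesting" (here, a fixed set of status codes) is an assumption made for illustration, as is the name `interesting_urls`.

```python
import json

# Assumption: which responses count as "interesting" is not defined in this plan.
INTERESTING_STATUSES = frozenset({200, 401, 403})


def interesting_urls(ffuf_json: str, statuses=INTERESTING_STATUSES) -> list[str]:
    """Return URLs from an ffuf JSON report whose status code is in `statuses`."""
    report = json.loads(ffuf_json)
    return [
        result["url"]
        for result in report.get("results", [])
        if result.get("status") in statuses
    ]
```

The returned URLs would then be the only targets passed to the HPP, type confusion, and ML modules.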
**Day 4: Testing** (2 hours)
- [ ] Add integration tests: `internal/discovery/ffuf_integration_test.go`
- [ ] Test ffuf discovery module registration
- [ ] Test hybrid CLI + fallback mode
- [ ] Test specialized fuzzer only runs on high-value targets

#### Week 3: CLI & Documentation (4 hours)

**Day 1: Update CLI Commands** (2 hours)
- [ ] Update `cmd/fuzz.go` to use ffuf backend by default
- [ ] Add `--use-custom` flag for specialized fuzzing
- [ ] Update help text to reflect new architecture
- [ ] Add performance benchmarks to help text

**Day 2: Documentation** (2 hours)
- [ ] Update CLAUDE.md with new fuzzing architecture
- [ ] Document ffuf vs custom fuzzer use cases
- [ ] Add installation instructions for ffuf binary
- [ ] Update README.md fuzzing examples

### Architecture Diagram

```
┌────────────────────────────────────────────────────────┐
│               Shells Discovery Pipeline                │
├────────────────────────────────────────────────────────┤
│                                                        │
│  Phase 1: Passive Reconnaissance                       │
│  ├─ subfinder (90) → Subdomain enumeration             │
│  ├─ dnsx (85)      → DNS resolution                    │
│  └─ tlsx (80)      → Certificate transparency          │
│                                                        │
│  Phase 2: Active Service Detection                     │
│  ├─ httpx (70)     → HTTP probing                      │
│  └─ katana (60)    → Web crawling                      │
│                                                        │
│  Phase 3: Fuzzing (NEW)                                │
│  ├─ ffuf (55)      → Fast basic fuzzing                │
│  │  ├─ Directory discovery (500+ req/sec)              │
│  │  ├─ File fuzzing                                    │
│  │  ├─ Vhost detection                                 │
│  │  └─ Parameter mining                                │
│  │                                                     │
│  └─ Custom (50)    → Specialized attacks               │
│     ├─ HTTP Parameter Pollution                        │
│     ├─ Type confusion detection                        │
│     ├─ ML parameter prediction                         │
│     └─ Framework-aware patterns                        │
│                                                        │
│  Phase 4: Vulnerability Scanning                       │
│  └─ (Future: nuclei integration)                       │
│                                                        │
└────────────────────────────────────────────────────────┘
```

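
The numbers in parentheses are module priorities. Since ffuf at 55 is described as running after katana at 60, a higher number runs earlier. A minimal Python sketch of that ordering follows; the dictionary and function names are hypothetical and only the priority values come from the diagram.

```python
# Priorities as listed in the diagram; a higher value runs earlier.
MODULE_PRIORITIES = {
    "subfinder": 90,
    "dnsx": 85,
    "tlsx": 80,
    "httpx": 70,
    "katana": 60,
    "ffuf": 55,
    "custom": 50,
}


def execution_order(priorities: dict[str, int]) -> list[str]:
    """Return module names sorted from highest to lowest priority."""
    return sorted(priorities, key=priorities.get, reverse=True)
```

With these values ffuf lands directly after katana and directly before the custom specialized fuzzer.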
### Performance Comparison

| Metric | Custom Fuzzer | ffuf | Improvement |
|--------|---------------|------|-------------|
| **Speed** | 50-100 req/s | 500-2000 req/s | **10-20x faster** |
| **Test Coverage** | 0% | High (battle-tested) | **∞ more reliable** |
| **Memory Usage** | High (unbounded buffers) | Low (optimized) | **5-10x more efficient** |
| **Maintenance** | Medium-High | Low | **Less dev time** |
| **Features** | Incomplete (3 TODOs) | Complete | **More features** |
| **Maturity** | 0 years | 8+ years | **Battle-tested** |

### Cost-Benefit Analysis

**Option A: Fix Custom Fuzzer**
- Development time: 36-60 hours
- Risk: Medium (no production testing)
- Performance: Suboptimal
- Result: Still slower than ffuf

**Option B: Integrate ffuf (RECOMMENDED)**
- Development time: 16-18 hours (62% less)
- Risk: Very low (proven tool, 8.8k GitHub stars)
- Performance: 10-20x faster
- Result: Production-ready immediately

**ROI: Option B saves 20-40 hours AND delivers 10x better performance**

### Success Metrics

**Performance Goals**:
- [ ] Directory fuzzing: 500+ req/sec (10x improvement)
- [ ] Memory usage: <100MB for 10k wordlist (vs current unbounded)
- [ ] Zero data races (vs current unsafe concurrency)

**Quality Goals**:
- [ ] 100% interface compliance (vs current broken)
- [ ] Test coverage >80% for integration layer
- [ ] Zero TODOs in production code (vs current 3)

**User Experience Goals**:
- [ ] Faster scans = better value for bug bounty hunters
- [ ] Reliable results = increased trust
- [ ] Industry-standard tool = familiar to security researchers

### Dependencies

**Required**:
- ffuf binary installed or available in PATH
- Go dependencies: `github.com/ffuf/ffuf`

**Optional**:
- Wordlists in `/opt/shells/wordlists/` (will use defaults if missing)
- Custom ffuf config in `.shells.yaml`

### Rollout Plan

**Phase 1: Soft Launch** (Week 1-2)
- ffuf available via `shells fuzz --engine ffuf`
- Custom fuzzer remains default
- Gather user feedback

**Phase 2: Default Switch** (Week 3)
- ffuf becomes default engine
- Custom fuzzer available via `--engine custom`
- Deprecation notice for custom engine

**Phase 3: Cleanup** (Week 4+)
- Remove deprecated custom fuzzer code (80% reduction)
- Keep only specialized attack modules
- Archive custom fuzzer docs

### Risk Mitigation

**Risk 1: ffuf binary not available**
- Mitigation: Hybrid CLI + fallback mode (same as subfinder)
- Fallback: Return mock data for testing

**Risk 2: Performance expectations not met**
- Mitigation: Benchmark before release
- Fallback: Keep custom fuzzer available

**Risk 3: Feature gaps in ffuf**
- Mitigation: Keep custom fuzzer for specialized attacks
- Solution: Hybrid approach covers all use cases

### Files to Create/Modify

**New Files**:
- `internal/discovery/ffuf_integration.go` - ffuf wrapper module
- `internal/discovery/ffuf_integration_test.go` - integration tests
- `pkg/fuzzing/hpp_detector.go` - Extract HPP detection
- `pkg/fuzzing/type_confusion.go` - Extract type confusion
- `pkg/fuzzing/ml_predictor.go` - Extract ML predictions

**Modified Files**:
- `internal/discovery/engine.go` - Register ffuf module
- `cmd/fuzz.go` - Update to use ffuf backend
- `pkg/fuzzing/scanner.go` - Simplify to specialized attacks only
- `.gitmodules` - Add ffuf submodule
- `CLAUDE.md` - Document new architecture

**Deprecated Files** (Remove 80% of custom fuzzer):
- `pkg/fuzzing/fuzzer.go` - 1,094 lines → 200 lines (keep utility functions)
- `pkg/fuzzing/engines.go` - 476 lines → 0 (replaced by ffuf)
- `pkg/fuzzing/advanced.go` - 780 lines → 300 lines (keep specialized only)

**Net Code Reduction**: ~1,850 lines removed, ~500 lines added = **1,350 lines less to maintain**

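
The net figure follows directly from the per-file numbers listed under Deprecated Files. As a quick check, the arithmetic can be reproduced in a few lines of Python; the variable names are illustrative, and the 500 added lines are the roadmap's own estimate.

```python
# (current lines, planned lines) per deprecated file, taken from the list above.
PLANNED_SIZES = {
    "pkg/fuzzing/fuzzer.go": (1094, 200),
    "pkg/fuzzing/engines.go": (476, 0),
    "pkg/fuzzing/advanced.go": (780, 300),
}

# 894 + 476 + 480 lines removed across the three files.
removed = sum(before - after for before, after in PLANNED_SIZES.values())

added = 500  # roadmap estimate for new integration code
net_reduction = removed - added
```

Here `removed` evaluates to 1,850 and `net_reduction` to 1,350, matching the stated totals.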
### Alignment with Shells Philosophy

**"Built by ethical hackers for ethical hackers"**
- Use tools security researchers already trust (ffuf is industry-standard)

**"Evidence-based"**
- Battle-tested tool over prototype (8+ years vs 0 years)

**"Value for time, value for money"**
- 10x faster scans = more value per dollar spent on cloud infrastructure

**"Sustainable innovation"**
- Lower maintenance burden = more time for new features

**"Maintainable code"**
- 1,350 fewer lines to maintain, higher test coverage

---

## Executive Summary

-**Current State**: Two execution paths (legacy Execute() + new Pipeline), need to merge
-**Overall Grade**: B (Good architecture, duplicate execution logic)
-**Estimated Total Timeline**: Week 1 (Merger) + 6.5 weeks (P0+P1+P2) ≈ **8 weeks total**
-**Note**: Phase 4 reduced from 15 days → 11 days after removing Phase 3 overlaps
+**Current State**: ProjectDiscovery tools integrated (✅ COMPLETE), fuzzing replacement planned
+**Overall Grade**: A- (Excellent tooling, needs fuzzing upgrade)
+**Recent Completions**:
+- ✅ ProjectDiscovery integration (subfinder, httpx, dnsx, tlsx, katana) - 1 day
+- ⏳ Fuzzing replacement with ffuf - 2-3 weeks planned
+
+**Updated Timeline**:
+- Week 1 (Execution Merger) - ✅ COMPLETE
+- Week 2-3 (Fuzzing Replacement) - ⏳ PLANNED (16-18 hours)
+- Weeks 4-10 (P0+P1+P2 fixes) - Remaining work
+
+**Estimated Total Timeline**: ~10 weeks total (8 weeks original + 2-3 weeks fuzzing)

### Critical Discovery (2025-10-30)

UNIFIED_DATABASE_PLAN.md

Lines changed: 22 additions & 0 deletions
@@ -319,6 +319,28 @@ WHERE severity IN ('CRITICAL', 'HIGH', 'MEDIUM', 'LOW', 'INFO');
- Document severity normalization
- Update examples to use lowercase

**Status**: ✅ **COMPLETE** (2025-10-30)

**Changes Applied**:
- `workers/service/database.py` - Severity normalization implemented
- `workers/tests/test_database.py` - 4 new unit tests added
- `workers/migrate_severity_case.sql` - Migration script created
- `workers/README.md` - Documentation updated with normalization section

**Test Results**:
- ✅ test_save_finding_normalizes_severity_uppercase
- ✅ test_save_finding_normalizes_severity_lowercase
- ✅ test_save_finding_normalizes_severity_mixedcase
- ✅ test_save_findings_batch_normalizes_severity

**Verification**:
```bash
# Run tests
pytest workers/tests/test_database.py::TestDatabaseClient::test_save_finding_normalizes_severity_uppercase -v

# Result: PASSED ✅
```

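
For readers without access to `workers/service/database.py`, the behaviour the four tests exercise can be approximated with the sketch below. It is not the committed implementation: `normalize_severity` and `normalize_findings` are hypothetical names, and stripping surrounding whitespace is an added assumption. The only behaviour taken from the commit is that severity values are converted to lowercase before saving.

```python
def normalize_severity(severity: str) -> str:
    """Lowercase a severity value so it matches the lowercase form Go expects."""
    # Assumption: surrounding whitespace is also trimmed.
    return severity.strip().lower()


def normalize_findings(findings: list[dict]) -> list[dict]:
    """Return copies of the findings with normalized severity (batch case)."""
    return [
        {**finding, "severity": normalize_severity(finding["severity"])}
        for finding in findings
    ]
```

Uppercase, lowercase, and mixed-case inputs all map to the same lowercase value, which is the property the uppercase, lowercase, mixedcase, and batch tests listed above check.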
---

### Phase 2: Standardize Connection String Format (P1)
