GSoC 2026: Add sample dataset for noise filtering (Module B) #840

Open
saikrishna-b-dev wants to merge 2 commits into OWASP:main from saikrishna-b-dev:gsoc-noise-filter
@saikrishna-b-dev

This PR adds a sample dataset for OpenCRE Module B (Noise/Relevance Filter).

It includes examples of:

  • Security-relevant commits
  • Noise commits

This helps demonstrate filtering logic and dataset structure.
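
As a rough illustration of the dataset structure, here is a minimal sketch. The field names (`message`, `label`), the two commit messages, and the JSON Lines layout are assumptions made for illustration; they are not copied from the files in this PR.

```python
import json

# Hypothetical records: field names and commit messages are illustrative only
# and are not taken from the dataset added in this PR.
SAMPLE_RECORDS = [
    {"message": "Fix SQL injection in login query", "label": "security"},
    {"message": "Update README badges", "label": "noise"},
]


def to_jsonl(records):
    """Serialize records as JSON Lines, one commit per line."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records)


def split_by_label(records):
    """Group commit messages by their label."""
    groups = {}
    for r in records:
        groups.setdefault(r["label"], []).append(r["message"])
    return groups


if __name__ == "__main__":
    print(to_jsonl(SAMPLE_RECORDS))
```

Keeping one record per line would make it straightforward to append examples and review them in a diff, though the actual files may use a different layout.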

Looking forward to feedback.

@saikrishna-b-dev
Author

Updated implementation to include:

  • Scoring-based classification
  • Hybrid LLM validation
  • Structured dataset with reasoning
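
The three points above could fit together roughly as in the sketch below. It is not the code in this PR: the keyword weights, the thresholds, the `Classification` type, and the `validator` callable (a stand-in for an LLM call) are all hypothetical and chosen only to show the shape of the approach.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical keyword weights and thresholds; not taken from the PR.
SECURITY_WEIGHTS = {"injection": 3, "xss": 3, "csrf": 3,
                    "vulnerability": 3, "auth": 2, "sanitize": 2}
NOISE_WEIGHTS = {"readme": 2, "typo": 2, "formatting": 2, "lint": 1}
SECURITY_THRESHOLD = 2   # score at or above this: security-relevant
NOISE_THRESHOLD = -1     # score at or below this: noise


@dataclass
class Classification:
    label: str       # "security", "noise", or "uncertain"
    score: int
    reasoning: str   # short explanation stored alongside the label


def score_commit(message: str) -> Tuple[int, List[str]]:
    """Add security keyword weights and subtract noise keyword weights."""
    score, hits = 0, []
    for token in re.findall(r"[a-z]+", message.lower()):
        if token in SECURITY_WEIGHTS:
            score += SECURITY_WEIGHTS[token]
            hits.append(token)
        elif token in NOISE_WEIGHTS:
            score -= NOISE_WEIGHTS[token]
            hits.append(token)
    return score, hits


def classify(message: str,
             validator: Optional[Callable[[str], str]] = None) -> Classification:
    """Use the score when it is decisive; otherwise defer to the validator."""
    score, hits = score_commit(message)
    if score >= SECURITY_THRESHOLD:
        return Classification("security", score, f"matched keywords: {hits}")
    if score <= NOISE_THRESHOLD:
        return Classification("noise", score, f"matched keywords: {hits}")
    if validator is not None:
        # Hybrid step: only ambiguous commits reach the (LLM) validator.
        return Classification(validator(message), score,
                              "ambiguous score; deferred to validator")
    return Classification("uncertain", score,
                          "ambiguous score; no validator supplied")
```

Passing the validator in as a plain callable keeps the scoring path testable without network access, and means only commits with an ambiguous score would incur an LLM call.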

Looking forward to feedback.
