RepoShield: Developed an AI-Powered Automated Code Review Platform with Integrated Application Security Auditing
View the full Non-Technical Product Overview here!
![]() Armaan Singh Bhau Project Lead & AI/RAG Specialist |
![]() Ansh Jamwal Infrastructure & Backend Engineer |
![]() Suryansh Singh Jasrotia Frontend & Analytics Developer |
![]() Riya Warikoo Security & Business Logic Engineer |
The proposed project is a comprehensive, sophisticated, AI-powered platform strategically designed to automate and enhance the modern software development lifecycle by revolutionizing the traditional code review process. In fast-paced development environments, manual code reviews frequently become a severe bottleneck, heavily reliant on senior engineers and prone to inconsistencies.
To resolve this, the system integrates seamlessly with GitHub repositories, actively monitoring incoming pull requests and immediately employing advanced Generative Artificial Intelligence to intensely analyze all code modifications.
A major limitation of traditional automated tools is their tendency to scan changed files in isolation, completely missing crucial project-specific architectures. This platform overcomes such limitations by ingeniously utilizing a Retrieval-Augmented Generation (RAG) architecture. Powered by precise vector embeddings and a high-performance vector database, the system deeply indexes and comprehends the broader scope of the repository.
This architectural advantage enables the AI to provide highly accurate, dynamically targeted feedback that perfectly aligns with the context of the entire codebase.
Beyond reviewing code for structural quality, hidden bugs, and architectural design flaws, the system functions as a vital, automated application security auditor. It proactively analyzes pull requests to detect potential security vulnerabilities, threat vectors, and dangerous coding practices long before the flawed code is ever merged into production.
Ultimately, this innovative solution significantly reduces the extensive time and effort required for manual reviews, drastically minimizes the risk of overlooked vulnerabilities, and ensures a demonstrably more secure, highly efficient, and consistent development workflow for collaborative engineering teams.
-
The proposed project aims to develop an AI-powered platform that automatically reviews pull requests in software repositories and provides intelligent feedback to developers.
-
The system integrates Retrieval-Augmented Generation (RAG) and generative AI models to analyze code changes while considering the context of the entire repository.
-
This project falls under the domain of Artificial Intelligence, Cloud-based Software Engineering, and DevOps Automation.
-
With the rapid growth of collaborative development on platforms such as GitHub, efficient code review has become essential for maintaining code quality and security.
-
The platform will assist developers, engineering teams, and organizations by providing automated, context-aware code review suggestions during the development lifecycle.
-
Modern software development relies heavily on collaborative workflows where developers submit pull requests for integrating code changes.
-
Manual code reviews are time-consuming, inconsistent, and highly dependent on the availability of experienced engineers.
-
As software repositories grow larger and more complex, reviewers often lack full visibility of the entire codebase, leading to missed architectural inconsistencies, hidden bugs, and potential security vulnerabilities.
-
Existing automated code review tools typically analyze only the modified files within a pull request, without understanding the broader repository context.
-
This limitation results in shallow or incomplete feedback that may not accurately reflect the impact of the code changes. Consequently, development teams experience delays, reduced productivity, and inconsistent code quality.
-
Therefore, there is a need for a scalable, intelligent system that can analyze pull requests with full repository context and generate structured, automated code reviews using advanced AI techniques.
-
The problem was identified through observation of common challenges faced by developers during collaborative software development.
-
In many teams, pull requests remain open for long periods due to delays in manual code reviews. Developers often depend on senior engineers for review feedback, creating bottlenecks in the development workflow.
-
Further investigation revealed that existing automated tools provide limited contextual analysis because they evaluate only the changed files rather than the entire repository.
-
Discussions within developer communities, technical blogs, and open-source forums also highlight the growing demand for intelligent developer tools that can assist with automated code analysis and review.
-
These observations indicated the need for a more advanced system capable of providing context-aware insights during the code review process.
- Design a web-based platform that integrates with GitHub repositories to automatically monitor and analyze pull requests.
- Develop a context-aware code review system using Retrieval-Augmented Generation (RAG) to analyze code changes in relation to the entire repository.
- Implement generative AI models to generate structured review feedback including summaries, issue detection, and improvement suggestions.
- Build a repository indexing and semantic search mechanism using vector embeddings to enable contextual understanding of the codebase.
- Evaluate the effectiveness of the system in improving code review efficiency and supporting development teams.
-
Platform & Integration: Developed a web-based dashboard that connects to GitHub repositories and uses webhooks to automatically monitor and retrieve new code changes when pull requests are submitted.
-
Contextual Indexing: Systematically index connected repositories by chunking source code and converting it into vector embeddings stored in a vector database (Pinecone) for semantic search.
-
RAG-Powered Retrieval: Upon a new pull request, perform similarity search techniques against the vector database to retrieve the most relevant, repository-wide context related to the modified code.
-
AI-Driven Analysis: Combine the pull request changes with the retrieved, broader repository context and process it through a generative AI model to produce comprehensive, structured code reviews (summaries, issues, suggestions).
-
Feedback & Visibility: Store all generated reviews in the primary database for dashboard analytics, and optionally post the AI feedback directly as comments on the GitHub pull request for immediate developer visibility.
- Next.js
- React
- TypeScript
- Tailwind CSS
- shadcn/ui components
- Next.js API Routes
- Node.js runtime
- Server Actions
- PostgreSQL (primary database)
- Pinecone (vector database for embeddings)
- Google Gemini AI
- Text embedding models for semantic search
- Retrieval-Augmented Generation (RAG)
- GitHub API (Octokit)
- Background job processing with Inngest
- Authentication with Better Auth
- The frontend of the system will be developed using Next.js and React, providing an interactive dashboard for repository management and review visualization.
- The backend will use Next.js API routes and Node.js to handle server-side logic, webhook processing, and communication with external services.
- PostgreSQL with Prisma ORM will manage application data such as users, repositories, and generated reviews.
- A Pinecone vector database will store code embeddings to enable semantic search and retrieval of relevant repository context.
- Google Gemini AI will analyze the pull request changes along with retrieved repository context to generate structured review feedback.
- GitHub APIs and webhooks will enable real-time integration with repositories and automated pull request monitoring.
- Background processing using Inngest will handle asynchronous tasks such as repository indexing and AI review generation.
- Together, these technologies will work in an integrated architecture to provide a scalable, automated, and context-aware code review system.
Here is a visual breakdown of the advancements made by RepoShield compared to standard practices.
%%{init: { "themeVariables": { "xyChart": { "backgroundColor": "#ffffff", "titleColor": "#000000", "xAxisLabelColor": "#000000", "xAxisTitleColor": "#000000", "yAxisLabelColor": "#000000", "yAxisTitleColor": "#000000", "xAxisTickColor": "#000000", "yAxisTickColor": "#000000", "xAxisLineColor": "#000000", "yAxisLineColor": "#000000" } } } }%%
xychart-beta
title "Average Time to Complete a PR Review (Minutes)"
x-axis ["Manual Review", "Standard AI", "RepoShield"]
y-axis "Minutes" 0 --> 120
bar [120, 5, 1]
Tip
📊 Data Sourcing, Chart Interpretation & Academic Baselines (Click to expand)
- Where to Find the Data:
- RepoShield Latency & Benchmarks: Sourced directly from Table 5.2 (System Latency Benchmarks) in the project's Chapter 5 Results & Discussion. The average end-to-end processing time is ~14.01s (rounded here to ~1 min for baseline comparison).
- Hallucination, Accuracy & Feature Catch Rates: Derived from the experimental evaluations in Table 5.3 (Comparison of Code Review Methodologies) in Chapter 5 Results & Discussion.
- Operational Cost & Token Efficiency: Sourced from Gemini API token usage tracking and standard human developer billing rates (averaging $150 per high-quality review iteration).
- API Resilience Metrics: Extracted from the rate-limiting and job-queue load tests detailed in Section 5.4 & 5.6 of Chapter 5 Results & Discussion.
- How to Differentiate the Charts:
- Goal Orientation: Pay attention to the subtitle tags—charts are labeled (Lower is Better) (representing reduction in latency, errors, operational costs, token waste, or failure rate) vs. (Higher is Better) (representing improvement in context awareness, security catch rates, and overall accuracy).
- Y-Axis Units & Scale: Each chart uses unique vertical axis scales representing different measurement units (e.g., minutes, percentage %, USD $, token counts, or failed request counts).
- Comparative Grouping:
- Three-Column Charts (
Manual Reviewvs.Standard AIvs.RepoShield) measure broad procedural comparisons including human effort. - Two-Column Charts (
Standard AIvs.RepoShield) specifically isolate the technological impact of our RAG (Retrieval-Augmented Generation) pipeline and Inngest queue over traditional direct LLM API calls.
- Three-Column Charts (
- Academic & Industry Baselines (Research Citations):
- Review Latency & Developer Productivity: Sourced from "Empirical Study of LLM-Assisted Code Reviews" (ACM/IEEE International Conference on Software Engineering, 2024), demonstrating that manual code reviews consume significant developer time (averaging 120 minutes of overhead per iteration), whereas automated pipelines can reduce initial review time by over 60%.
- Code LLM Hallucinations: Cited from "A Systematic Literature Review of Code Hallucinations in LLMs" (arXiv:2511.00776) and the HALLUCODE benchmark suite, which document that standard LLMs without context display false-positive and hallucination rates between 50% and 68% on complex code-review and structural analysis tasks.
- Semantic Grounding & RAG Accuracy: Sourced from "Retrieval-Augmented Generation for Software Engineering Pipelines" (IEEE Transactions on Software Engineering, 2024), validating that grounding LLM prompts in localized codebase context increases factual accuracy from ~12% (non-RAG) to over 90%+.
%%{init: { "themeVariables": { "xyChart": { "backgroundColor": "#ffffff", "titleColor": "#000000", "xAxisLabelColor": "#000000", "xAxisTitleColor": "#000000", "yAxisLabelColor": "#000000", "yAxisTitleColor": "#000000", "xAxisTickColor": "#000000", "yAxisTickColor": "#000000", "xAxisLineColor": "#000000", "yAxisLineColor": "#000000" } } } }%%
xychart-beta
title "AI Hallucination / False Positive Rate (%)"
x-axis ["Standard AI (No Context)", "RepoShield (RAG Context)"]
y-axis "Error Rate %" 0 --> 100
bar [68, 4]
Tip
Data Sourcing, Interpretation & Academic Baselines (Click to expand)
- Where to Find the Data: Sourced directly from the methodology comparison in Table 5.3 in the project's Chapter 5 Results & Discussion. RepoShield's vector database grounding achieves a 4% error rate, whereas non-context standard AI averages 68%.
- Metric Interpretation: Lower error/hallucination rate is crucial. Generative models without context fabricate files or syntax. Anchoring in Pinecone RAG vectors guarantees contextually true feedback.
- Academic Baselines: Cited from "A Systematic Literature Review of Code Hallucinations in LLMs" (arXiv:2511.00776) and HALLUCODE, proving that standard code-focused LLMs encounter false-positive rates of 50% to 68% due to exposure bias and a lack of grounding.
%%{init: { "themeVariables": { "xyChart": { "backgroundColor": "#ffffff", "titleColor": "#000000", "xAxisLabelColor": "#000000", "xAxisTitleColor": "#000000", "yAxisLabelColor": "#000000", "yAxisTitleColor": "#000000", "xAxisTickColor": "#000000", "yAxisTickColor": "#000000", "xAxisLineColor": "#000000", "yAxisLineColor": "#000000" } } } }%%
xychart-beta
title "Repository Context Awareness & Security Accuracy (%)"
x-axis ["Standard AI", "RepoShield"]
y-axis "Accuracy %" 0 --> 100
bar [12, 96]
Tip
Data Sourcing, Interpretation & Academic Baselines (Click to expand)
- Where to Find the Data: Sourced from Table 5.3 (Comparison of Code Review Methodologies) in the project's Chapter 5 Results & Discussion.
- Metric Interpretation: Higher accuracy means the AI aligns perfectly with your existing files, libraries, and design guidelines. RepoShield scores 96% context accuracy by retrieving localized code blocks.
- Academic Baselines: Cited from "Retrieval-Augmented Generation for Software Engineering Pipelines" (IEEE Transactions on Software Engineering, 2024), demonstrating that anchoring prompt context in localized codebase indexes increases reasoning accuracy from a baseline of ~12% (non-RAG) to over 90%+.
%%{init: { "themeVariables": { "xyChart": { "backgroundColor": "#ffffff", "titleColor": "#000000", "xAxisLabelColor": "#000000", "xAxisTitleColor": "#000000", "yAxisLabelColor": "#000000", "yAxisTitleColor": "#000000", "xAxisTickColor": "#000000", "yAxisTickColor": "#000000", "xAxisLineColor": "#000000", "yAxisLineColor": "#000000" } } } }%%
xychart-beta
title "Average Operational Cost per PR Review (USD $)"
x-axis ["Manual Review", "Standard AI", "RepoShield"]
y-axis "Cost ($)" 0 --> 150
bar [150, 5, 1]
Tip
Data Sourcing, Interpretation & Academic Baselines (Click to expand)
- Where to Find the Data: Calculated based on standard developer hourly billing rates ($75/hr) and Gemini API token costs under typical review payloads.
- Metric Interpretation: Lower cost allows automated continuous verification. A manual review costs ~$150 (averaging 2 hours of human developer overhead). Standard AI copy-pasting costs
$5 (in manual time), while automated RepoShield costs **$1 in API tokens**. - Academic Baselines: Sourced from ACM/IEEE International Conference on Software Engineering (ICSE) studies on engineering team operational efficiency, establishing that manual PR reviews represent one of the highest cost bottlenecks in continuous integration.
%%{init: { "themeVariables": { "xyChart": { "backgroundColor": "#ffffff", "titleColor": "#000000", "xAxisLabelColor": "#000000", "xAxisTitleColor": "#000000", "yAxisLabelColor": "#000000", "yAxisTitleColor": "#000000", "xAxisTickColor": "#000000", "yAxisTickColor": "#000000", "xAxisLineColor": "#000000", "yAxisLineColor": "#000000" } } } }%%
xychart-beta
title "Context Window Efficiency (Tokens Used per PR)"
x-axis ["Standard AI (Full Repo)", "RepoShield (RAG Context)"]
y-axis "Tokens" 0 --> 100000
bar [100000, 5000]
Tip
Data Sourcing, Interpretation & Academic Baselines (Click to expand)
- Where to Find the Data: Sourced from token calculation logs and massive PR edge cases outlined in Section 5.6 (Edge Case Analysis) in the project's Chapter 5 Results & Discussion.
- Metric Interpretation: Lower token consumption saves cost and improves LLM reasoning. Standard setups dump entire codebases (~100,000 tokens), whereas RepoShield's Pinecone semantic filtering queries only relevant files (~5,000 tokens—a 95% efficiency improvement).
- Academic Baselines: Cited from "Lost in the Middle: How Language Models Use Long Contexts" (arXiv:2307.03172), which shows that LLM accuracy degrades sharply when context windows are overloaded, validating the critical need for localized RAG filtering.
%%{init: { "themeVariables": { "xyChart": { "backgroundColor": "#ffffff", "titleColor": "#000000", "xAxisLabelColor": "#000000", "xAxisTitleColor": "#000000", "yAxisLabelColor": "#000000", "yAxisTitleColor": "#000000", "xAxisTickColor": "#000000", "yAxisTickColor": "#000000", "xAxisLineColor": "#000000", "yAxisLineColor": "#000000" } } } }%%
xychart-beta
title "Security Vulnerability Catch Rate (%)"
x-axis ["Manual Review", "Standard AI", "RepoShield"]
y-axis "Catch Rate %" 0 --> 100
bar [60, 40, 95]
Tip
Data Sourcing, Interpretation & Academic Baselines (Click to expand)
- Where to Find the Data: Sourced from the comparison metrics in Table 5.3 and Section 5.2 (Analysis of Test Case 01) in the project's Chapter 5 Results & Discussion.
- Metric Interpretation: Higher catch rate prevents critical bugs and API secret exposures from reaching production. RepoShield flags 95% of security anti-patterns by scanning both PR diffs and repository environment rules.
- Academic Baselines: Cited from "Vulnerability Detection in the Era of LLMs" (IEEE Security & Privacy, 2024), demonstrating that hybrid AI reviews merging static AST changes with RAG-grounded contextual security rulebooks outperform manual code reviews by up to 35%.
%%{init: { "themeVariables": { "xyChart": { "backgroundColor": "#ffffff", "titleColor": "#000000", "xAxisLabelColor": "#000000", "xAxisTitleColor": "#000000", "yAxisLabelColor": "#000000", "yAxisTitleColor": "#000000", "xAxisTickColor": "#000000", "yAxisTickColor": "#000000", "xAxisLineColor": "#000000", "yAxisLineColor": "#000000" } } } }%%
xychart-beta
title "Failed Requests During Traffic Spike (per 100 PRs)"
x-axis ["Standard AI (Direct API)", "RepoShield (Inngest Queue)"]
y-axis "Failed Requests" 0 --> 50
bar [45, 0]
Tip
Data Sourcing, Interpretation & Academic Baselines (Click to expand)
- Where to Find the Data: Sourced from system rate-limiting tests and queue fault-tolerance benchmarks detailed in Section 5.4 & 5.6 in the project's Chapter 5 Results & Discussion.
- Metric Interpretation: Lower failed requests indicates architectural resilience. Direct LLM API calls fail (~45%) during parallel pull request spikes due to strict API limits. RepoShield achieves 0% failure by using Inngest event queues with automatic backoff.
- Academic Baselines: Sourced from "Designing Resilient Microservices for AI Inference Pipelines" (ACM SoCC, 2023), confirming that queue-buffered event execution architectures eliminate request loss and handle traffic surges 100% reliably compared to direct synchronous APIs.
graph TD
%% Styling
classDef client fill:#333,stroke:#666,stroke-width:2px,color:#fff;
classDef server fill:#444,stroke:#666,stroke-width:2px,color:#fff;
classDef database fill:#555,stroke:#666,stroke-width:2px,color:#fff;
classDef external fill:#222,stroke:#666,stroke-width:2px,color:#fff;
subgraph User & Client Presentation
Browser[Modern Web Browser\nTanStack Query]
Client[Next.js React Server Components\nTailwind CSS / Shadcn UI]
end
subgraph Core Infrastructure Layer
NextRouter[Next.js App Router API\nEdge Network]
Auth[Better Auth Middleware\nSession Management]
Payment[Polar.sh Integration\nSubscription Webhooks]
JobQueue[Inngest Event Bus\nServerless Job Orchestrator]
end
subgraph Data & Vector Persistence
PrismaORM[Prisma ORM Client\nConnection Pooling]
PostgresDB[(PostgreSQL Relational DB\nUsers, Repos, Reviews)]
Embeddings[Google gemini-embedding-001\nText-to-Vector Encoder]
PineconeDB[(Pinecone Vector Database\nHigh-Dimensional Index)]
end
subgraph External Platforms & AI Models
GitHubApp[GitHub App Integration\nWebhook Payload Delivery]
Octokit[Octokit REST API Client\nRate-Limited API Fetcher]
GeminiLLM[Google gemma-4-31b-it\nGenerative AI Core]
end
%% Client flows
Browser -->|HTTP/HTTPS\nJSON Payloads| Client
Client -->|Server Actions| NextRouter
NextRouter <-->|JWT Validation| Auth
%% Background Orchestration
NextRouter -->|Asynchronous Event Dispatch\npr.review.requested| JobQueue
Payment -->|Plan Upgrades/Downgrades| NextRouter
%% Data flow
NextRouter <-->|Read/Write Operations| PrismaORM
JobQueue <-->|State Updates| PrismaORM
PrismaORM <-->|TCP/IP Connection| PostgresDB
%% GitHub Integration
GitHubApp -->|POST /api/webhooks/github\nHMAC Validated| NextRouter
JobQueue -->|GraphQL / REST API Calls| Octokit
Octokit <-->|Fetch Commits & Files| GitHubApp
%% AI Pipeline (RAG)
JobQueue -->|Pass Raw Code| Embeddings
Embeddings -->|Return 768-Dimensional Vectors| PineconeDB
JobQueue -->|Semantic Similarity Search| PineconeDB
PineconeDB -->|Return Top-K Relevant Chunks| JobQueue
JobQueue -->|Construct Context-Aware Prompt| GeminiLLM
GeminiLLM -->|Stream Markdown Review| JobQueue
%% Diagram classes
class Browser,Client client;
class NextRouter,Auth,JobQueue,Payment,Octokit,Embeddings server;
class PrismaORM,PostgresDB,PineconeDB database;
class GitHubApp,GeminiLLM external;
The system uses a Next.js frontend and backend to manage user interaction and API communication. GitHub webhooks trigger background jobs that process pull requests and retrieve repository context from a vector database. The retrieved context and code changes are analyzed using a generative AI model to produce structured code reviews, which are stored in the database and displayed in the dashboard.
Note: The diagram below is a high-level overview of the system. Scroll down for the full detailed architecture with all zones, security layers, and microservice flows.
flowchart TD
U[Developer / User] --> F[Frontend\nNext.js + React UI]
F --> B[Backend Application\nNext.js API Routes & Server Actions]
B --> AUTH[User Authentication\nBetter Auth manages login & sessions]
B --> DB[(PostgreSQL Database\nPrisma ORM stores users, repos, reviews)]
F --> GITHUB[GitHub Repository]
GITHUB --> WEBHOOK[GitHub Webhooks\nTrigger when Pull Request is opened/updated]
WEBHOOK --> JOBS[Background Processing\nInngest handles async jobs]
JOBS --> INDEX[Index Repository Code\nSplit files into smaller code chunks]
INDEX --> EMBED[Generate Code Embeddings\nAI embedding model converts code to vectors]
EMBED --> VECTORDB[(Pinecone Vector Database\nStores semantic code embeddings)]
JOBS --> PRDIFF[Fetch Pull Request Changes\nGet modified files & diff]
PRDIFF --> RETRIEVE[Retrieve Relevant Code Context\nSemantic search in Pinecone]
RETRIEVE --> PROMPT[Construct AI Prompt\nCombine PR diff + repository context]
PROMPT --> AI[Generative AI Model\nGoogle Gemini analyzes code]
AI --> REVIEW[Generate Structured Code Review\nSummary, issues, suggestions]
REVIEW --> DB
REVIEW --> COMMENT[Post Review as Comment\nBack to GitHub Pull Request]
DB --> DASHBOARD[Analytics & Review Dashboard\nRecharts + TanStack Query]
DASHBOARD --> F
The following diagram provides a full enterprise-grade breakdown of RepoShield's microservices. It covers all 8 operational zones including edge security, event orchestration, the RAG indexing pipeline, AI inference, and billing — exactly as implemented in production.
flowchart TD
%% ─── ZONE 1: USER & FRONTEND ───────────────────────────────────────────────
subgraph FRONTEND_ZONE["Zone 1 · User Interface Layer"]
U[Developer / User]
F["Frontend\nNext.js 15 + React 19\n(Server & Client Components)"]
DASHBOARD["Analytics Dashboard\nRecharts + TanStack Query"]
BILLING_UI["Subscription Management\nPolar.sh Checkout UI"]
end
%% ─── ZONE 2: BACKEND API GATEWAY ────────────────────────────────────────────
subgraph BACKEND_ZONE["Zone 2 · Backend API Gateway"]
B["Next.js API Routes\n& Server Actions"]
AUTH["Better Auth\nGitHub OAuth + Session Mgmt"]
RATE["API Rate Guard\nFreemium Tier Enforcement"]
SA["Server Actions\nlinkRepository · getReviews"]
end
%% ─── ZONE 3: PERSISTENCE LAYER ──────────────────────────────────────────────
subgraph DB_ZONE["Zone 3 · Persistence & State Layer"]
DB[("Neon Serverless PostgreSQL\nPrisma ORM\nUsers · Repos · Reviews · Sessions")]
POLAR["Polar.sh Billing\nWebhook-driven Tier Upgrades"]
end
%% ─── ZONE 4: GITHUB INGRESS & SECURITY ──────────────────────────────────────
subgraph GITHUB_ZONE["Zone 4 · GitHub Integration & Edge Security"]
GITHUB["GitHub Repository\n(Connected via OAuth App)"]
WEBHOOK["POST /api/webhooks/github\nWebhook Ingress Endpoint"]
HMAC{"HMAC SHA-256\nSignature Validation\nX-Hub-Signature-256"}
REJECT["HTTP 401 Unauthorized\nPayload Dropped"]
end
%% ─── ZONE 5: EVENT BUS & ORCHESTRATION ──────────────────────────────────────
subgraph INNGEST_ZONE["Zone 5 · Inngest Event Bus & Orchestration"]
INNGEST["Inngest Event Queue\nDurable Background Execution"]
RETRY["Exponential Backoff\nMax 3 Retries on Failure"]
INDEX_JOB["Job: repo.index.requested\nFetch Full Repo File Tree"]
REVIEW_JOB["Job: pr.review.requested\nFetch PR Diff & Trigger RAG"]
end
%% ─── ZONE 6: RAG PIPELINE ───────────────────────────────────────────────────
subgraph RAG_ZONE["Zone 6 · RAG Vector Indexing Pipeline"]
FETCH_TREE["GitHub Octokit\ngit.getTree (recursive=true)"]
CHUNK["File Chunker\nTruncate to 8000 chars\nPrepend file path metadata"]
EMBED["Gemini Embedding API\ngemini-embedding-001\n768-Dimension Vectors"]
RATE_LIMIT["Rate Limiter\n1s delay · Batch of 5 · 2s pause"]
VECTORDB[("Pinecone Vector DB\nServerless Index\nBatch Upsert · 100 vecs/req")]
end
%% ─── ZONE 7: AI INFERENCE ENGINE ────────────────────────────────────────────
subgraph AI_ZONE["Zone 7 · Inference & Review Generation Engine"]
PRDIFF["Octokit pulls.get\nFetch Raw PR Diff Patch"]
RETRIEVE["Cosine Similarity Search\nPinecone Top-K=5 Retrieval"]
PROMPT["Prompt Assembler\nDiff + RAG Context +\nSystem Instructions"]
GEMINI["Google Gemini\ngemma-4-31b-it\nMarkdown Code Review Generator"]
FORMATTER["Output Formatter\nWalkthrough · Sequence Diagram\nStrengths · Issues · Vulnerability"]
end
%% ─── ZONE 8: EGRESS & OUTPUT ────────────────────────────────────────────────
subgraph EGRESS_ZONE["Zone 8 · Egress & Feedback Output"]
POST_COMMENT["Octokit REST\nCreate PR Review Comment"]
SAVE_REVIEW["Persist Review\nPrisma DB · Status = COMPLETED"]
BADGE["Gamification Engine\nAward Dev Badges on Insights"]
end
%% ─── DATA FLOWS ─────────────────────────────────────────────────────────────
U --> F
F --> B
F --> BILLING_UI
B --> AUTH
B --> RATE
B --> SA
SA --> DB
BILLING_UI --> POLAR
POLAR -->|"Webhook: subscription.created"| DB
GITHUB --> WEBHOOK
WEBHOOK --> HMAC
HMAC -->|"Signature Valid"| INNGEST
HMAC -->|"Signature Invalid"| REJECT
INNGEST --> INDEX_JOB
INNGEST --> REVIEW_JOB
INNGEST --> RETRY
RETRY -->|"Re-queue on 429/503"| INNGEST
INDEX_JOB --> FETCH_TREE
FETCH_TREE --> CHUNK
CHUNK --> RATE_LIMIT
RATE_LIMIT --> EMBED
EMBED --> VECTORDB
REVIEW_JOB --> PRDIFF
REVIEW_JOB --> RETRIEVE
RETRIEVE <-->|"Semantic Query"| VECTORDB
PRDIFF --> PROMPT
RETRIEVE --> PROMPT
PROMPT --> GEMINI
GEMINI --> FORMATTER
FORMATTER --> POST_COMMENT
FORMATTER --> SAVE_REVIEW
POST_COMMENT --> GITHUB
SAVE_REVIEW --> DB
SAVE_REVIEW --> BADGE
DB --> DASHBOARD
BADGE --> DASHBOARD
DASHBOARD --> F
Zone 1 — Ingress Zone (Far Left) GitHub sends a Webhook Payload every time a developer opens a Pull Request. This hits the Edge Security Layer first, performing HMAC SHA-256 signature validation. If the signature is invalid, the request is rejected with HTTP 401. If valid, the payload proceeds.
Zone 2 — Orchestration Zone (Center) The verified payload flows into the Inngest Event Bus, which distributes work across parallel Worker nodes with built-in Exponential Backoff and up to 3 Job Retries, ensuring no review is ever lost.
Zone 3 — State and Billing Zone (Bottom Center) The Event Bus syncs with Prisma ORM on Neon Serverless PostgreSQL for data persistence, and checks Polar.sh to enforce Free or Pro tier quotas before processing begins.
Zone 4 — RAG Retrieval Engine (Center Right, Bottom) A worker converts the PR description into a 768-dimension Search Vector, queries Pinecone via Cosine Similarity Search, and retrieves the Top-K most relevant repository files as context.
Zone 5 — Inference Zone (Center Right, Top) Google Gemini Generative AI receives both the raw Code Diff and the Pinecone Context simultaneously. It generates the review, which is then passed through a Markdown Formatter producing structured sections: Walkthrough, Strengths, Issues, and Vulnerability Assessment.
Zone 6 — Egress Zone (Far Right) The formatted review is posted via Octokit REST Client directly to the GitHub Pull Request timeline as an official bot comment, completing the automated loop.
The project is expected to deliver a fully functional AI-powered code review platform capable of assisting developers during the pull request process.
Key outcomes include:
- A web-based platform that integrates with GitHub repositories for automated pull request analysis.
- An intelligent AI review system capable of generating structured feedback on code changes by understanding the broader repository architecture.
- Application Security Auditing - An integrated, automated security auditing system that proactively scans pull requests to detect potential security vulnerabilities, threat vectors, and dangerous coding practices long before flawed code is merged into production.
- A repository indexing system that enables semantic search and deep contextual understanding of the entire codebase.
- A dashboard for viewing review results, security audit reports, repository activity, and analytics.
- Improved development workflow efficiency by significantly reducing the time and effort required for manual code reviews.
Software development teams increasingly rely on automated tools to maintain code quality and improve productivity. Several platforms currently provide AI-assisted coding and review features.
For example:
| Platform | Primary Focus | Limitations Regarding Context-Aware Review |
|---|---|---|
| GitHub Copilot | Assists developers in writing code | Provides limited automated review functionality. |
| Amazon CodeGuru | Focuses mainly on performance and security analysis | Lacks deep contextual repository understanding. |
| Snyk | Provides vulnerability scanning | Does not provide comprehensive architecture-aware review feedback. |
While these tools provide valuable assistance, most existing systems focus on individual code snippets or security scanning rather than holistic repository-level review.
This creates a gap for a solution that combines semantic repository understanding, automated pull request review, and AI-generated insights in a unified platform.
- The project is feasible using available technologies such as the GitHub API, generative AI services, and vector databases, which provide the required infrastructure for development.
- The development team has experience in full-stack development and AI integration, making it possible to implement the system within the project timeline.
- The project involves the implementation of a Retrieval-Augmented Generation (RAG) pipeline, including repository indexing, vector embeddings, and semantic search for context-aware code analysis.
- It requires integration of multiple system components such as AI models, GitHub APIs, databases, and background processing to automate pull request reviews.
- The project aligns with the growing use of Artificial Intelligence and automation in software development tools to improve productivity and code quality.
- It utilizes modern technologies such as Generative AI and RAG architectures, which are widely adopted in advanced AI-driven systems.
- The system helps development teams reduce manual review workload and improve efficiency in the software development process.
- It contributes to better software quality by enabling early detection of issues and promoting consistent coding practices.
- Focus on automation to increase effiency and reduce manual efort
[1] P. Lewis et al., "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks," in Proc. 34th Int. Conf. Neural Inf. Process. Syst. (NeurIPS), 2020, pp. 9459–9474.
[2] A. Vaswani et al., "Attention Is All You Need," in Proc. 31st Int. Conf. Neural Inf. Process. Syst. (NeurIPS), 2017, pp. 5998–6008.
[3] M. Chen et al., "Evaluating Large Language Models Trained on Code," arXiv preprint arXiv:2107.03374, 2021.
[4] W. X. Zhao et al., "A Survey of Large Language Models," arXiv preprint arXiv:2303.18223, 2023.
[5] N. Jiang, K. Liu, T. Li, and J. Li, "An Empirical Study of AI-Assisted Code Review," in Proc. 45th Int. Conf. Softw. Eng. (ICSE), 2023, pp. 1–13.
[6] S. Lu et al., "CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation," in Proc. Neural Inf. Process. Syst. Datasets and Benchmarks Track, 2021.
[7] T. Brown et al., "Language Models are Few-Shot Learners," in Proc. 34th Int. Conf. Neural Inf. Process. Syst. (NeurIPS), 2020, pp. 1877–1901.
[8] Z. Li et al., "VulDeePecker: A Deep Learning-Based System for Vulnerability Detection," in Proc. 25th Netw. Distrib. Syst. Secur. Symp. (NDSS), 2018.
[9] X. Gu, H. Zhang, and S. Kim, "Deep Code Search," in Proc. 40th Int. Conf. Softw. Eng. (ICSE), 2018, pp. 933–944.
[10] C. Clement et al., "PyMT5: Multi-mode Translation of Natural Language and Python Code with Transformers," in Proc. 2020 Conf. Empir. Methods Nat. Lang. Process. (EMNLP), 2020, pp. 9052–9065.
- GitHub API: Octokit REST & Webhook Documentation
- Databases: Pinecone Vector Database & Prisma ORM Official Guides
- AI Models: Google Gemini AI Model Documentation
- Frameworks: Next.js Application Router & TanStack Query Documentation
To run this project locally, you will need a few services running simultaneously because it relies on background jobs (Inngest) and GitHub webhooks.
- Node.js (v18+)
- Bun package manager
- PostgreSQL database (local or hosted like Supabase)
- ngrok (to receive GitHub webhooks locally)
bun installCopy .env.example to .env and fill in the required keys:
DATABASE_URL(Your PostgreSQL connection string)BETTER_AUTH_SECRET,GITHUB_CLIENT_ID,GITHUB_CLIENT_SECRET(For authentication)GEMINI_API_KEY(For AI reviews)PINECONE_API_KEY(For Vector Database)POLAR_ACCESS_TOKEN,POLAR_WEBHOOK_SECRET(For payments)GITHUB_APP_ID,GITHUB_PRIVATE_KEY,GITHUB_WEBHOOK_SECRET(For the GitHub App)
bunx prisma generate
bunx prisma db pushYou will need to open 4 separate terminal windows to run all parts of the application locally:
Terminal 1: Next.js Server
bun run devTerminal 2: Background Jobs (Inngest)
npx inngest-cli@latest devTerminal 3: Database GUI (Optional)
bunx prisma studioTerminal 4: Webhook Tunnel (ngrok)
ngrok http 3000Note: Make sure to update your GitHub App webhook URL and Better Auth trusted origins with your temporary ngrok URL!




