UpCode is a context-optimized, AI-powered tool designed to facilitate the transition from legacy codebases to modern languages. Leveraging Streamlit for its user interface and the Groq API for rapid LLM inference, the application processes full Java or COBOL projects, extracts architectural relationships, and applies a feedback-driven modernization loop to generate, validate, and refine the translated code.
Currently supports:
- Java to Python 3
- COBOL to Go
This tool moves beyond simple file-by-file text replacement by understanding your project's architecture, performing topological sorts to translate dependencies first, and injecting context into LLM prompts.
- Repository Ingestion: Upload a `.zip` archive or paste a GitHub URL containing your legacy source code.
- Architectural Analysis: The engine parses `.java`, `.cbl`, or `.cpy` files to map out deep relationships, including inheritance, interfaces, imports, COPY books, and CALLs.
- Topological Sort: Computes a Directed Acyclic Graph (DAG) to determine the exact order in which to translate the files. Leaf nodes (dependencies) are translated first.
- Context-Aware Translation: When translating a core file, the engine injects the already-translated code of its dependencies directly into the LLM prompt, avoiding context dilution and hallucinations. Powered by LangChain and the Groq API (`meta-llama/llama-4-scout-17b-16e-instruct`).
- Self-Healing Validation: Generated code is automatically run through `ast` (Python) or `gofmt` (Go) checkers. If the LLM introduces a syntax error, the engine catches it, builds a refinement prompt with the error trace, and forces the LLM to fix it.
- Instant Packaging: The translated results are instantly compiled back into a `.zip` archive matching the original project structure.
- Modern Web UI: A seamless multipage Streamlit interface using `st.navigation` for an intuitive user experience.
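The dependency-first ordering described above can be sketched with Python's standard-library `graphlib` (a simplified illustration; the file names and graph below are hypothetical, not taken from the codebase):

```python
from graphlib import TopologicalSorter

# Hypothetical dependency graph: each file maps to the set of files it
# depends on (via imports, inheritance, COPY books, CALLs, ...).
graph = {
    "Main.java": {"Service.java", "Utils.java"},
    "Service.java": {"Utils.java"},
    "Utils.java": set(),
}

# static_order() yields dependencies before their dependents, so leaf
# nodes are translated first and their translated code can be injected
# into the prompts for the files that depend on them.
order = list(TopologicalSorter(graph).static_order())
print(order)  # Utils.java first, Main.java last
```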
The engine's backend is modularly structured under the `core/` directory:

- Ingestion (`core/ingestion/`): Fetches the code from ZIP or GitHub, identifies file types, and builds the initial dependency graphs.
- Translation (`core/translation/`): Manages the topological ordering, constructs context from previously translated dependencies, and interfaces with the LLM.
- Validation (`core/validation/`): Contains strict syntactic validators (`python_validator.py`, `go_validator.py`) to verify the LLM's output.
- Refinement (`core/refinement/`): Orchestrates the feedback loop; if validation fails, it queries the LLM again with the specific error messages until it succeeds or exhausts its retries.
- Output (`core/output/`): Reconstructs the original folder hierarchy with the newly translated files and packages them into a ZIP archive.
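As a rough illustration of the validation/refinement handoff, a Python-side check might look like the sketch below. This is a minimal illustration, not the project's actual code: `translate_fn` stands in for the real LLM call, and the function names are hypothetical.

```python
import ast

def validate_python(source):
    """Return None if the source parses cleanly, else a short error description."""
    try:
        ast.parse(source)
        return None
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"

def refine(source, translate_fn, max_retries=3):
    """Feed validator errors back to the LLM until the code parses
    or the retry budget is exhausted."""
    for _ in range(max_retries):
        error = validate_python(source)
        if error is None:
            return source
        # Build a refinement prompt containing the error trace.
        source = translate_fn(f"Fix this Python syntax error ({error}):\n{source}")
    return source
```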
- Python 3.8 or higher
- Valid Groq API Key
- Local Go installation (`gofmt` must be available on your system PATH if you intend to translate COBOL to Go)
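If you are unsure whether `gofmt` is reachable, a quick check from Python looks like this (a hypothetical pre-flight helper, not part of the app):

```python
import shutil

def go_toolchain_available():
    """True if gofmt is on the system PATH (required for COBOL -> Go translation)."""
    return shutil.which("gofmt") is not None
```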
- Clone the repository:

  ```bash
  git clone https://github.com/AbhisumatK/Legacy-Code-Parser.git
  cd Legacy-Code-Parser
  ```

- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set up your API key. You will need a free API key from Groq to power the AI translation engine:
  - Go to the GroqCloud Console.
  - Log in or create an account.
  - Navigate to the API Keys section and generate a new key.
  - Add your generated API key to `.streamlit/secrets.toml`:

    ```toml
    GROQ_API_KEY = "your_api_key_here"
    ```

- Start the Streamlit application:

  ```bash
  streamlit run app.py
  ```

- Open the provided local URL in your web browser.
- Navigate to either Translate Java or Translate COBOL using the sidebar.
- Fetch Source: Upload your `.zip` or paste a GitHub URL, and click "Start Ingestion Engine".
- Translate: Review the topological translation order and click "Run Translation Engine". The app displays progress as it translates each unit.
- Download: Once finished, download the structured `.zip` archive containing your modernized codebase.
- The effectiveness of the translation is heavily dependent on the chosen LLM model.
- Certain highly specific legacy paradigms (like complex multi-threading models, deep reflection, or obscure COBOL pointer arithmetic) might require manual review post-translation.
- Circular dependencies are detected, but the system falls back to a "best-effort" ordering that may lack complete context injection.
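The cycle fallback can be pictured with `graphlib`'s `CycleError` (a simplified sketch; the fallback policy shown here is illustrative, not necessarily the exact one the engine uses):

```python
from graphlib import TopologicalSorter, CycleError

def translation_order(graph):
    """Topological order when possible; best-effort insertion order on cycles."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        # Best-effort fallback: some prompts will miss the translated
        # code of files that appear later in the cycle.
        return list(graph)

# Two COBOL units that CALL each other form a cycle.
cyclic = {"A.cbl": {"B.cbl"}, "B.cbl": {"A.cbl"}}
print(translation_order(cyclic))
```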
Contributions are welcome. Please ensure that structural changes to the graph generation or translation pipelines are robust, and that new logic is covered under the `core/validation/` module.