A powerful AI-powered tool for exploring Theory of Computation concepts, including Regular Languages, Automata, and Regex. This project bridges the gap between abstract formal language theory and practical visualization using modern web technologies and Generative AI.
- Python 3.8+: The primary programming language.
- Streamlit: A framework for building data-driven web apps in pure Python. Used for the entire frontend and state management.
- Automata-lib: A Python library for simulating and converting finite automata (DFA, NFA) and Regular Expressions.
- Google Gemini API (Generative AI): Used for the "Language Definition" feature, allowing the system to understand natural language descriptions (e.g., "Strings ending in 101") and convert them to formal Regex.
- Graphviz: An open-source graph visualization software. Used here to render state transition diagrams.
- Pandas: Used for handling transition tables (DataFrames) and batch testing results.
- Finite Automata (FA):
  - DFA (Deterministic Finite Automata): Every state has exactly one transition for each symbol.
  - NFA (Non-Deterministic Finite Automata): States can have zero, one, or multiple transitions for a symbol, including $\epsilon$-transitions.
- Automata Conversions:
  - Subset Construction Algorithm (NFA $\to$ DFA):
    - The project implements a manual subset construction algorithm.
    - It computes the epsilon-closure for each state.
    - It tracks new states as sets of NFA states (e.g., `{q0, q1}`) to explicitly show how the DFA is built, rather than renaming them to arbitrary integers.
  - Regex $\to$ DFA:
    - Implements a chained conversion pipeline: Regex $\to$ NFA $\to$ DFA.
    - This combines the parsing power of the library with the custom subset construction logic to ensure educational clarity in the final output.
- Kleene's Theorem: Implementation of conversions between Regular Expressions and Finite Automata.
- DFA Minimization (Moore's Algorithm):
  - The project implements an $O(n^2)$ algorithm to minimize DFAs by finding groups of equivalent states.
  - It iteratively refines partitions (0-equivalence, 1-equivalence, etc.) until no further splitting is possible.
Role: The entry point and main controller of the application.
- Logic:
  - Session State Management: Uses `st.session_state` to persist data (automata objects, analysis results, step logs) across browser reruns. This ensures the user's workflow is preserved during interaction.
  - Tab-Based Layout: Organizes the UI into "Define & Test Language" (AI + Batch Testing) and "Automata Studio" (Structural Operations).
  - Security & API Priority: Implements a secure priority system for loading the Google API Key:
    1. `st.secrets` (Production/Cloud)
    2. `os.environ` (Local Environment)
    3. User Input (Sidebar Fallback)
  - Note: If a system key is found, the manual input field is completely hidden from the UI to prevent accidental exposure.
  - Context-Aware Examples: The "Load Example" button dynamically loads different datasets based on the target operation (e.g., loading an 8-state DFA specifically designed for Moore's Algorithm when "Minimized DFA" is selected).
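The key-loading priority can be expressed as a small pure function. This is a hypothetical sketch: in the real app the first two arguments would be `st.secrets` and `os.environ`; plain dicts stand in here so the logic is self-contained.

```python
# Hypothetical sketch of the API-key priority: secrets > environment > sidebar.

def resolve_api_key(secrets, env, sidebar_input=None):
    """Return (key, source), preferring deployed secrets over local env over UI input."""
    if secrets.get("GOOGLE_API_KEY"):
        return secrets["GOOGLE_API_KEY"], "st.secrets"
    if env.get("GOOGLE_API_KEY"):
        return env["GOOGLE_API_KEY"], "os.environ"
    if sidebar_input:                  # fallback: manual entry in the sidebar
        return sidebar_input, "sidebar"
    return None, None                  # caller should prompt the user for a key

key, source = resolve_api_key({}, {"GOOGLE_API_KEY": "abc"}, "typed-in-key")
```

Because a system key wins even when the user typed one, the sidebar field can be hidden whenever `source` is not `"sidebar"`.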
Role: Handles all deterministic operations and visualization. This file contains the core algorithmic implementations.
- Logic:
  - `nfa_to_dfa` (Manual Subset Construction):
    - Instead of using the library's default conversion, this function manually implements the Powerset Construction algorithm.
    - It calculates the epsilon-closure of states and iteratively finds reachable subsets.
    - Goal: To preserve the educational value by showing states labeled as `{q0, q1}`.
  - `minimize_dfa_with_steps` (Moore's Algorithm):
    - Implements Moore's Algorithm for DFA minimization.
    - Iteratively calculates $k$-equivalence partitions (0-equiv, 1-equiv, ...) and logs each step for the user.
    - Partial DFA Support: Constructs the final DFA with `allow_partial=True` to handle cases where transitions are missing (implicit dead states), avoiding validation errors without adding confusing "Sink" states to the diagram.
  - `regex_to_dfa`:
    - Parses the Regex into an NFA using the library, then passes it through the custom `nfa_to_dfa` logic to ensure the final DFA retains readable subset states.
  - `get_graphviz_source` (Custom Visualization):
    - A custom rendering engine that generates DOT source code directly.
    - Sanitization: Detects complex state labels (like `frozenset({'q0', 'q1'})` or `['q0', 'q1']`) and converts them into clean set notation (`{q0, q1}`).
    - Avoids heavy dependencies like `pygraphviz`, ensuring compatibility with Streamlit Cloud.
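The label sanitization and direct DOT generation might look something like the following. This is a minimal sketch, not the project's actual `get_graphviz_source`; the transition encoding and helper names are assumptions.

```python
import re

# Hypothetical sketch: turn frozenset/list state labels into clean set notation
# and emit DOT source directly, with no pygraphviz dependency.

def clean_label(state):
    """Render frozenset({'q0', 'q1'}) or ['q0', 'q1'] as {q0, q1}."""
    if isinstance(state, (frozenset, set, list, tuple)):
        names = re.findall(r"[A-Za-z_]\w*", str(state).replace("frozenset", ""))
        return "{" + ", ".join(sorted(names)) + "}"
    return str(state)

def to_dot(transitions):
    """Emit a minimal DOT digraph from {(src, symbol): dst} transitions."""
    lines = ["digraph DFA {", "  rankdir=LR;"]
    for (src, sym), dst in transitions.items():
        lines.append(f'  "{clean_label(src)}" -> "{clean_label(dst)}" [label="{sym}"];')
    lines.append("}")
    return "\n".join(lines)

dot = to_dot({(frozenset({'q0'}), '1'): frozenset({'q0', 'q1'})})
```

The resulting string can be handed straight to `st.graphviz_chart`, which only needs DOT source, so the system `graphviz` binary plus this string builder is the whole rendering pipeline.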
-
Role: Manages communication with Google's Gemini API.
- Logic:
  - Dynamic Model Resolution: Implements `_resolve_model_name` to fix common "404 Not Found" errors by checking available models at runtime (`list_models`) and prioritizing `gemini-1.5-flash`.
  - Structured Outputs: Uses `typing.TypedDict` and `response_schema` to force the AI to return strict JSON matching defined schemas (`LanguageAnalysis`, `StringCheck`). This eliminates parsing errors common with free-form LLM responses.
  - Lazy Configuration: The `configure_api` method allows the API key to be set at runtime (via the UI) rather than requiring it at startup.
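The structured-output idea can be sketched as below. The field names inside `LanguageAnalysis` are assumptions (the source only names the schema); the actual Gemini call with `response_schema` is omitted, and only the schema definition and defensive JSON parsing are shown.

```python
import json
from typing import TypedDict

# Hypothetical shape of the LanguageAnalysis schema; field names are assumed.
class LanguageAnalysis(TypedDict):
    regex: str
    explanation: str

def parse_analysis(raw_json: str) -> LanguageAnalysis:
    """Parse a model reply and fail fast if it strays from the schema."""
    data = json.loads(raw_json)
    missing = {"regex", "explanation"} - data.keys()
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return LanguageAnalysis(regex=data["regex"], explanation=data["explanation"])

# A sample JSON reply of the kind the schema constrains the model to produce.
reply = '{"regex": "(0|1)*101", "explanation": "Strings ending in 101"}'
analysis = parse_analysis(reply)
```

Constraining the response to JSON and then validating it locally means a malformed reply raises immediately instead of silently producing a bogus regex.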
Role: The bridge between the UI and the AI Handler.
- Logic:
  - `LanguageProcessor`: A class that encapsulates the state of the "current language".
  - Orchestration: It takes the raw string from the UI, sends it to `AIHandler`, validates the JSON response, and stores the resulting Regex/Explanation for use in testing.
  - Regex Validation: It uses Python's `re` module to test strings against the AI-generated regex to provide deterministic "Accepted/Rejected" results.
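The deterministic validation step amounts to a `re.fullmatch` over each test string. A minimal sketch (the function name is hypothetical):

```python
import re

def check_strings(pattern: str, samples):
    """Map each sample string to Accepted (True) / Rejected (False).

    fullmatch is used so the ENTIRE string must belong to the language,
    not just a substring of it."""
    compiled = re.compile(pattern)
    return {s: bool(compiled.fullmatch(s)) for s in samples}

# Regex for "strings over {0,1} ending in 101"
results = check_strings(r"(0|1)*101", ["101", "0101", "110"])
```

Using `fullmatch` rather than `search` is the important choice here: `search` would accept `"1100"` for the pattern above because it contains no anchor.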
- `.streamlit/secrets.toml`: Stores sensitive data (API Keys). This file is git-ignored to prevent leaks.
- `.streamlit/config.toml`: Configures Streamlit server settings (e.g., headless mode).
- `secrets.toml.example`: A template file showing users how to configure their own API keys.
- `requirements.txt`: Lists Python packages (`automata-lib`, `streamlit`, `google-generativeai`).
- `packages.txt`: Lists system-level binaries (`graphviz`) required by the deployment environment.
- `main.py`: The legacy CLI version of the tool.
- `test_inputs.csv`: Sample data for the Batch Testing feature.
- `test_model_resolution.py`: Unit test script used to verify the AI model selection logic.
- Prerequisites: Python 3.8+
- Install Python Dependencies: `pip install -r requirements.txt`
- Install System Dependencies (for visualization):
  - This project uses Graphviz. Ensure it is installed on your system (e.g., `apt-get install graphviz` on Linux, or download from the official site for Windows/Mac).
- Run the Application: `streamlit run app.py`
The application requires a Google Gemini API Key for the "Language Definition" features.
- Create a folder `.streamlit` in the root.
- Create `secrets.toml` inside it containing: `GOOGLE_API_KEY = "your_key_here"`
If no system key is found, the app will prompt you to enter one in the sidebar.