A simple search engine built in Python to understand how search engines work internally.
At a high level, a search engine does three things:
- Crawls pages (collects content)
- Indexes pages (builds data structures for fast lookup)
- Searches (answers user queries with ranked results)
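The crawl step, collecting page content and discovering outgoing links, can be sketched with the standard library alone. This is an illustrative sketch, not the project's actual `crawler.py`; the `LinkExtractor` name and the sample HTML are made up for the example:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect href targets from <a> tags, resolved against a base URL."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links so the crawler can fetch them later.
                    self.links.append(urljoin(self.base_url, value))

html = '<a href="/about">About</a> <a href="https://example.org/">Ext</a>'
parser = LinkExtractor("https://example.com/index.html")
parser.feed(html)
print(parser.links)
# → ['https://example.com/about', 'https://example.org/']
```

A real crawler would fetch each discovered link, deduplicate visited URLs, and hand the page text to the indexer.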
The project is organized into four components:

- Crawler: Fetches web pages and extracts outgoing links to discover more pages.
- Indexer: Converts crawled content into an inverted index for fast search.
- Query Engine: Tokenizes the query, finds matching documents, ranks them, and returns results.
- Web UI: A minimal Flask UI to enter queries and view results.
```
Search-Engine/
├── .github/workflows/black.yml
├── src/
│   └── search_engine/
│       ├── models/
│       │   ├── PageModel.py
│       │   └── TokenType.py
│       ├── utils/
│       │   ├── loggers.py
│       │   ├── parse_html.py
│       │   ├── requests.py
│       │   ├── string_utils.py
│       │   └── variables.py
│       ├── crawler.py
│       ├── indexer.py
│       └── query_response.py
├── templates/
│   └── index.html
├── app.py
├── pyproject.toml
├── uv.lock
└── README.md
```
You do not need to modify the project structure to run the search engine.
This project uses uv as the Python package manager.
All dependencies are declared in pyproject.toml and locked in uv.lock.
From the project root, run:
```
uv sync
```

After installing dependencies, start the backend server with:
```
uv run python app.py
```

The app is served at http://127.0.0.1:5000. On startup:

- A background asyncio event loop is created
- The crawler starts discovering web pages
- The indexer builds an inverted index
- Flask serves HTTP requests
- Queries are executed against the in-memory index
Crawling, indexing, and searching run concurrently.
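The concurrency model described above, a background asyncio loop running the crawler and indexer while Flask blocks the main thread, can be sketched like this. The `crawl_forever` coroutine and the wiring are illustrative placeholders, not the code in `app.py`:

```python
import asyncio
import threading

# Run an asyncio event loop in a daemon thread so long-running crawl/index
# coroutines keep working while Flask's blocking server owns the main thread.
loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

async def crawl_forever():
    # Placeholder for the real work: fetch a page, parse it, update the
    # inverted index, then yield control back to the event loop.
    while True:
        await asyncio.sleep(1)

# Schedule the coroutine on the background loop from the main thread.
asyncio.run_coroutine_threadsafe(crawl_forever(), loop)

# In the real app, Flask's app.run(...) would now block the main thread
# while crawling and indexing continue on the background loop, and request
# handlers read from the shared in-memory index.
```

`run_coroutine_threadsafe` is the thread-safe bridge here: it is the supported way to submit a coroutine to an event loop running in another thread.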
Configuration (CommonVariables)
Configuration values are defined in `src/search_engine/utils/variables.py`.

GitHub Actions enforces code formatting using Black.
Workflow location:
```
.github/workflows/black.yml
```

To run formatting locally:

```
uv run black .
```