This project is a simple search engine developed during the Introductory Programming course at the IT University of Copenhagen. It implements core search engine functionality: web page storage, query processing, and result ranking.
- WebPage Management: Stores and manages web pages with titles, URLs, and content.
- Search Engine: Handles search requests, supporting multi-word queries and "OR" operators.
- Inverted Index: Efficiently maps words to the web pages containing them for fast search results.
- Result Ranking: Ranks search results by relevance using term frequency (TF) and term frequency-inverse document frequency (TF-IDF) scores.
- Validation: Filters out invalid or incomplete web page entries during input processing.
- Java (OOP principles)
- Java Streams and Collections for data handling
- Basic HTML/JavaScript frontend for search interface
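
To illustrate how the inverted index, "OR" queries, and TF-IDF scoring fit together, here is a minimal sketch. It is not the project's actual code: the class `InvertedIndexSketch`, the `Page` record, and all method names are hypothetical, and the scoring assumes the common definitions TF = occurrences / page length and IDF = ln(total pages / pages containing the word). It needs Java 16 or later for records.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class InvertedIndexSketch {
    // Hypothetical page model; the real project may store pages differently.
    public record Page(String url, String title, List<String> words) {}

    // Maps each lower-cased word to the pages that contain it.
    private final Map<String, Set<Page>> index = new HashMap<>();
    private int pageCount = 0;

    public void add(Page page) {
        pageCount++;
        for (String word : page.words()) {
            index.computeIfAbsent(word.toLowerCase(), k -> new LinkedHashSet<>()).add(page);
        }
    }

    // Pages containing every word of a space-separated term list.
    public Set<Page> matchAll(String terms) {
        Set<Page> result = null;
        for (String word : terms.trim().toLowerCase().split("\\s+")) {
            Set<Page> hits = index.getOrDefault(word, Set.of());
            if (result == null) {
                result = new LinkedHashSet<>(hits);
            } else {
                result.retainAll(hits); // intersect with pages for the next word
            }
        }
        return result;
    }

    // A query is split on " OR "; the result is the union of the parts' matches.
    public Set<Page> search(String query) {
        Set<Page> result = new LinkedHashSet<>();
        for (String part : query.split("\\s+OR\\s+")) {
            result.addAll(matchAll(part));
        }
        return result;
    }

    // Term frequency: share of the page's words that equal the given word.
    public static double tf(String word, Page page) {
        if (page.words().isEmpty()) {
            return 0.0;
        }
        long count = page.words().stream().filter(w -> w.equalsIgnoreCase(word)).count();
        return (double) count / page.words().size();
    }

    // TF-IDF: words that appear on fewer pages weigh more.
    public double tfIdf(String word, Page page) {
        int pagesWithWord = index.getOrDefault(word.toLowerCase(), Set.of()).size();
        if (pagesWithWord == 0) {
            return 0.0;
        }
        return tf(word, page) * Math.log((double) pageCount / pagesWithWord);
    }
}
```

With two indexed pages, one containing "java search java" and the other "search engine", the query `java OR engine` matches both pages, while `java engine` matches neither. A word that appears on every page, such as "search" here, gets a TF-IDF score of zero, which is why TF-IDF tends to rank pages by their more distinctive terms.
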
- src/main/java/searchengine/: Main source code implementing all tasks.
- group34.zip: Complete project code.
Amanda Cunha, Johannes Hackl, Grant Johnson, Pauli H.