zabbonat/ADM-HW-3

Homework 3 - What movie to watch tonight?

Group 4

Authors:

  • Anna Presciuttini
  • Diletta Abbonato
  • Mario Dhimitri

Data collection

Data were extracted from three HTML files:

  • movies1.html
  • movies2.html
  • movies3.html

Each of these files contains 10,000 movies.
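
As a rough illustration of this step, the sketch below pulls link targets out of an HTML document using only the Python standard library. It assumes that each movie appears in the file as an `<a href="...">` link, which this README does not state; `LinkCollector` and `extract_links` are hypothetical names and are not taken from collector.py.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it encounters (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return all link targets found in html_text, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    return parser.links


if __name__ == "__main__":
    # Assumption: movies1.html sits in the working directory and lists movies as links.
    with open("movies1.html", encoding="utf-8") as fh:
        print(len(extract_links(fh.read())))
```

The standard-library parser keeps the sketch dependency-free; the actual collector may well rely on a different library.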

The repository contains:

  • README.md: a Markdown file that explains the content of our repository.
  • collector.py: a Python file that contains the code needed to collect our data from the HTML pages.
  • collector_utils.py: a Python file that stores the functions we used in collector.py.
  • parser.py: a Python file that contains the code needed to parse the entire collection of HTML pages and save the results in TSV files.
  • parser_utils.py: a Python file that gathers the functions we used in parser.py.
  • index.py: a Python file that, once executed, generates the indexes of the search engines.
  • index_utils.py: a Python file that contains the functions we used for creating the indexes.
  • main.ipynb: a Jupyter notebook that, once executed, builds the search engine.
  • exercise_4.py: a Python file that contains the implementation of the algorithm that, given a sequence, finds the length of the longest palindromic subsequence.
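
For readers unfamiliar with the problem addressed in exercise_4.py, here is a minimal sketch of one common way to compute the length of the longest palindromic subsequence, using dynamic programming over substrings. It is not necessarily the approach used in exercise_4.py, and the function name `longest_palindromic_subsequence` is hypothetical.

```python
def longest_palindromic_subsequence(seq):
    """Return the length of the longest palindromic subsequence of seq."""
    n = len(seq)
    if n == 0:
        return 0

    # dp[i][j] holds the answer for the slice seq[i..j] (both ends inclusive).
    # Cells with i > j stay 0 and stand for the empty slice.
    dp = [[0] * n for _ in range(n)]
    for i in range(n):
        dp[i][i] = 1  # a single character is a palindrome of length 1

    # Fill the table by increasing slice length.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            if seq[i] == seq[j]:
                # Matching ends extend the best palindrome strictly inside them.
                dp[i][j] = dp[i + 1][j - 1] + 2
            else:
                # Otherwise drop one end and keep the better result.
                dp[i][j] = max(dp[i + 1][j], dp[i][j - 1])

    return dp[0][n - 1]


if __name__ == "__main__":
    print(longest_palindromic_subsequence("bbbab"))
```

This version runs in O(n²) time and O(n²) memory for a sequence of length n.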

Summary

We worked as a group: we met every day to discuss and jointly carry out each part of the homework. All team members therefore contributed to the whole assignment, each bringing their own prior skills.
