This project presents an exploratory data analysis of Netflix Movies and TV Shows using SQL. The primary goal is to derive meaningful insights by solving real-world analytical business questions related to content distribution, ratings, geography, duration, and descriptive attributes.
The project demonstrates practical SQL skills including aggregation, string manipulation, window functions, filtering, and data transformation techniques.
The main objectives of the project are:
- Analyze the distribution of content types (Movies vs TV Shows)
- Identify the most common ratings for movies and TV shows
- Examine content trends across release years and countries
- Explore genre distribution and keyword-based categorization
- Solve business-oriented analytical questions using SQL
The dataset used in this analysis is publicly available on Kaggle:
Dataset Link: https://www.kaggle.com/datasets/shivamb/netflix-shows

```sql
DROP TABLE IF EXISTS netflix;
CREATE TABLE netflix
(
    show_id      VARCHAR(5),
    type         VARCHAR(10),
    title        VARCHAR(250),
    director     VARCHAR(550),
    casts        VARCHAR(1050),
    country      VARCHAR(550),
    date_added   VARCHAR(55),
    release_year INT,
    rating       VARCHAR(15),
    duration     VARCHAR(15),
    listed_in    VARCHAR(250),
    description  VARCHAR(550)
);
```

The analysis addresses the following business problems:
- Determine the distribution of content types on Netflix.
- Identify the most frequently occurring rating for each type of content.
- Retrieve movies released in a chosen year.
- Analyze country-wise contribution to Netflix content.
- Detect the movie with the maximum duration.
- Analyze recently added Netflix content.
- Retrieve content directed by a given director.
- Identify long-running series.
- Perform genre-wise content distribution analysis.
- Analyze yearly content releases from India.
- Filter documentary content.
- Identify missing metadata records.
- Analyze actor appearances over recent years.
- Identify actors with highest appearances in Indian content.
- Categorize content based on description keywords.
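
A few of the problems above could be approached along the following lines. This is only a sketch, not the project's actual solution queries. It assumes PostgreSQL, the `netflix` table from the schema, `type` values of `'Movie'` and `'TV Show'`, comma-separated genres in `listed_in`, and movie durations stored as text such as `'90 min'`. The keywords `kill` and `violence` and the labels `Flagged` and `Other` are hypothetical choices made for illustration.

```sql
-- Sketch only: assumes PostgreSQL and the netflix table defined in the schema.

-- Distribution of content types.
SELECT type, COUNT(*) AS total_titles
FROM netflix
GROUP BY type
ORDER BY total_titles DESC;

-- Most frequent rating per content type.
-- RANK() keeps ties, so a type may return more than one rating.
SELECT type, rating, rating_count
FROM (
    SELECT
        type,
        rating,
        COUNT(*) AS rating_count,
        RANK() OVER (PARTITION BY type ORDER BY COUNT(*) DESC) AS rating_rank
    FROM netflix
    WHERE rating IS NOT NULL
    GROUP BY type, rating
) AS ranked
WHERE rating_rank = 1;

-- Movie with the maximum duration.
-- Assumes duration looks like '90 min'; the cast fails on other formats.
SELECT title, duration
FROM netflix
WHERE type = 'Movie'
  AND duration IS NOT NULL
ORDER BY SPLIT_PART(duration, ' ', 1)::INT DESC
LIMIT 1;

-- Genre-wise distribution.
-- Assumes listed_in holds a comma-separated list of genres.
SELECT TRIM(genre) AS genre, COUNT(*) AS total_titles
FROM netflix
CROSS JOIN LATERAL UNNEST(STRING_TO_ARRAY(listed_in, ',')) AS genre
GROUP BY TRIM(genre)
ORDER BY total_titles DESC;

-- Keyword-based categorization of descriptions.
-- The keywords and category labels here are hypothetical examples.
SELECT category, COUNT(*) AS total_titles
FROM (
    SELECT
        CASE
            WHEN description ILIKE '%kill%'
              OR description ILIKE '%violence%' THEN 'Flagged'
            ELSE 'Other'
        END AS category
    FROM netflix
) AS categorized
GROUP BY category;
```

Other database engines would need different functions for splitting strings and for case-insensitive matching, since `STRING_TO_ARRAY`, `UNNEST`, `SPLIT_PART`, and `ILIKE` are used here in their PostgreSQL forms.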
Key findings from the analysis:
- Netflix hosts a diverse mix of Movies and TV Shows across regions
- Content ratings indicate varied audience targeting strategies
- Certain countries dominate content production and availability
- Genre analysis highlights platform content diversity
- Keyword-based classification helps understand thematic patterns
Skills demonstrated:
- SQL Query Writing
- Aggregations & Grouping
- Window Functions
- String Manipulation
- Data Cleaning & Transformation
- Analytical Thinking
- Business Problem Solving
The project includes:
- Dataset (CSV)
- Schema creation script
- Business problem statements
- SQL solution queries
- Project documentation
This project was created for learning, practice, and portfolio demonstration purposes using publicly available datasets.
Author: Tejas
If you found this project useful or interesting, feel free to explore more repositories and connect.
