This repository contains the code we wrote during Rock the JVM's Spark Essentials with Scala (Udemy version here). Unless explicitly mentioned, the code in this repository is exactly what was caught on camera.
- Java: JDK 17 or 21 (recommend Eclipse Temurin 21)
- Docker: Docker Desktop (Mac/Windows) or Docker Engine (Linux)
- IDE: IntelliJ IDEA with the Scala plugin
- Windows-specific: See HadoopWindowsUserSetup.md or use WSL 2 (recommended)
Note: Spark 4.x requires Java 17 or 21. Java 8 and 11 are no longer supported.
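To check which Java you have before opening the project, something like the sketch below may help. It is not part of the course code: `java_major` and `is_supported_java` are hypothetical helper names, and the only requirement it encodes is the one stated in the note above (Java 17 or 21).

```shell
# Extract the major version from a Java version string.
# Handles both the legacy "1.8.0_292" scheme and the modern "17.0.9" scheme.
java_major() {
  version="$1"
  case "$version" in
    1.*) version="${version#1.}" ;;   # legacy scheme: 1.8.0_292 -> 8.0_292
  esac
  # Keep only the leading digits: drop everything from the first non-digit on
  echo "${version%%[!0-9]*}"
}

# Succeeds only for the versions listed in the note above (17 or 21).
is_supported_java() {
  case "$(java_major "$1")" in
    17|21) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (commented out so that sourcing this file has no side effects):
# v="$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
# is_supported_java "$v" && echo "Java $v is fine" || echo "Java $v is not supported"
```

The commented example assumes `java -version` prints the version in double quotes on a line containing the word "version", which is the usual format for OpenJDK builds; adjust the parsing if your distribution prints something different.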
- Install Docker
- Install JDK 17 or 21 (Eclipse Temurin recommended)
- Either clone the repo or download as zip
- Open with IntelliJ as an SBT project
- Windows users: you need to set up some Hadoop-related configs; see HadoopWindowsUserSetup.md
- In a terminal window, navigate to the folder where you downloaded this repo and run the following to start PostgreSQL and the Spark cluster:
docker compose up
- To spin up multiple Spark workers:
docker compose up --scale spark-worker=3
- Access the Spark Master UI at http://localhost:9090
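The containers may need a little time before the Spark Master UI answers. If you want to script the startup, a polling helper along these lines is one option. This is a minimal sketch, not something from the course: `wait_for_url` is a hypothetical name, the attempt count and delay are arbitrary defaults, and it assumes `curl` is installed on the host.

```shell
# Poll a URL until it responds or the attempts run out.
# Arguments: URL, max attempts (default 30), delay in seconds (default 2).
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: fail on HTTP errors, -s: silent, -o /dev/null: discard the body
    if curl -fs -o /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "timed out waiting for $url" >&2
  return 1
}

# Example (commented out so that sourcing this file has no side effects):
# docker compose up -d --scale spark-worker=3
# wait_for_url http://localhost:9090
```

The example starts the stack detached with `-d` so the same terminal can go on to poll the UI address given above.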
# Open a psql session on the PostgreSQL container
docker exec -it postgres psql -U docker -d rtjvm
# Open a shell on the Spark master
docker exec -it spark-master bash
# Launch the Spark shell
/opt/spark/bin/spark-shell --master spark://spark-master:7077
# Launch the Spark SQL shell
/opt/spark/bin/spark-sql --master spark://spark-master:7077
- Build the JAR (or use IntelliJ Build Artifacts):
sbt package
- Copy the JAR and any data files to spark-apps/
- Submit from inside the master container:
docker exec -it spark-master /opt/spark/bin/spark-submit \
--class part6practical.TestDeployApp \
--master spark://spark-master:7077 \
--deploy-mode client \
/opt/spark-apps/spark-essentials_2.13-0.3.jar /opt/spark-apps/movies.json /opt/spark-apps/goodMovies
Clone this repository and check out the start tag by running the following in the repo folder:
git checkout start
Udemy students: check out the udemy branch of the repo:
git checkout udemy
Rock the JVM students: check out the master branch:
git checkout master
If you have changes to suggest to this repo, you can
- submit a GitHub issue
- tell me in the course Q/A forum
- submit a pull request!