Fix broken spark-3.5-base-hadoop3.2.dockerfile #503

Merged
abhisheknath2011 merged 2 commits into linkedin:main from dushyantk1509:dushyantk1509/fix-spark-3.5-docker-build on Mar 19, 2026

@dushyantk1509
Contributor

Problem

The builder stage of spark-3.5-base-hadoop3.2.dockerfile was broken, which prevented the oh-hadoop-spark Docker Compose recipe from building. Changes similar to those in #354 fix it.

Testing Done

End-to-end spark-shell test: table creation, data insert, and the orphan file deletion job all run successfully.

- Replace deprecated openjdk:11.0.11-jdk-slim-buster (Debian Buster EOL,
  apt repos return 404) with eclipse-temurin:11-jdk-jammy (Ubuntu Jammy LTS)
- Add missing unzip package required for Livy assembly extraction
- Add --no-install-recommends and apt cleanup to reduce image size
- Fix Maven download URL from dlcdn.apache.org to archive.apache.org
  (dlcdn returns 404 for older Maven versions like 3.9.4)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
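
The Maven URL change in the commit above can be sketched as a small shell helper. This is only an illustration, not the code from the dockerfile: the function name `maven_archive_url` is hypothetical, and the path layout is assumed to follow the usual archive.apache.org structure for Maven binary tarballs.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): print the archive.apache.org URL
# for a Maven binary tarball. Assumes the standard archive layout
# dist/maven/maven-<major>/<version>/binaries/.
maven_archive_url() {
  version="$1"
  major="${version%%.*}"   # "3.9.4" -> "3"
  printf 'https://archive.apache.org/dist/maven/maven-%s/%s/binaries/apache-maven-%s-bin.tar.gz\n' \
    "$major" "$version" "$version"
}

# Print the URL for the version named in the commit message.
maven_archive_url 3.9.4

# A download step could then look like this (left commented out because it
# needs network access):
# curl -fsSL "$(maven_archive_url 3.9.4)" -o apache-maven.tar.gz
```

Pointing at the archive host is the safer choice for a pinned version, since the commit message notes that dlcdn returns 404 for older releases such as 3.9.4.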

@dushyantk1509 marked this pull request as ready for review on March 18, 2026, 08:03

Without --master spark://spark-master:7077, spark-shell defaults to
local[*], which may cause Spark actions that scan HDFS to hang.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
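
As a rough sketch of the invocation this commit describes, the helper below builds a spark-shell command line with an explicit master. The function name `spark_shell_command` is hypothetical; the master URL is the one quoted in the commit message, and any other options the recipe passes are not shown here.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): build a spark-shell command line
# that always names the master explicitly instead of relying on the
# local[*] default.
spark_shell_command() {
  master="${1:-spark://spark-master:7077}"
  printf 'spark-shell --master %s\n' "$master"
}

# Print the command for the standalone master used by the recipe.
spark_shell_command
```

Run inside the Spark container, the printed command connects the shell to the standalone master, so jobs are scheduled on the cluster's workers.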
Member

@abhisheknath2011 left a comment

Thanks for the fix.

@abhisheknath2011 merged commit ac49434 into linkedin:main on Mar 19, 2026
1 check passed