A Kotlin library for extracting metadata from RAR4, RAR5, and 7zip archives.
Important: This library does not perform archive extraction or compression. It only reads archive headers to extract file listings, byte offsets, sizes, and split/multi-volume information. This makes it suitable for scenarios where you need to inspect archive contents without extracting them, such as building file indexes or streaming specific files from remote archives.
- RAR 4.x - Full metadata extraction including multi-volume/split archive support
- RAR 5.x - Full metadata extraction including multi-volume/split archive support
- 7zip - Metadata extraction with calculated byte offsets (accurate for uncompressed/store-mode archives)
- Java 25+
- Kotlin 2.3+
- Gradle 9+
Add to your build.gradle.kts:
repositories {
mavenCentral()
maven("https://jitpack.io")
}
dependencies {
implementation("com.github.skjaere:kotlin-compression-utils:v0.1.0")
}ArchiveService is the top-level entry point. It auto-detects the archive type (RAR4, RAR5, 7z) from filenames or byte signatures and dispatches to the appropriate parser. It also supports PAR2-based obfuscated filename resolution.
// From a single file
val entries = ArchiveService.listFiles("/path/to/archive.rar")
// From concatenated multi-volume streams with volume metadata
val volumes = listOf(
VolumeMetaData(filename = "archive.part1.rar", size = vol1.length()),
VolumeMetaData(filename = "archive.part2.rar", size = vol2.length()),
)
val entries = ArchiveService.listFiles(stream, volumes)
// With PAR2 data for obfuscated filenames
val entries = ArchiveService.listFiles(stream, volumes, par2Data = par2File.readBytes())Returns List<ArchiveFileEntry> (either RarFileEntry or SevenZipFileEntry).
val bytes = file.readBytes().sliceArray(0 until 32)
val result = ArchiveTypeDetector.detect(bytes)
println(result.type) // RAR4, RAR5, SEVENZIP, or UNKNOWN
println(result.isFirstVolume) // true if this is the first volume in a multi-volume setval service = RarArchiveService()
// From a file path
val entries = service.listFiles("/path/to/archive.rar")
// From an InputStream
val entries = service.listFiles(inputStream)
// From a SeekableInputStream
val entries = service.listFiles(seekableStream)
entries.forEach { entry ->
println("${entry.path} - ${entry.uncompressedSize} bytes")
println(" compressed: ${entry.compressedSize}, method: ${entry.compressionMethod}")
println(" uncompressed: ${entry.isUncompressed}, split: ${entry.isSplit}")
}val service = RarArchiveService()
// Provide a concatenated stream of all volumes in order
val entries = service.listFilesFromConcatenatedStream(
stream = concatenatedStream,
totalArchiveSize = totalSize,
volumeSizes = listOf(vol1Size, vol2Size, vol3Size) // enables split position optimization
)
// Split files include position info for each volume
entries.filter { it.isSplit }.forEach { entry ->
entry.splitParts.forEach { part ->
println(" Volume ${part.volumeIndex}: offset=${part.dataStartPosition}, size=${part.dataSize}")
}
}val service = SevenZipArchiveService()
val entries = service.listFiles("/path/to/archive.7z")
entries.forEach { entry ->
println("${entry.path} - ${entry.size} bytes at offset ${entry.dataOffset}")
}The library provides a SeekableInputStream interface for efficient random-access reading:
// Wrap a file
val stream = FileSeekableInputStream(RandomAccessFile(file, "r"))
// Wrap a regular InputStream (forward-only seeking)
val stream = BufferedSeekableInputStream(inputStream)
// Implement for HTTP byte-range requests
class MyHttpStream(url: String, size: Long) : HttpRangeSeekableInputStream(url, size) {
override fun fetchRange(fromPosition: Long, toPosition: Long): ByteArray {
// Make HTTP GET with Range header
}
}The validateArchive Gradle task verifies that the library's parsed byte offsets are correct by comparing CRC32 checksums against ground truth from the 7z CLI.
7z(p7zip) must be installed and onPATHpar2(optional) -- used for PAR2 verify/repair if.par2files are present alongside the archive
- Discovers all sibling volumes from the given first-volume path
- Runs PAR2 verify/repair if a
.par2index file is found in the same directory - Runs
7z l -sltto get ground-truth file CRCs - Parses the archive with the library to obtain byte offsets
- Reads raw data from those offsets, computes CRC32, and compares against ground truth
- Checks that all files reported by
7zwere also found by the library - Reports PASS/FAIL/SKIP per file (compressed files are skipped)
# Single RAR5 archive
./gradlew validateArchive -ParchivePath=path/to/archive.rar
# Multi-volume RAR5
./gradlew validateArchive -ParchivePath=path/to/archive.part1.rar
# Single 7z archive
./gradlew validateArchive -ParchivePath=path/to/archive.7z
# Split 7z
./gradlew validateArchive -ParchivePath=path/to/archive.7z.001The task exits with code 1 if any file fails validation.
| Class | Description |
|---|---|
ArchiveService |
Unified entry point -- auto-detects format and dispatches to the appropriate parser |
RarArchiveService |
Parses RAR4/RAR5 archives and returns file metadata |
SevenZipArchiveService |
Parses 7zip archives and returns file metadata |
ArchiveTypeDetector |
Detects archive format from magic bytes |
VolumeMetaData |
Describes an archive volume (filename, size, optional first-16KB for PAR2 resolution) |
RarFileEntry |
Metadata for a file in a RAR archive |
SevenZipFileEntry |
Metadata for a file in a 7zip archive |
SplitInfo |
Byte position info for split/multi-volume files |
SeekableInputStream |
Interface for streams supporting random access |