@aradhyacp

Description

This Pull Request improves the tool’s robustness by adding defensive error handling around zlib decompression and introduces a new feature to inspect PyInstaller archive metadata without performing a full extraction. Additionally, a .gitignore file has been introduced to maintain a clean repository state.

Changes Made

  1. Safe zlib decompression (Bug Fix)

    • Wraps zlib.decompress in a try/except block to handle corrupted archive data gracefully.
    • Prevents the extractor from crashing on malformed CArchive or PYZ entries.
    • Behavior change: instead of a hard crash, users now receive a specific error message and extraction continues for the remaining valid entries:
      [!] Error: Failed to decompress CArchive entry <entry_name> at offset <offset>
  2. New Feature: --info / -i command

    • Adds a new CLI argument -i/--info that prints metadata about the PyInstaller archive without extracting files.
    • Implements a printInfo() method in the PyInstArchive class.
    • Provides quick insights into:
      • PyInstaller version
      • Python version used
      • Number of files and their types
      • Presence of PYZ archives or encrypted entries
      • Top 5 largest files
    • This feature is especially useful for debugging and inspecting archives before attempting extraction.
    • Integrated with argparse. Usage:
      • python pyinstxtractor-ng.py -i <file> or python pyinstxtractor-ng.py --info <file>
  3. Maintenance

    • Added .gitignore to exclude common build artifacts and extraction directories.

Motivation & Context

While analyzing a specific binary, the script crashed due to a zlib.error on a corrupted entry. This made it impossible to extract the remaining valid files or inspect the archive.

Before (Crash):
[image]

By catching this exception, the tool becomes more resilient and useful for reverse engineering potentially malformed or obfuscated binaries.
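The pattern can be sketched as follows. This is a minimal illustration, not the exact code in this PR: safe_decompress, extract_all, and the (name, offset, data) tuple shape are hypothetical names chosen for the example. Only zlib.decompress, zlib.error, and the error message format come from the change itself.

```python
import zlib
from typing import Optional


def safe_decompress(data: bytes, name: str, offset: int) -> Optional[bytes]:
    """Return the decompressed bytes, or None if the entry is corrupted."""
    try:
        return zlib.decompress(data)
    except zlib.error:
        # Report the failing entry instead of letting the exception propagate.
        print(f"[!] Error: Failed to decompress CArchive entry {name} at offset {offset}")
        return None


def extract_all(entries):
    """Decompress every entry, skipping the ones that fail.

    entries: iterable of (name, offset, compressed_bytes) tuples
    (a hypothetical shape used only for this sketch).
    """
    extracted = {}
    for name, offset, blob in entries:
        payload = safe_decompress(blob, name, offset)
        if payload is None:
            continue  # corrupted entry: skip it and keep going
        extracted[name] = payload
    return extracted
```

Returning None for a failed entry keeps the decision to skip it in the calling loop, so a single corrupted entry no longer aborts the whole run.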

The --info flag also lets users determine the Python or PyInstaller version of a target executable without writing thousands of files to disk, which is particularly useful for quick triage.
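A rough sketch of how a flag like this can be wired up with argparse. It assumes a single positional filename argument; build_parser and choose_action are hypothetical helpers for illustration and may not match the script's actual structure.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="PyInstaller archive extractor (sketch)")
    parser.add_argument("filename", help="path to the PyInstaller executable")
    # store_true makes -i/--info a boolean switch that defaults to False.
    parser.add_argument("-i", "--info", action="store_true",
                        help="print archive metadata without extracting files")
    return parser


def choose_action(argv):
    """Return ('info' | 'extract', filename) for the given argument list."""
    args = build_parser().parse_args(argv)
    return ("info" if args.info else "extract", args.filename)
```

With this wiring, choose_action(["-i", "test"]) and choose_action(["--info", "test"]) both select the metadata-only path, while choose_action(["test"]) falls through to normal extraction.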

Testing & Verification

  • Verified Crash Fix: Ran the tool against the binary that previously caused the zlib crash. The tool now prints [!] Error: Failed to decompress CArchive entry... and successfully proceeds to extract the rest of the file contents.
    • After (Successful recovery):
      [image]
  • Verified -i functionality: Ran the tool with the --info flag on a valid PyInstaller executable. The output correctly listed the Python version, entry points, and file statistics.
    • Output is attached below

Related Issues

No specific issue linked.

Additional Notes

Note to Maintainers: This is my first Pull Request to this repository (and my first open-source contribution!). I have done my best to follow the existing code style. If there are any errors or better ways to implement these changes, I would appreciate your feedback and guidance.


Example Output of --info or -i

========== PyInstaller Archive Info ==========
[+] File: test
[+] PyInstaller version: 21
[+] Python version: 3.12
[+] PyInstaller generation: 2.1+
[+] CArchive files: 54
[+] PYZ archive present: Yes
[+] Encrypted: No
[+] Packages: 0, Python scripts: 3
[+] Entry points: pyiboot01_bootstrap, pyi_rth_inspect, test
[+] Top 5 largest files:
    Python.framework/Versions/3.12/Python       7,821,968 bytes
    libcrypto.3.dylib                           3,581,824 bytes
    base_library.zip                            1,332,005 bytes
    PYZ.pyz                                     1,265,978 bytes
    python3.12/lib-dynload/unicodedata.cpython-312-darwin.so    1,186,624 bytes
[+] Total compressed size:   7,135,234 bytes
[+] Total uncompressed size: 19,682,508 bytes
==============================================
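For context, the size figures in a report like this can be derived from the archive's table of contents roughly as sketched below. The Entry tuple and the summarize and format_largest helpers are hypothetical names for this example, and ranking by uncompressed size is an assumption; printInfo() in the PR may compute these differently.

```python
from collections import namedtuple

# Hypothetical record for one table-of-contents entry.
Entry = namedtuple("Entry", "name compressed_size uncompressed_size")


def summarize(entries, top_n=5):
    """Collect the entry count, size totals, and the top_n largest entries."""
    # Assumption: "largest" means largest uncompressed size.
    largest = sorted(entries, key=lambda e: e.uncompressed_size, reverse=True)[:top_n]
    return {
        "count": len(entries),
        "largest": [(e.name, e.uncompressed_size) for e in largest],
        "total_compressed": sum(e.compressed_size for e in entries),
        "total_uncompressed": sum(e.uncompressed_size for e in entries),
    }


def format_largest(summary):
    """Render the largest entries with thousands separators."""
    return [f"    {name}  {size:,} bytes" for name, size in summary["largest"]]
```

Since this only needs the sizes recorded in the table of contents, no entry has to be decompressed or written to disk to produce the summary.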

@aradhyacp
Author

@extremecoders-re any updates :) ?
