abdullahAttiq/case-study-docTracer


ChainProof: Hyperledger Fabric document vault with PKI signing and triple-source integrity proof

A multi-tenant SaaS that combines AWS S3 Object Lock (immutable file storage), MongoDB (operational state), and Hyperledger Fabric (tamper-evident ledger) into a single document platform. Every upload is hashed with SHA-256; the hash, signatures, transfers, and permission changes are recorded on a permissioned blockchain ledger. Verification re-hashes the S3 object and cross-checks all three sources, flagging any mismatch as tampering. Output: court-defensible PDF + ZIP evidence bundles.

Stack · Hyperledger Fabric 2.x chaincode (JavaScript) · CouchDB rich queries · Private Data Collections · fabric-contract-api · fabric-gateway · Node.js + Express · S3 Object Lock · MongoDB · Stripe · pdfkit

Domain · Permissioned blockchain · Document integrity · Legal compliance · Multi-tenant SaaS · PKI signatures

Status · Chaincode in this repo · Backend / SPAs / Fabric ops scripts private


Hero

flowchart LR
    User([Tenant User])
    BE[Off-chain backend<br/>Express + Mongo]
    S3[(S3 Object Lock<br/>WORM bucket)]
    Mongo[(MongoDB<br/>operational state)]
    Fabric[(Hyperledger Fabric<br/>permissioned ledger<br/>+ PDC for PII)]

    User -- "upload doc" --> BE
    BE -- "PUT (immutable)" --> S3
    BE -- "metadata" --> Mongo
    BE -- "recordDocumentHash<br/>composite key DOC_<tenant>_<doc>" --> Fabric

    User -- "verify doc" --> BE
    BE -- "GET + re-hash" --> S3
    BE -- "lookup record" --> Mongo
    BE -- "queryDocumentHash" --> Fabric
    BE -- "all 3 hashes match?" --> User

At a glance

  • Problem · Legal and compliance teams need document storage where neither admins nor cloud providers can silently alter files or audit history, and where signed documents will survive court scrutiny. Centralized append-only logs aren't sufficient because they require trusting the operator.
  • Approach · Three independent sources of truth: AWS S3 Object Lock (file content, WORM), MongoDB (operational metadata), Hyperledger Fabric (cryptographic proofs in a permissioned ledger with multi-org consensus). Verification re-hashes the file and cross-checks all three, so tampering with any single source is detectable.
  • Stack · Hyperledger Fabric 2.x chaincode in JavaScript on fabric-contract-api, CouchDB state DB for rich queries, Private Data Collections for PII. Off-chain: Node.js + Express, MongoDB/Mongoose, Socket.IO, Stripe billing, multer uploads, pdfkit + archiver for evidence bundles.
  • Status · Chaincode + Private Data Collections config in this repo. Backend services, two React SPAs (admin + tenant), Fabric ops scripts, and TSA integration are private.

Why this exists

Standard document stores (S3, GCS, even append-only Postgres) share one failure mode: a privileged operator can rewrite history. For most use cases this doesn't matter. For legal evidence, signed contracts, regulatory submissions, court-ordered preservation, and compliance audit trails, it matters a lot: the chain of custody must be defensible without trusting the operator.

ChainProof inverts the trust model. The off-chain backend handles all the business logic (upload, sign, transfer, version, billing), but the cryptographic proofs of every operation are written to a permissioned Hyperledger Fabric ledger jointly operated by multiple Fabric organizations. No single operator can rewrite history without colluding with peers in another org, and even then the block hash chain held by the remaining peers would expose the rewrite.

Three architectural primitives work together:

  1. S3 Object Lock in compliance mode makes file content immutable for a retention period: no user, including the AWS account root user, can overwrite or delete the locked object version until the retention date passes.
  2. Hyperledger Fabric records the SHA-256 hash, signer cert hashes, transfer events, and permission changes. The block log is append-only by Fabric's design, and the chaincode keeps existing records from being overwritten (existence checks before putState).
  3. MongoDB holds operational state (UI lookups, search, plan limits, billing). It is treated as untrusted; the "primary" sources are S3 + Fabric.

A document is "valid" only if all three sources agree on its hash. Tampering with any single source trips verification.
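
The three-way agreement rule can be sketched as a small pure function. This is an illustration only, not the private backend's code: `classifyVerification`, the `hash` field name, and the choice to treat the Fabric record as the reference value are assumptions. The status strings are the failure modes this README lists for verifyDocument().

```javascript
const crypto = require('node:crypto');

// Hash raw bytes the same way the upload path is described: SHA-256, hex encoded.
function sha256Hex(bytes) {
  return crypto.createHash('sha256').update(bytes).digest('hex');
}

// Hypothetical helper: compares the three sources and names the one that disagrees.
// s3Bytes      - object body fetched from S3
// mongoRecord  - operational record, or null when missing (assumed shape: { hash })
// fabricRecord - ledger record, or null when missing (assumed shape: { hash })
function classifyVerification(s3Bytes, mongoRecord, fabricRecord) {
  if (!mongoRecord) return { status: 'mongoRecordMissing' };
  if (!fabricRecord) return { status: 'fabricRecordMissing' };

  // Assumption: the ledger hash is the reference the other two are checked against.
  const s3Hash = sha256Hex(s3Bytes);
  if (s3Hash !== fabricRecord.hash) {
    return { status: 's3HashMismatch', s3Hash, fabricHash: fabricRecord.hash };
  }
  if (mongoRecord.hash !== fabricRecord.hash) {
    return { status: 'fabricHashMismatch', mongoHash: mongoRecord.hash, fabricHash: fabricRecord.hash };
  }
  return { status: 'valid', hash: s3Hash };
}
```

Keeping the comparison free of I/O means the same function can be exercised with fixtures, while the S3, MongoDB, and Fabric lookups stay in the calling code.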


How it works

flowchart TD
    A[User uploads file via tenant SPA]
    B[Backend: SHA-256 hash file]
    C[Backend: PUT to S3 with Object Lock retention]
    D[Backend: store metadata in MongoDB]
    E[Backend: chaincode recordDocumentHash<br/>key: DOC_tenant_doc]
    F[Fabric peers endorse, order, commit]
    G[Backend: emit Socket.IO blockchainConfirmed]
    H[User: see green checkmark]

    I[Later: User clicks Verify]
    J[Backend: GET object from S3, re-hash]
    K[Backend: lookup MongoDB record]
    L[Backend: queryDocumentHash from Fabric]
    M{3 hashes match?}
    N[Verified: green]
    O[Tampered: red, show which source disagrees]

    A --> B --> C --> D --> E --> F --> G --> H
    I --> J & K & L --> M
    M -->|yes| N
    M -->|no| O
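
The first two upload steps (hash the file, then write it with Object Lock retention) might look roughly like the sketch below. The bucket name, key layout, retention length, and the `buildLockedUpload` helper are all hypothetical. `ObjectLockMode` and `ObjectLockRetainUntilDate` are the S3 PutObject parameters for per-object retention; the AWS SDK itself is not imported here, so the function only builds the request input.

```javascript
const crypto = require('node:crypto');

const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: computes the document hash and the S3 PutObject input
// for one upload. Nothing is sent to AWS in this sketch.
function buildLockedUpload(tenantId, documentId, fileBytes, retentionDays, now = new Date()) {
  const sha256 = crypto.createHash('sha256').update(fileBytes).digest('hex');
  const retainUntil = new Date(now.getTime() + retentionDays * DAY_MS);

  return {
    sha256, // the value that would later be recorded on the ledger
    putObjectInput: {
      Bucket: 'chainproof-worm-example',       // assumed bucket name
      Key: `${tenantId}/${documentId}`,        // assumed key layout
      Body: fileBytes,
      ObjectLockMode: 'COMPLIANCE',            // compliance-mode retention
      ObjectLockRetainUntilDate: retainUntil,  // lock expires at this instant
    },
  };
}
```

With the v3 SDK, `putObjectInput` would typically be wrapped in a `PutObjectCommand` from `@aws-sdk/client-s3` and sent to a bucket that has Object Lock enabled.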

Three high-leverage details:

  • Composite key schema enforces immutability without a global lock. Every record on Fabric has a key like DOC_{tenantId}_{documentId}. Before putState, the chaincode checks that getState(key) is empty; if it is not, the chaincode throws and the transaction is rejected. A "second write" to the same document is therefore impossible at the chaincode level. See techStack.md §1.
  • Private Data Collections protect PII while keeping the proof public. Signature material (signer email, public key PEM, signature hex) is sensitive because it identifies a real human signing a document. PDCs store this on a subset of peers (per a collection policy), while the hash of that data is recorded on the channel ledger. Verifiers can prove a signature exists without revealing who signed. See techStack.md §6.
  • Triple-source verification surfaces tamper at the source. The off-chain verifyDocument() function returns one of: valid, s3HashMismatch, mongoRecordMissing, fabricRecordMissing, fabricHashMismatch. The user sees the specific failure mode, not just "invalid". See deepDive.md.
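
The existence check described in the first bullet can be sketched as follows. This is not the code in contracts/chainproof.js: the record shape and the `makeFakeContext` test double are assumptions, and in a real contract the function would be a method on a class extending `Contract` from fabric-contract-api. Only `getState`, `putState`, and `getTxID` from the chaincode stub are used.

```javascript
// Write-once record: refuse to overwrite a key that already holds a value.
async function recordDocumentHash(ctx, tenantId, documentId, sha256) {
  const key = `DOC_${tenantId}_${documentId}`;

  // getState resolves to an empty byte array when the key has never been written.
  const existing = await ctx.stub.getState(key);
  if (existing && existing.length > 0) {
    // Throwing fails the proposal, so no write reaches the ledger.
    throw new Error(`Document ${key} is already recorded`);
  }

  // Assumed record shape, for illustration only.
  const record = { tenantId, documentId, hash: sha256, txId: ctx.stub.getTxID() };
  await ctx.stub.putState(key, Buffer.from(JSON.stringify(record)));
  return key;
}

// In-memory stand-in for ctx.stub so the sketch runs without a Fabric network.
function makeFakeContext() {
  const state = new Map();
  return {
    stub: {
      getState: async (key) => state.get(key) || Buffer.alloc(0),
      putState: async (key, value) => { state.set(key, value); },
      getTxID: () => 'tx-example',
    },
  };
}
```

Calling `recordDocumentHash` twice with the same tenant and document IDs against one context rejects the second call. Because the key is built by plain string concatenation, IDs that contain underscores could collide; Fabric's `createCompositeKey` is one way to avoid that if it matters.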

Standout engineering

  • Composite-key immutability at the chaincode level: DOC_/SIG_/MSIG_/XFER_/PERMS_ prefixes with an existence check before write. See contracts/chainproof.js.
  • CouchDB rich queries via getQueryResult for tenant-scoped lookups (queryDocumentsByTenant, queryDocumentSignatures). Fabric supports this when CouchDB is the state database, which shrinks the off-chain indexer surface.
  • Fabric history via the built-in getHistoryForKey: every key's full transaction history is exposed as an iterator, with no separate indexer needed.
  • Private Data Collections with an OR('Org1MSP.member', 'Org2MSP.member') policy; see contracts/collectionsConfig.json. Signer PII never hits the public block.
  • Transient data pattern for PDC writes: PII passes through ctx.stub.getTransient().get('privateData') instead of regular function args, so it never enters the transaction body or block payload.
  • Append-only permissions log: PERMS_{tenantId}_{Date.now()} keys mean every permission change has a unique key, building an immutable change log without overwriting prior records.
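
A minimal sketch of the transient-data pattern, under stated assumptions: the collection name `signatureCollection`, the `SIG_{tenantId}_{documentId}_{signatureId}` key layout, and the public record shape are hypothetical (the README only gives the `SIG_` prefix). `getTransient`, `putPrivateData`, `getState`, and `putState` are chaincode stub methods.

```javascript
const crypto = require('node:crypto');

// Hypothetical collection name; the real one is defined in the collections config.
const SIGNATURE_COLLECTION = 'signatureCollection';

async function recordSignature(ctx, tenantId, documentId, signatureId) {
  // PII arrives in the transient map, not in the function arguments.
  const raw = ctx.stub.getTransient().get('privateData');
  if (!raw || raw.length === 0) {
    throw new Error('privateData is missing from the transient map');
  }
  const privateBytes = Buffer.from(raw);
  JSON.parse(privateBytes.toString('utf8')); // fail early on malformed JSON

  const key = `SIG_${tenantId}_${documentId}_${signatureId}`; // assumed key layout
  const existing = await ctx.stub.getState(key);
  if (existing && existing.length > 0) {
    throw new Error(`Signature ${key} is already recorded`);
  }

  // Private payload goes to the collection; only a digest goes to the channel ledger.
  await ctx.stub.putPrivateData(SIGNATURE_COLLECTION, key, privateBytes);
  const publicRecord = {
    documentId,
    payloadHash: crypto.createHash('sha256').update(privateBytes).digest('hex'),
  };
  await ctx.stub.putState(key, Buffer.from(JSON.stringify(publicRecord)));
  return publicRecord;
}
```

On the client side, the matching call would place the serialized signer details in the proposal's transient data under the `privateData` key rather than passing them as arguments.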

Repository layout

contracts/
├── chainproof.js              # The full Hyperledger Fabric chaincode
└── collectionsConfig.json    # Private Data Collections policy (signature + transfer PDCs)

docs/
└── flows/
    ├── verification.md        # Triple-source verify sequence
    └── signature.md           # PKI signing + PDC write sequence

architecture.md                # System topology + components + Fabric topology
techStack.md                  # Deep What/Why/When/How/Alternatives report
deepDive.md                   # The triple-source verification story

A note on source code

The chaincode in contracts/chainproof.js is Hyperledger Fabric's equivalent of a "smart contract": it is deployed to a permissioned network and is readable by all member organizations. Identifiers have been renamed; logic is unchanged.

The off-chain backend (Node.js + Express, MongoDB/Mongoose, S3 Object Lock client, fabric-gateway client, PKI service, multer uploads, pdfkit evidence-report generator, archiver ZIP bundler, Stripe webhooks, two React/Vite SPAs) is private. The architecture and decisions are documented here, and I'm happy to walk through the source in interviews.


Built by Hafiz Abdullah · hafiz.abdullah641@gmail.com · Open to interviews under NDA.
