debjitmitra000/DOCER

DOCER: AI-Powered Document Formatting Platform

DOCER is a full-stack MERN application for creating, formatting, and exporting professional documents with AI assistance.

Users can turn unstructured text into formatted documents, build custom layouts from scratch, or start from pre-defined templates for rapid creation.


Table of Contents

  1. Core Features
  2. Tech Stack
  3. System Architecture & Workflows
  4. Key Packages & Design Choices
  5. Project Structure
  6. Data Models & ER Diagram
  7. API Endpoints
  8. Security Measures
  9. Local Setup & Installation
  10. Future Work

1. Core Features

DOCER provides three distinct modes for document creation, catering to different user needs.

✨ Pre-built Templates

Users can select from a library of standard, professionally designed templates like Resumes, Business Letters, and Reports.

  • Guided Forms: Each template presents a structured form, making it easy to fill in the required information.
  • Live Preview: The Preview.jsx component renders a real-time preview of the document as the user types into the form fields in Editor.jsx.
  • Consistent Exports: The structure and alignment defined in the template are consistently applied to both PDF and DOCX exports.

🎨 Custom Template Builder

A flexible, canvas-like editor for users who need complete creative control over their document layout.

  • Drag & Drop Interface: Built with dnd-kit, the DragDropBuilder.jsx component allows users to add, reorder, and position sections freely.
  • Component-Based Elements: Users can add and style individual textBlocks and shapes (rectangles, circles) within each section.
  • Live Canvas: The CustomPreview.jsx component renders a live, interactive canvas where elements can be moved, resized, and styled.
  • Auto-Save to localStorage: Protects user work by automatically saving drafts to the browser, preventing data loss on refresh.
  • Persistent Storage: Finalized templates are saved to MongoDB, storing the exact positions, sizes, and styles of all elements.
  • Collaborative Locking: When several users can edit the same template, an automatic lock lets only one of them edit at a time, which prevents conflicting concurrent edits.

🤖 AI Document Builder

Leverages Google's Gemini AI to automatically format unstructured text into professional documents.

  • Text/File Input: Users can paste raw text or upload a file (.txt, .md) via the UI in Upload.jsx.
  • Intelligent Parsing: The backend service (aiDocumentService.js) sends the content to the Gemini API with a detailed system prompt, instructing it to identify the document type, extract key information, and return a structured JSON layout.
  • Automated Formatting: The AI applies professional layout rules, such as centering resume headers, right-aligning letter closings, and using consistent typography.
  • Refinement & Export: The generated document can be fine-tuned in the AIDocumentEditor.jsx and then exported to PDF or DOCX.

2. Tech Stack

| Category | Technology / Package |
| --- | --- |
| Frontend | React 18, Vite, Tailwind CSS, React Router, Axios, dnd-kit |
| Backend | Node.js, Express, MongoDB, Mongoose |
| AI & Export | @google/generative-ai (Gemini 1.5 Flash), pdf-lib, docx |
| Security | Helmet, CORS, jsonwebtoken, bcryptjs, zod |
| DevOps | Multer (file uploads), Morgan (HTTP logging), dotenv (environment variables) |

3. System Architecture & Workflows

High-Level Architecture

DOCER follows a standard MERN stack architecture with two independent applications: a React client and a Node.js/Express server.

graph TD
    subgraph "Client (Browser)"
        A[React UI Components]
        B[Axios API Client]
        C[React Router]
        D[localStorage State]
    end

    subgraph "Server Node.js"
        E[Express Server]
        F[API Routes]
        G[Auth Middleware]
        H[Services]
        I[Mongoose Models]
    end

    subgraph "External Services"
        J[MongoDB Atlas]
        K[Google Gemini AI]
    end

    A -->|Renders| C
    A -->|API Calls| B
    B -->|HTTP Requests| E
    E -->|Routes to| F
    F -->|Protected by| G
    G -->|Passes to| F
    F -->|Uses| H
    H -->|CRUD| I
    H -->|AI Calls| K
    I -->|Persists| J
    D -->|Auto-save| A

    style A fill:#61DAFB,stroke:#333,stroke-width:2px
    style E fill:#8CC84B,stroke:#333,stroke-width:2px
    style J fill:#4DB33D,stroke:#333,stroke-width:2px
    style K fill:#4285F4,stroke:#333,stroke-width:2px

Authentication Flow (JWT)

User authentication is handled via JSON Web Tokens (JWT). The server issues a token at login, and the client sends it with every subsequent request so that protected operations are tied to the authenticated user.

sequenceDiagram
    participant User
    participant Client
    participant Server
    participant Database

    User->>Client: Enters credentials (email, password)
    Client->>Server: POST /api/auth/login
    Server->>Database: Query user by email
    Database-->>Server: Returns user document
    Server->>Server: Validates password with bcrypt compare
    alt Credentials Valid
        Server->>Server: Generates JWT with user ID payload
        Server-->>Client: Returns { token, user }
        Client->>Client: Stores token in localStorage
        Note over Client: Axios interceptor now attaches token<br/>to all subsequent API requests
    else Credentials Invalid
        Server-->>Client: Returns 401 Unauthorized error
    end
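The server side of this flow can be sketched as a small handler. This is a minimal illustration, not the project's actual route: createLoginHandler and its three injected functions are hypothetical names, chosen so the logic can be read without a database. In a real server they would most likely wrap a Mongoose query, bcryptjs's compare, and jsonwebtoken's sign.

```javascript
// Hypothetical factory: dependencies are injected so the flow is easy to follow.
// Assumed real-world counterparts: a User lookup, bcryptjs compare, jsonwebtoken sign.
function createLoginHandler({ findUserByEmail, comparePassword, signToken }) {
  return async function login(email, password) {
    const user = await findUserByEmail(email)
    // Unknown email and wrong password get the same response, so the API
    // does not reveal which accounts exist.
    if (!user || !(await comparePassword(password, user.password))) {
      return { status: 401, body: { error: "Unauthorized" } }
    }
    // The payload carries only the user ID, matching payload._id in auth.js.
    const token = signToken({ _id: user._id })
    return {
      status: 200,
      body: { token, user: { _id: user._id, email: user.email, name: user.name } },
    }
  }
}
```

Returning a plain { status, body } object keeps the sketch framework-agnostic; an Express route would pass these values to res.status(...).json(...).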

AI Document Generation Flow

This workflow outlines how unstructured text is transformed into a formatted document using the Gemini AI service.

sequenceDiagram
    participant User
    participant Client
    participant Server
    participant GeminiAI
    participant Database

    User->>Client: Pastes text or uploads file
    Client->>Server: POST /api/ai-document/generate with text
    Server->>Server: Cleans and prepares text
    Server->>GeminiAI: Sends text with detailed system prompt
    GeminiAI-->>Server: Returns structured JSON (sections, styles, etc.)
    Server->>Server: Validates and enhances AI response
    Server->>Database: Saves new AIDocument to MongoDB
    Database-->>Server: Returns saved document object
    Server-->>Client: Returns the generated document object
    Client->>User: Renders document in AIDocumentEditor

Custom Template Builder Workflow

sequenceDiagram
    participant User
    participant Client
    participant localStorage
    participant Server
    participant Database

    User->>Client: Opens/creates custom template
    loop Every change
        Client->>Client: Update component state
        Client->>localStorage: Auto-save draft (debounced)
    end
    
    Note over User,localStorage: User makes edits, positions elements
    
    User->>Client: Clicks "Save Template"
    Client->>Server: PUT /api/custom-templates/:id with final state
    Server->>Database: Updates template document
    Database-->>Server: Confirmation
    Server-->>Client: Returns updated template
    Client->>User: Shows success notification

4. Key Packages & Design Choices

PDF Generation (pdf-lib): Chosen over headless browser solutions like Puppeteer to avoid heavy binary dependencies, making the application more portable and faster to deploy. It provides fine-grained control for rendering elements at exact coordinates, ensuring high fidelity between the preview and the final PDF.
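Rendering at exact coordinates involves one conversion that is easy to get wrong: a browser preview places elements from the top-left corner with y growing downward, while PDF pages place them from the bottom-left corner with y growing upward. The sketch below shows that conversion under two assumptions: the preview measures in CSS pixels (96 per inch), and the box shape { left, top, width, height } is a hypothetical stand-in for whatever the stored element looks like.

```javascript
// Assumption: preview units are CSS pixels (96 per inch); PDF units are points (72 per inch).
const PX_TO_PT = 72 / 96

// Convert a preview box (top-left origin, y grows downward) into PDF
// coordinates (bottom-left origin, y grows upward).
function toPdfRect(box, pageHeightPt) {
  const width = box.width * PX_TO_PT
  const height = box.height * PX_TO_PT
  const x = box.left * PX_TO_PT
  // Measure from the bottom of the page to the bottom edge of the box.
  const y = pageHeightPt - box.top * PX_TO_PT - height
  return { x, y, width, height }
}

// Example on a US Letter page (792 pt tall).
const rect = toPdfRect({ left: 96, top: 96, width: 192, height: 48 }, 792)
console.log(rect) // { x: 72, y: 684, width: 144, height: 36 }
```

The resulting object has the shape that pdf-lib's page.drawRectangle accepts for its x, y, width, and height options.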

DOCX Generation (docx): Selected for its lightweight nature and robust support for creating standard Word documents. It prioritizes semantic structure (paragraphs, headings, alignment) over absolute positioning, which is not well-supported in the DOCX format.

Custom Builder State (localStorage): The DragDropBuilder component uses localStorage for real-time auto-saving. This is a deliberate choice to protect users from losing work due to accidental refreshes, without spamming the database with every minor change. The final state is committed to MongoDB only when the user explicitly clicks "Save".
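A debounced auto-save of this kind might look like the following sketch. The helper names, the storage key, and the 500 ms delay are assumptions made for illustration; `storage` is any object with getItem and setItem, which in the browser would be window.localStorage.

```javascript
// Hypothetical helper: collapses bursts of edits into a single write.
function createDraftAutosaver(storage, key, delayMs = 500) {
  let timer = null
  let pending = null

  // Write the most recent state immediately and cancel any scheduled write.
  function flush() {
    if (timer !== null) {
      clearTimeout(timer)
      timer = null
    }
    if (pending !== null) {
      storage.setItem(key, JSON.stringify(pending))
      pending = null
    }
  }

  // Remember the latest state and restart the delay.
  function save(state) {
    pending = state
    if (timer !== null) clearTimeout(timer)
    timer = setTimeout(flush, delayMs)
  }

  return { save, flush }
}

// Restore a draft after a refresh; returns null if nothing usable is stored.
function loadDraft(storage, key) {
  try {
    const raw = storage.getItem(key)
    return raw ? JSON.parse(raw) : null
  } catch {
    return null
  }
}
```

Calling flush() before the explicit "Save" request is one way to make sure the draft in the browser and the state sent to the server agree.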

AI Service (@google/generative-ai): Utilizes the gemini-1.5-flash model for its balance of speed, cost, and capability. A comprehensive system prompt in aiDocumentService.js guides the model to produce reliable, structured JSON, which is then validated and cleaned by the server to ensure data integrity.
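Server-side validation matters because a language model can wrap its JSON in a Markdown code fence or omit expected fields. The sketch below illustrates one defensive parsing step. It is a hypothetical helper, not the project's own validation code; the field names documentType, sections, and styles are taken from the AIDocument model, and the fallback values are assumptions.

```javascript
// Three backticks, built at runtime so this example contains no literal fence.
const FENCE = "`".repeat(3)

// Return the text between the first and last code fence, or the whole text if unfenced.
function stripCodeFence(text) {
  const start = text.indexOf(FENCE)
  if (start === -1) return text.trim()
  const bodyStart = text.indexOf("\n", start)
  const end = text.lastIndexOf(FENCE)
  if (bodyStart === -1 || end <= start) return text.trim()
  return text.slice(bodyStart + 1, end).trim()
}

// Hypothetical parser: rejects unusable responses and fills in safe defaults.
function parseModelJson(rawText) {
  let doc
  try {
    doc = JSON.parse(stripCodeFence(rawText))
  } catch {
    throw new Error("Model response is not valid JSON")
  }
  if (!doc || !Array.isArray(doc.sections)) {
    throw new Error("Model response has no sections array")
  }
  return {
    documentType: typeof doc.documentType === "string" ? doc.documentType : "unknown",
    sections: doc.sections,
    styles: doc.styles && typeof doc.styles === "object" ? doc.styles : {},
  }
}
```

Throwing on malformed output lets the route return a clear error instead of saving a broken document.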

Drag & Drop (dnd-kit): Provides a lightweight, accessible drag-and-drop solution without heavy dependencies. Offers smooth interactions for reordering and positioning elements in the custom template builder.
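Reordering boils down to moving one array element to the position of another. The following sketch assumes each section has a unique id field, and that the ids of the dragged and target sections come from dnd-kit's drag-end event (event.active.id and event.over.id). The function names are hypothetical.

```javascript
// Return a copy of `items` with the element at index `from` moved to index `to`.
function moveItem(items, from, to) {
  const next = items.slice()
  const [moved] = next.splice(from, 1)
  next.splice(to, 0, moved)
  return next
}

// Hypothetical drag-end logic: move the dragged section to the drop target's slot.
function reorderSections(sections, activeId, overId) {
  const from = sections.findIndex((s) => s.id === activeId)
  const to = sections.findIndex((s) => s.id === overId)
  // Unknown ids or a drop onto itself leave the order unchanged.
  if (from === -1 || to === -1 || from === to) return sections
  return moveItem(sections, from, to)
}
```

The @dnd-kit/sortable package ships an arrayMove helper with the same behavior as moveItem; the hand-written version is shown here only to keep the example self-contained.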


5. Project Structure

The project is a monorepo with two independent applications: client and server.

/
├── client/
│   ├── public/
│   ├── src/
│   │   ├── api/
│   │   │   └── client.js              # Axios instance with JWT interceptor
│   │   ├── components/
│   │   │   ├── DragDropBuilder.jsx    # Core component for custom templates
│   │   │   ├── CustomPreview.jsx      # Renders free-form custom layouts
│   │   │   ├── Preview.jsx            # Renders pre-built template previews
│   │   │   ├── RichTextEditor.jsx     # Text styling component
│   │   │   └── ShapeToolbar.jsx       # Tools for adding shapes
│   │   ├── pages/
│   │   │   ├── Builder.jsx            # Custom template creation page
│   │   │   ├── Editor.jsx             # Pre-built template editor page
│   │   │   ├── Upload.jsx             # AI document creation page
│   │   │   ├── AIDocumentEditor.jsx   # Editor for AI-generated documents
│   │   │   ├── Dashboard.jsx          # Home/recent templates page
│   │   │   ├── MyTemplates.jsx        # User's custom templates
│   │   │   ├── MyDocuments.jsx        # User's saved documents
│   │   │   ├── Login.jsx              # Authentication page
│   │   │   └── Signup.jsx             # Registration page
│   │   ├── App.jsx                    # Main component with React Router
│   │   ├── main.jsx                   # Application entry point
│   │   └── index.css                  # Tailwind CSS + design tokens
│   ├── .env.example
│   └── package.json
│
└── server/
    ├── src/
    │   ├── models/
    │   │   ├── User.js                # User schema
    │   │   ├── Template.js            # Template schema
    │   │   ├── Document.js            # Document schema
    │   │   ├── AIDocument.js          # AI Document schema
    │   │   └── CustomTemplate.js      # Custom Template schema with locking
    │   ├── routes/
    │   │   ├── auth.js                # Authentication endpoints
    │   │   ├── templates.js           # Template CRUD endpoints
    │   │   ├── documents.js           # Document CRUD endpoints
    │   │   ├── aiDocument.js          # AI document endpoints
    │   │   ├── exports.js             # PDF/DOCX export endpoints
    │   │   ├── upload.js              # File upload endpoints
    │   │   └── enhance.js             # AI enhancement endpoints
    │   ├── services/
    │   │   ├── aiDocumentService.js   # Gemini API interaction logic
    │   │   ├── aiEnhanceService.js    # Field enhancement logic
    │   │   ├── exportService.js       # PDF and DOCX generation
    │   │   └── lockService.js         # Template locking for concurrent edits
    │   ├── utils/
    │   │   ├── auth.js                # JWT middleware (authRequired, authOptional)
    │   │   └── cloudinaryHelper.js    # Cloud storage utilities
    │   ├── seed/
    │   │   └── seedTemplates.js       # Seed prebuilt templates
    │   └── index.js                   # Express server entry point
    ├── .env.example
    └── package.json

6. Data Models & ER Diagram

Mongoose Schemas

User: Stores user credentials and basic information.

  • _id: ObjectId (unique identifier)
  • email: String (unique, required)
  • password: String (bcrypt hashed)
  • name: String (required)
  • createdAt: Date (auto-generated)

Template: Base template schema for pre-built and custom templates.

  • name: String (required)
  • type: String enum ("resume" | "letter" | "report" | "custom")
  • sections: Array of section objects with textBlocks and shapes
  • styles: Global design tokens (page settings, typography, colors)
  • createdBy: ObjectId ref to User (null for prebuilt)
  • createdAt: Date

Document: Saved user documents based on templates.

  • title: String
  • templateId: ObjectId ref to Template
  • content: Object (flexible structure for user-entered data)
  • stylesOverride: Optional per-section style overrides
  • userId: ObjectId ref to User
  • pdfUrl: String (URL to stored PDF)
  • createdAt: Date

AIDocument: Documents generated by Gemini AI.

  • name: String
  • documentType: String (auto-detected by AI)
  • sourceText: String (original input)
  • sections: Array of AI-generated section objects with positioning
  • styles: Global styles applied by AI
  • userId: ObjectId ref to User
  • createdAt: Date

CustomTemplate: Enhanced template with collaborative features.

  • name: String
  • type: String
  • createdBy: ObjectId ref to User
  • sharedWith: Array of { user: ObjectId, role: String }
  • isLocked: Boolean (indicates if template is being edited)
  • lockedBy: ObjectId ref to User (who holds the lock)
  • lockTimestamp: Date (when lock was acquired)
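The three lock fields are enough to express an acquire rule. The sketch below is one plausible reading, not the logic in lockService.js: it assumes a lock held by another user blocks editing until it is released or becomes stale, and the five-minute timeout is an illustrative value.

```javascript
// Assumption: a lock older than this is treated as abandoned.
const LOCK_TTL_MS = 5 * 60 * 1000

// Hypothetical rule: grant the lock unless another user holds a fresh one.
function tryAcquireLock(template, userId, now = new Date()) {
  const heldByOther =
    template.isLocked &&
    String(template.lockedBy) !== String(userId) &&
    now - template.lockTimestamp < LOCK_TTL_MS
  if (heldByOther) return { acquired: false, template }
  return {
    acquired: true,
    // Re-acquiring your own lock refreshes its timestamp.
    template: { ...template, isLocked: true, lockedBy: userId, lockTimestamp: now },
  }
}
```

This version operates on a plain object for clarity. Against a live database, the check and the update would need to happen in a single atomic operation (for example, a conditional findOneAndUpdate in Mongoose) so that two simultaneous requests cannot both succeed.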

SavedTemplate: User-saved versions of base templates.

  • name: String
  • baseTemplateId: ObjectId ref to Template
  • userId: ObjectId ref to User
  • sharedWith: Array of { user: ObjectId, role: String }

Entity Relationship Diagram (ER)

erDiagram
    User {
        ObjectId _id
        String email
        String name
        String passwordHash
        Date createdAt
    }

    Template {
        ObjectId _id
        String name
        String type
        ObjectId createdBy FK
        Array sections
        Object styles
        Date createdAt
    }

    Document {
        ObjectId _id
        String title
        ObjectId templateId FK
        ObjectId userId FK
        Object content
        Object stylesOverride
        String pdfUrl
        Date createdAt
    }

    AIDocument {
        ObjectId _id
        String name
        String documentType
        String sourceText
        Array sections
        Object styles
        ObjectId userId FK
        Date createdAt
    }

    CustomTemplate {
        ObjectId _id
        String name
        String type
        ObjectId createdBy FK
        Array sharedWith
        Boolean isLocked
        ObjectId lockedBy FK
        Date lockTimestamp
    }

    SavedTemplate {
        ObjectId _id
        String name
        ObjectId baseTemplateId FK
        ObjectId userId FK
        Array sharedWith
    }

    %% User 1-to-Many relationships
    User ||--o{ Document : "owns"
    User ||--o{ AIDocument : "owns"
    User ||--o{ Template : "creates"
    User ||--o{ CustomTemplate : "creates"
    User ||--o{ SavedTemplate : "owns"

    %% Template 1-to-Many relationships
    Template ||--o{ Document : "is_base_for"
    Template ||--o{ SavedTemplate : "is_base_for"

    %% User Many-to-Many relationships via sharedWith array
    User }o--o{ CustomTemplate : "is_shared_with"
    User }o--o{ SavedTemplate : "is_shared_with"

    %% Lock relationship
    User ||--o{ CustomTemplate : "locks"

7. API Endpoints

All routes are prefixed with /api. Routes marked with (auth) require a valid JWT.

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /auth/signup | Creates a new user and returns a token |
| POST | /auth/login | Authenticates a user and returns a token |
| GET | /templates | Fetches all available pre-built templates |
| GET | /custom-templates/mine (auth) | Fetches all custom templates owned by or shared with the user |
| POST | /custom-templates/create (auth) | Creates a new custom template |
| PUT | /custom-templates/:id (auth) | Updates a custom template |
| DELETE | /custom-templates/:id (auth) | Deletes a custom template |
| POST | /custom-templates/:id/lock (auth) | Acquires a lock on the template for editing |
| POST | /custom-templates/:id/unlock (auth) | Releases the lock on the template |
| GET | /documents/mine (auth) | Fetches all saved documents for the user |
| DELETE | /documents/:id (auth) | Deletes a saved document |
| POST | /ai-document/generate (auth) | Generates a new document from text using AI |
| POST | /ai-document/upload-file (auth) | Uploads a file and returns its text content for AI generation |
| GET | /ai-document/mine (auth) | Fetches all AI-generated documents for the user |
| GET | /ai-document/:id (auth) | Fetches a specific AI-generated document |
| PUT | /ai-document/:id (auth) | Updates the content or styles of an AI-generated document |
| POST | /ai-document/:id/save-as-document (auth) | Saves an AI document as a permanent document with a PDF |
| POST | /export/custom-pdf (auth) | Exports a custom template layout to a PDF file |
| POST | /export/docx (auth) | Exports a document to DOCX format |
| POST | /enhance/field (auth) | Uses AI to enhance the content of a specific text field |

8. Security Measures

Authentication & Authorization

JWT Authentication: API access is controlled via JWT. The authRequired middleware in server/src/utils/auth.js protects sensitive routes.

Authorization Scoping: Database queries for templates and documents are scoped to the authenticated user's ID (req.user._id), so users can only read their own data and, for custom templates, templates that have been shared with them.

Ownership Enforcement: All mutation operations (update, delete) verify that the requesting user is the owner before proceeding.
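Scoping and ownership checks can both be expressed as MongoDB query filters. The sketch below uses the createdBy and sharedWith fields from the CustomTemplate model; the helper names are hypothetical, and whether a shared role may also edit is not something this example tries to decide.

```javascript
// Read filter: templates the user created, or that list the user in sharedWith.
function visibleToUserFilter(userId) {
  return { $or: [{ createdBy: userId }, { "sharedWith.user": userId }] }
}

// Write filter: matches only when the requesting user is the creator,
// so an update or delete by anyone else finds no document.
function ownedByUserFilter(templateId, userId) {
  return { _id: templateId, createdBy: userId }
}
```

With Mongoose, such filters could be passed to calls like CustomTemplate.find(...) or CustomTemplate.findOneAndDelete(...). Putting the owner in the filter itself avoids a separate fetch-then-compare step.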

Authentication Middleware (auth.js)

The core of the API's security is handled by two middleware functions that inspect the Authorization header for a JWT.

  • authRequired: Used for routes that require a logged-in user. Extracts the token, verifies it, and fetches the user from the database. Returns 401 Unauthorized if the token is missing or invalid.

  • authOptional: Used for public routes that can provide an enhanced experience for logged-in users. Attempts to verify a token if one is present but does not throw an error if missing or invalid. Allows the route to serve both anonymous and authenticated users.

import jwt from "jsonwebtoken"
import User from "../models/User.js"

// Returns the token from an "Authorization: Bearer <token>" header, or null.
function extractToken(req) {
  const h = req.headers?.authorization || ""
  if (!h.startsWith("Bearer ")) return null
  return h.slice("Bearer ".length)
}

// Attaches req.user when a valid token is present; never rejects the request.
export async function authOptional(req, _res, next) {
  try {
    const token = extractToken(req)
    if (!token) return next()
    const payload = jwt.verify(token, process.env.JWT_SECRET)
    if (payload?._id) {
      const user = await User.findById(payload._id).select("_id email name")
      if (user) req.user = user
    }
  } catch {
    // Ignore invalid tokens and lookup errors in optional mode.
  }
  next()
}

// Responds with 401 unless the token is valid and its user still exists.
export async function authRequired(req, res, next) {
  try {
    const token = extractToken(req)
    if (!token) return res.status(401).json({ error: "Unauthorized" })
    const payload = jwt.verify(token, process.env.JWT_SECRET)
    const user = await User.findById(payload._id).select("_id email name")
    if (!user) return res.status(401).json({ error: "Unauthorized" })
    req.user = user
    next()
  } catch {
    return res.status(401).json({ error: "Unauthorized" })
  }
}

Additional Security Measures

Password Hashing: User passwords are never stored in plaintext. They are hashed using bcryptjs before being saved to the database.

Secure Headers: Helmet is used to set various security-related HTTP headers to protect against common web vulnerabilities.

CORS: Cross-Origin Resource Sharing is configured to allow requests only from the frontend application's domain. Credentialed requests are opt-in.

File Upload Restrictions: multer is configured with strict file size limits and file type filters to prevent abuse.
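A restriction of this kind could be configured along the following lines. The accepted extensions mirror the .txt and .md types that the upload UI mentions; the 2 MB cap and all names are assumptions for illustration. The filter follows multer's fileFilter signature, where cb(null, true) accepts a file and cb(null, false) skips it.

```javascript
// Assumptions: extensions come from the upload UI description; the size cap is illustrative.
const ALLOWED_EXTENSIONS = [".txt", ".md"]
const MAX_UPLOAD_BYTES = 2 * 1024 * 1024

// Accept a file only if its original name ends with an allowed extension.
function textFileFilter(_req, file, cb) {
  const name = String(file.originalname || "").toLowerCase()
  const allowed = ALLOWED_EXTENSIONS.some((ext) => name.endsWith(ext))
  cb(null, allowed)
}

// Options object intended to be passed to multer(...).
const uploadOptions = {
  limits: { fileSize: MAX_UPLOAD_BYTES },
  fileFilter: textFileFilter,
}
```

An extension check is only a first line of defense, since file names are client-controlled; the server would still treat the uploaded content as untrusted text.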

Payload Validation: zod schema validation ensures that all incoming payloads conform to expected structures, preventing malformed content from being processed.


9. Local Setup & Installation

The project consists of two separate applications. You will need to run them in two separate terminal sessions.

Prerequisites

  • Node.js (v18 or later)
  • MongoDB (local instance or a cloud service like MongoDB Atlas)
  • Google Generative AI API key (for Gemini access)

Server Setup

  1. Navigate to the server directory:

    cd server
  2. Install dependencies:

    npm install
  3. Create a .env file and populate it based on .env.example:

    PORT=4000
    MONGODB_URI=your_mongodb_connection_string
    JWT_SECRET=your_strong_jwt_secret
    GOOGLE_GENERATIVE_AI_API_KEY=your_gemini_api_key
  4. (Optional) Seed the database with pre-built templates:

    npm run seed
  5. Start the server:

    npm start
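
A server started with an incomplete .env file may only fail later, for example on the first login or AI request. One optional safeguard is to check the variables at startup. The sketch below is not part of the project; loadConfig is a hypothetical helper that uses the variable names from the .env example above and treats PORT as optional with a default of 4000.

```javascript
// Variables the server cannot run without (names from the .env example).
const REQUIRED_VARS = ["MONGODB_URI", "JWT_SECRET", "GOOGLE_GENERATIVE_AI_API_KEY"]

// Hypothetical startup check: fail fast with a message naming what is missing.
function loadConfig(env) {
  const missing = REQUIRED_VARS.filter((name) => !env[name])
  if (missing.length > 0) {
    throw new Error("Missing environment variables: " + missing.join(", "))
  }
  const port = Number(env.PORT || 4000)
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error("PORT must be a positive integer")
  }
  return {
    port,
    mongoUri: env.MONGODB_URI,
    jwtSecret: env.JWT_SECRET,
    geminiApiKey: env.GOOGLE_GENERATIVE_AI_API_KEY,
  }
}
```

It would be called once with process.env after dotenv has loaded the file.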

Client Setup

  1. Navigate to the client directory:

    cd client
  2. Install dependencies:

    npm install
  3. Create a .env file and point it to your local server:

    VITE_API_URL=http://localhost:4000
  4. Start the client development server:

    npm run dev

The application should now be running at http://localhost:5173 (or another port specified by Vite).


10. Future Work

Advanced Font Management: Allow users to upload custom fonts and embed them in PDF exports for personalized typography.

Real-time Collaboration: Implement WebSocket-based features for multi-user editing on custom templates with live presence indicators.

Version History: Add a system to track and revert changes for both documents and templates, allowing users to restore previous versions.

Enhanced Rich Text: Support more complex inline styling (e.g., multiple colors, font sizes, and effects within a single text block).

Template Library Sharing: Enable users to publish and share templates with the community, with rating and download functionality.

Export Formats: Extend export capabilities to support additional formats like HTML, Markdown, and LaTeX.

Mobile Optimization: Enhance the UI and interactions for mobile devices with touch-friendly controls.


Feature → Package/Technology Mapping

This section provides a detailed breakdown of which technologies power each feature.

Core Backend & API

  • Server Framework: express (for routing, middleware, and handling HTTP requests)
  • Security: helmet (secures HTTP headers), cors (manages cross-origin requests)
  • Authentication: jsonwebtoken (JWT generation/validation), bcryptjs (password hashing), custom auth middleware
  • Database: mongoose (data modeling, validation, and communication with MongoDB)
  • File Handling: multer (for handling file uploads from the AI Builder)
  • Logging & Environment: morgan (HTTP request logging), dotenv (manages environment variables)

Core Frontend & UI

  • Framework: react (for building the user interface components)
  • Build Tool: vite (fast development server and production bundling)
  • Routing: react-router-dom (manages all client-side pages and navigation, including protected routes)
  • Styling: tailwindcss (utility-first CSS framework for all UI components and layouts)
  • HTTP Communication: axios (for all API requests, configured with an interceptor to automatically attach JWT tokens)
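
The interceptor mentioned in the last item could be as small as the function below. This is a sketch rather than the contents of client.js: the function name and the "token" storage key are assumptions, and `storage` stands for window.localStorage in the browser.

```javascript
// Hypothetical request hook: adds the saved JWT as a Bearer credential.
function attachAuthHeader(config, storage) {
  const token = storage.getItem("token")
  if (token) {
    config.headers = config.headers || {}
    config.headers.Authorization = `Bearer ${token}`
  }
  // Requests made before login pass through unchanged.
  return config
}

// Typical registration on an Axios instance (shown as a comment because
// this example does not import axios):
//   client.interceptors.request.use((config) => attachAuthHeader(config, localStorage))
```

The header format matches what extractToken in auth.js expects on the server.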

Feature 1: Pre-built Templates

Server:

  • routes/templates.js, routes/documents.js: API endpoints for fetching templates and creating/managing documents based on them
  • seed/seedTemplates.js: Script to populate the database with default templates (Resume, Letter, etc.)

Client:

  • pages/Dashboard.jsx, pages/Editor.jsx: UI for selecting templates and filling out the corresponding forms
  • components/Preview.jsx: Renders a live preview of the document as the user types

Feature 2: Custom Template Builder

Server:

  • routes/customTemplates.js: API endpoints for saving and retrieving custom-built template layouts
  • models/CustomTemplate.js: Mongoose schema stores sections with textBlocks and shapes, including their positions and styles
  • services/lockService.js: Manages template locking for concurrent edit prevention

Client:

  • pages/Builder.jsx: The main page for the custom builder experience
  • dnd-kit: Powers the drag-and-drop functionality for reordering sections
  • components/DragDropBuilder.jsx: Core component managing the canvas-like interactions
  • components/CustomPreview.jsx: Renders the free-form layout with absolute positioning
  • localStorage: Used for auto-saving the builder state to prevent data loss on refresh

Feature 3: AI Document Builder

Server:

  • services/aiDocumentService.js: The heart of the AI feature. Contains the logic for interacting with the Gemini API
  • @google/generative-ai: The official Google SDK used to call the gemini-1.5-flash model
  • Prompt Engineering: A detailed system prompt guides the AI to produce structured, well-formatted JSON output based on unstructured text
  • Post-processing: Functions like validateAndEnhanceDocument and recalculatePositions clean, validate, and refine the AI's output
  • routes/aiDocument.js, routes/upload.js: API endpoints to handle text/file submission and manage the lifecycle of AI-generated documents
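
To give a feel for what a position pass over AI output involves, here is a deliberately simple sketch that stacks sections top to bottom. It is not the project's recalculatePositions function, whose rules are not shown in this document; the function name, the y and height fields, and the margin values are all illustrative assumptions.

```javascript
// Hypothetical layout pass: place each section below the previous one with a fixed gap.
function stackSections(sections, { top = 40, gap = 16 } = {}) {
  let y = top
  return sections.map((section) => {
    // Copy the section so the input array is left untouched.
    const placed = { ...section, y }
    y += section.height + gap
    return placed
  })
}

const laidOut = stackSections([{ height: 100 }, { height: 50 }, { height: 30 }])
console.log(laidOut.map((s) => s.y)) // [ 40, 156, 222 ]
```

Recomputing positions on the server means the stored layout no longer depends on the model returning consistent coordinates.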

Client:

  • pages/Upload.jsx: UI for pasting text or uploading a file to be processed by the AI
  • pages/AIDocumentEditor.jsx: A dedicated editor for viewing, modifying, and exporting the document generated by the AI

Feature 4: Document Exports

Server:

  • routes/exports.js: Contains the API endpoints for PDF and DOCX generation
  • pdf-lib: Used to generate PDF files by programmatically placing text and shapes at specific coordinates, ensuring high fidelity with the on-screen preview
  • docx: Used to generate DOCX files by creating structured paragraphs with appropriate alignment (left, center, right), prioritizing readability over exact positioning

Contributing

Contributions are welcome! Please ensure that:

  1. All new features include corresponding unit tests
  2. Security measures are reviewed for new endpoints
  3. Database schema changes are backward compatible
  4. Frontend components follow the established Tailwind CSS patterns

License

This project is licensed under the MIT License. See the LICENSE file for details.
