sambhavdwivediofficial/Sophon-AI

SOPHON — Neural Language Model

SOPHON is an experimental transformer-based neural language model developed for autoregressive text modeling research.

This repository contains the core model architecture, training pipeline, configuration system, and utilities required for large-scale language modeling experimentation.


Overview

SOPHON is designed with a minimal and modular structure:

  • Custom transformer architecture
  • Config-driven model parameters
  • Scalable training pipeline
  • JSONL dataset ingestion
  • Modular utilities and training scripts

The system is intended for research, experimentation, and architectural development in deep learning.
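
As an illustration of what "config-driven model parameters" can look like, here is a minimal sketch using a frozen dataclass. Every field name and default value below is a hypothetical placeholder; the actual contents of src/config.py are not documented here and may differ.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    # Hypothetical fields and defaults, NOT taken from src/config.py.
    vocab_size: int = 32000
    d_model: int = 512
    n_layers: int = 8
    n_heads: int = 8
    max_seq_len: int = 1024

    def __post_init__(self) -> None:
        # Multi-head attention splits d_model evenly across heads.
        if self.d_model % self.n_heads != 0:
            raise ValueError("d_model must be divisible by n_heads")

    @property
    def head_dim(self) -> int:
        return self.d_model // self.n_heads


def config_from_json(text: str) -> ModelConfig:
    """Build a config from a JSON object, overriding only the keys it contains."""
    return ModelConfig(**json.loads(text))
```

Keeping all hyperparameters in one validated object means an experiment can be changed by editing a single JSON snippet, for example `config_from_json('{"d_model": 256, "n_heads": 4}')`, rather than touching model code.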


Project Structure

data/           → Training datasets (JSONL format)
src/
    ├── model.py      → Core model architecture
    ├── train.py      → Training script
    ├── config.py     → Configuration system
    ├── utils.py      → Utility functions
    └── chat.py       → Inference / interaction script
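
Since the datasets in data/ are stored as JSONL (one JSON object per line), a reader along the following lines could stream them without loading the whole file into memory. This is only a sketch: the function name `iter_jsonl` and the `"text"` field name are assumptions, and the repository's own loader may use a different schema.

```python
import json
from typing import Iterator


def iter_jsonl(path: str, text_key: str = "text") -> Iterator[str]:
    """Yield the text field of each record in a JSONL file.

    The "text" key is an assumed schema, not one confirmed by the repository.
    """
    with open(path, "r", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: invalid JSON") from exc
            # Skip records that are not objects or lack the text field.
            if isinstance(record, dict) and text_key in record:
                yield record[text_key]
```

Because the function is a generator, it can be consumed lazily, e.g. `for text in iter_jsonl("data/sample.jsonl"): ...`, where `data/sample.jsonl` is a hypothetical file name.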

Status

This project is an early-stage experimental model architecture.
Work on training and scaling is ongoing.


Usage

Training

Start model training from the project root directory:

python -m src.train
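
For readers new to autoregressive training, the sketch below shows the usual way a token sequence is turned into (input, target) pairs, where each target is the input shifted one position to the left. It is a generic illustration in plain Python; the helper `next_token_pairs` is hypothetical and is not claimed to match what src/train.py does.

```python
def next_token_pairs(
    token_ids: list[int], block_size: int
) -> list[tuple[list[int], list[int]]]:
    """Split a token stream into (inputs, targets) blocks for next-token prediction.

    targets[i] is always the token that follows inputs[i] in the stream.
    """
    pairs = []
    # Stop at len - 1 so every input token has a following target token.
    for start in range(0, len(token_ids) - 1, block_size):
        targets = token_ids[start + 1 : start + block_size + 1]
        inputs = token_ids[start : start + len(targets)]
        pairs.append((inputs, targets))
    return pairs


pairs = next_token_pairs([1, 2, 3, 4, 5, 6], block_size=4)
# pairs == [([1, 2, 3, 4], [2, 3, 4, 5]), ([5], [6])]
```

The final block may be shorter than `block_size`; a real pipeline would typically pad or drop it.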

Chat / Inference

Launch interactive chat mode:

python -m src.chat

Requirements

  • Python 3.10+
  • PyTorch
  • Standard scientific Python stack
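
A quick way to confirm the first two requirements before training is a small check script such as the sketch below. The function name `check_environment` is hypothetical, and the script only verifies that `torch` can be found; it makes no assumption about which PyTorch version the project needs.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 10)


def check_environment(required_modules=("torch",)) -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"Python 3.10+ required, found {found}")
    for name in required_modules:
        # find_spec locates a top-level module without importing it.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing module: {name}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```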

License

This project is licensed under the MIT License — see the LICENSE file for details.

Author

Sambhav Dwivedi
Website: sambhavdwivedi.in


© 2026 Sambhav Dwivedi

About

Sophon-AI is a from-scratch transformer-based language model implementation focused on custom training pipelines, configurable architectures, and open-source experimentation.