digwis/shidian

📜 Shidian / 识典古籍

A desktop reader for ancient Chinese texts — original + AI translation, side by side.

English · 中文


🌐 English

Shidian is an open-source, cross-platform desktop application for reading and studying ancient Chinese books (古籍). It pairs the original text with modern Chinese translations, lets you build a local offline library, runs OCR on scanned pages to produce clean digital text, and exports your collection to EPUB, Markdown, or PDF.

The backend fetches catalogue and chapter data from shidianguji.com and caches it locally, so anything already cached stays readable when the upstream site is unreachable.
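
The cache-then-fallback behaviour described above can be sketched as follows. This is a minimal in-memory illustration, not Shidian's actual implementation (the real cache is described as file-based, with optional PostgreSQL); `TtlCache`, `readThrough`, and `fetchUpstream` are hypothetical names introduced only for this example.

```typescript
// Hypothetical sketch: none of these names come from the Shidian codebase.
interface CacheEntry<T> {
  value: T;
  storedAt: number; // epoch milliseconds
}

type Lookup<T> =
  | { kind: "fresh"; value: T }
  | { kind: "stale"; value: T }
  | { kind: "miss" };

class TtlCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  set(key: string, value: T, now: number = Date.now()): void {
    this.entries.set(key, { value, storedAt: now });
  }

  // Report whether the entry is within its TTL, expired, or absent.
  lookup(key: string, now: number = Date.now()): Lookup<T> {
    const entry = this.entries.get(key);
    if (!entry) return { kind: "miss" };
    const age = now - entry.storedAt;
    return age <= this.ttlMs
      ? { kind: "fresh", value: entry.value }
      : { kind: "stale", value: entry.value };
  }
}

// Serve fresh entries directly; otherwise try upstream and fall back to a
// stale copy if the upstream request fails.
async function readThrough<T>(
  cache: TtlCache<T>,
  key: string,
  fetchUpstream: () => Promise<T>,
): Promise<T> {
  const hit = cache.lookup(key);
  if (hit.kind === "fresh") return hit.value;
  try {
    const value = await fetchUpstream();
    cache.set(key, value);
    return value;
  } catch (error) {
    if (hit.kind === "stale") return hit.value; // upstream down: reuse old data
    throw error; // nothing cached, so surface the failure
  }
}
```

In this sketch an entry older than the TTL triggers a refetch, and if that request fails the stale copy is returned instead of an error. Only a key that was never cached can fail outright.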

✨ Features

  • 📖 Side-by-side Reader — Original classical Chinese and modern translation on the same page. Toggle between 对照 / 原文 / AI译文 with a single click.
  • 🔤 OpenCC Script Conversion — Instantly switch between Simplified (cn) and Traditional (tw) Chinese.
  • 🤖 AI Translation — Built-in OpenAI / Gemini support for fresh translations of chapters that don't have a published modern version yet.
  • 🗃️ Local Library — Downloaded books live on disk. Search across the whole library, jump to the last reading position, and keep your progress per book.
  • 🖨️ OCR for Scanned Pages — PaddleOCR-powered pipeline (Python) turns scanned ancient-book page images into clean, paragraph-segmented text and EPUB.
  • 📚 Remote Library Sync — Caches categories, search results, and chapter content from the upstream catalogue so subsequent reads are instant.
  • 📤 Multi-format Export — Export any chapter or full book to Markdown, PDF, or EPUB from the UI.
  • 🔍 Full-text Search — Search inside downloaded books for characters, phrases, or sentences with hit highlighting.
  • 🈶 Rare Character Rendering — Custom font stack (Jigmo, Source Han Serif) with automatic glyph fallback so uncommon characters render correctly out of the box.
  • 🌓 Dark Mode — System-aware theme with manual override.
  • 🌍 Desktop-first — Wrapped as an Electron app, with the backend bundled in. One .dmg and you have a working reader; no extra services to run.
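
Hit highlighting in full-text search comes down to splitting a passage into matched and unmatched segments that the UI can style differently. The following is a minimal sketch, assuming plain substring matching; `splitByHits` and `Segment` are hypothetical names, and the real search may normalise script variants or punctuation before matching.

```typescript
// Hypothetical helper; not taken from the Shidian source.
interface Segment {
  text: string;
  hit: boolean; // true when this segment matched the query
}

// Split `text` into plain and matched segments for `query`.
// Matches are found left to right and do not overlap.
function splitByHits(text: string, query: string): Segment[] {
  if (query.length === 0) return text ? [{ text, hit: false }] : [];
  const segments: Segment[] = [];
  let cursor = 0;
  while (cursor < text.length) {
    const index = text.indexOf(query, cursor);
    if (index === -1) break;
    if (index > cursor) {
      segments.push({ text: text.slice(cursor, index), hit: false });
    }
    segments.push({ text: query, hit: true });
    cursor = index + query.length;
  }
  if (cursor < text.length) {
    segments.push({ text: text.slice(cursor), hit: false });
  }
  return segments;
}

// Example: highlight a phrase inside a line of classical text.
const parts = splitByHits("學而時習之,不亦說乎", "不亦");
console.log(parts.map((p) => (p.hit ? `[${p.text}]` : p.text)).join(""));
```

A renderer could map each segment with `hit: true` to a highlighted element. Because `indexOf` compares whole UTF-16 sequences, a well-formed query never matches in the middle of a surrogate pair, which matters for rare characters outside the Basic Multilingual Plane.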

🖼️ Screenshots

Screenshots coming soon — PRs welcome!

🛠️ Tech Stack

| Layer | Technology |
| --- | --- |
| Shell | Electron 33 · electron-vite · electron-builder |
| Frontend | React 19 · TypeScript · Vite 8 |
| Backend | Node.js · Express 5 · TypeScript · tsx |
| Scraping | Playwright (Chromium) · cheerio |
| Translation | OpenAI · Google Gemini |
| OCR / NLP | Python 3.11 · PaddleOCR · PaddlePaddle |
| Script conversion | opencc-js |
| Persistence | File-based persistent cache · PostgreSQL (optional) |
| Build | electron-vite · Vite · TypeScript · Vitest |

📦 Installation

Download the latest release for your platform from the Releases page.

| Platform | Format |
| --- | --- |
| macOS (Apple Silicon & Intel) | .dmg / .zip |
| Linux | .AppImage / .deb (planned) |
| Windows | .exe (planned) |

macOS builds are produced automatically by GitHub Actions; the first launch may require right-click → Open to bypass Gatekeeper.

🔨 Development

# 1. Clone
git clone https://github.com/digwis/shidian.git
cd shidian

# 2. Install JS dependencies (frontend + backend + electron)
npm install
(cd frontend && npm install)
(cd backend && npm install)

# 3. (Optional) Set up the Python OCR environment
#    Only required if you want to run the OCR → MD / EPUB pipeline.
cd backend
python3 -m venv venv311
source venv311/bin/activate
pip install paddleocr paddlepaddle

# 4. (Optional) Configure translation providers
#    Still inside backend/, so this writes backend/.env.
echo "OPENAI_API_KEY=sk-..." > .env    # ">" overwrites any existing .env
echo "GEMINI_API_KEY=..."    >> .env
cd ..

# 5. Run the web-only dev server (frontend + backend on different ports)
npm run dev
# frontend: http://127.0.0.1:5188
# backend : http://localhost:3001/api/health

# 6. Run as a desktop app (Electron + Vite + backend, hot reload)
npm run electron:dev

# 7. Package a distributable for the current platform
npm run electron:build:dir   # unpacked .app under dist-electron/
npm run electron:build       # distributable .dmg (ready for code signing)
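
Once `npm run dev` is running, the health endpoint listed in step 5 is a convenient readiness check. The sketch below polls it from a script. `healthUrl` and `waitForBackend` are hypothetical helpers, and because the response body of `/api/health` is not documented here, only the HTTP status is inspected. It assumes Node 18 or later for the global `fetch`.

```typescript
// Assumption: only the URL path and default port come from this README.
function healthUrl(port: string | undefined): string {
  const n = Number(port);
  const effective = Number.isInteger(n) && n > 0 && n < 65536 ? n : 3001;
  return `http://localhost:${effective}/api/health`;
}

// Poll the endpoint until it answers with a 2xx status or attempts run out.
async function waitForBackend(
  url: string,
  attempts: number = 10,
  delayMs: number = 500,
): Promise<boolean> {
  for (let i = 0; i < attempts; i += 1) {
    try {
      const response = await fetch(url);
      if (response.ok) return true;
    } catch {
      // Backend not listening yet; retry after a short delay.
    }
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false;
}
```

A script could call `waitForBackend(healthUrl(process.env.PORT))` before running anything that depends on the API, and exit with an error if it resolves to `false`.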

🧪 Tests

# Vitest unit tests (reader utils, character fallback, routing, etc.)
npm test

# Electron-specific Node tests
npm run test:electron

# Type-check both Electron and renderer projects
npm run typecheck

🗂️ Project Layout

shidian/
├── frontend/          # React 19 + Vite client
├── backend/           # Express API + scrapers + Python OCR scripts
├── shared/            # Code shared between renderer and backend
├── electron/          # Electron main + preload (legacy entry)
├── src/               # Electron-vite source (main / preload / renderer)
├── build/             # Icons & build resources
├── docs/              # Specs and design notes
├── electron.vite.config.js
├── electron-builder.yml
└── package.json       # Workspace-style root with dev / build scripts

⚙️ Environment Variables

| Variable | Default | Purpose |
| --- | --- | --- |
| PORT | 3001 | Backend HTTP port |
| DATABASE_URL | (none) | Optional Postgres URL for the persistent cache |
| PYTHON_BIN | python3 | Python interpreter used by OCR scripts |
| OPENAI_API_KEY | (none) | Enables OpenAI translation |
| OPENAI_MODEL / OPENAI_TRANSLATE_MODEL | gpt-5-mini | OpenAI model |
| GEMINI_API_KEY | (none) | Enables Gemini translation |
| GEMINI_TRANSLATE_MODEL | gemini-2.5-flash | Gemini model |
| REMOTE_LIBRARY_CACHE_TTL_MS | 24h | Cache lifetime for remote book data |
| REMOTE_LIBRARY_CATEGORIES_CACHE_TTL_MS | 365d | Cache lifetime for category tree |

See backend/index.ts for the full list.
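
To show how the variables and defaults above fit together, here is a hedged sketch of a config loader. It is not the code in backend/index.ts. `loadConfig`, `BackendConfig`, and `positiveInt` are hypothetical names, and three behaviours are assumptions: the `_MS` variables take integer milliseconds, `OPENAI_TRANSLATE_MODEL` takes precedence over `OPENAI_MODEL`, and OpenAI is preferred when both API keys are set.

```typescript
type Env = Record<string, string | undefined>;

interface BackendConfig {
  port: number;
  pythonBin: string;
  databaseUrl?: string;
  openaiModel: string;
  geminiModel: string;
  provider: "openai" | "gemini" | null;
  remoteCacheTtlMs: number;
  categoriesCacheTtlMs: number;
}

const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// Parse a positive integer, falling back when the value is unset or invalid.
function positiveInt(raw: string | undefined, fallback: number): number {
  const n = Number(raw);
  const usable = raw !== undefined && raw.trim() !== "" && Number.isInteger(n) && n > 0;
  return usable ? n : fallback;
}

function loadConfig(env: Env): BackendConfig {
  return {
    port: positiveInt(env.PORT, 3001),
    pythonBin: env.PYTHON_BIN || "python3",
    databaseUrl: env.DATABASE_URL || undefined,
    // Assumption: the translate-specific variable wins over OPENAI_MODEL.
    openaiModel: env.OPENAI_TRANSLATE_MODEL || env.OPENAI_MODEL || "gpt-5-mini",
    geminiModel: env.GEMINI_TRANSLATE_MODEL || "gemini-2.5-flash",
    // Assumption: prefer OpenAI when both keys are present.
    provider: env.OPENAI_API_KEY ? "openai" : env.GEMINI_API_KEY ? "gemini" : null,
    // Assumption: values are integer milliseconds (24h and 365d by default).
    remoteCacheTtlMs: positiveInt(env.REMOTE_LIBRARY_CACHE_TTL_MS, DAY_MS),
    categoriesCacheTtlMs: positiveInt(env.REMOTE_LIBRARY_CATEGORIES_CACHE_TTL_MS, 365 * DAY_MS),
  };
}
```

Passing the environment in as an argument, for example `loadConfig(process.env)`, keeps the function easy to exercise in unit tests without mutating global state.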

🤝 Contributing

  1. Fork the repo and create a feature branch: git checkout -b feature/my-change
  2. Run npm run lint && npm test and make sure both pass.
  3. Open a Pull Request that describes the change and includes screenshots and any new env vars.

📄 License

MIT — free and open source.


📜 中文

识典古籍 是一个开源的跨平台桌面应用,用来阅读和研究中文古籍。它把原文和现代白话译文并排展示,支持本地书库、扫描页 OCR、章节搜索,以及把整本书导出为 EPUB / Markdown / PDF。

后端默认从 shidianguji.com 抓取目录与正文,并把所有内容缓存到本地;在网络不可用时,已下载的内容依然可以正常阅读。

✨ 功能特性

  • 📖 对照阅读 — 原文与现代译文并排呈现,可一键切换「对照 / 原文 / AI译文」三种模式。
  • 🔤 简繁转换 — 基于 OpenCC,一键在简体与繁体之间切换。
  • 🤖 AI 翻译 — 内置 OpenAI / Gemini 接入,没有现成译文的章节也能即时生成现代白话。
  • 🗃️ 本地书库 — 下载的书籍保存在本地,支持整库搜索、断点续读、按书保存阅读进度。
  • 🖨️ 扫描页 OCR — 基于 PaddleOCR 的 Python 流水线,把扫描版古籍图片整理成结构清晰的文本 / Markdown / EPUB。
  • 📚 远端书库缓存 — 抓取的目录、搜索结果、章节内容都会缓存到本地,再次打开几乎是秒开。
  • 📤 多格式导出 — UI 内一键把任意章节或整本书导出为 Markdown、PDF 或 EPUB。
  • 🔍 全文检索 — 在本地书库内跨书搜索字、词、句,结果支持高亮跳转。
  • 🈶 生僻字渲染 — 内置 Jigmo、思源宋体等字体栈,生僻字自动按字符级回退,不会出现「豆腐块」。
  • 🌓 深色模式 — 跟随系统主题,可手动切换。
  • 🖥️ 桌面体验 — 整个后端被 Electron 一起打包进 .dmg,双击即可阅读,无需额外启动服务。

🖼️ 截图

截图待补充,欢迎在 PR 中附上!

🛠️ 技术栈

层级 技术
外壳 Electron 33 · electron-vite · electron-builder
前端 React 19 · TypeScript · Vite 8
后端 Node.js · Express 5 · TypeScript · tsx
抓取 Playwright(Chromium)· cheerio
翻译 OpenAI · Google Gemini
OCR / NLP Python 3.11 · PaddleOCR · PaddlePaddle
简繁转换 opencc-js
持久化 文件级缓存 · PostgreSQL(可选)
构建 electron-vite · Vite · TypeScript · Vitest

📦 安装

前往 Releases 页面下载对应平台的最新版本。

平台 格式
macOS(Apple Silicon 与 Intel) .dmg / .zip
Linux .AppImage / .deb(规划中)
Windows .exe(规划中)

macOS 安装包由 GitHub Actions 自动构建;首次启动如被 Gatekeeper 拦截,请在「系统设置 → 隐私与安全性」中点击「仍要打开」。

🔨 开发

# 1. 克隆仓库
git clone https://github.com/digwis/shidian.git
cd shidian

# 2. 安装前后端 / Electron 依赖
npm install
(cd frontend && npm install)
(cd backend && npm install)

# 3. (可选)准备 Python OCR 环境
#     只有在跑 OCR → MD / EPUB 流水线时才需要。
cd backend
python3 -m venv venv311
source venv311/bin/activate
pip install paddleocr paddlepaddle
cd ..

# 4. (可选)配置翻译服务
echo "OPENAI_API_KEY=sk-..." > backend/.env
echo "GEMINI_API_KEY=..."    >> backend/.env

# 5. 仅跑 Web 模式(前端 + 后端分别监听不同端口)
npm run dev
# 前端:http://127.0.0.1:5188
# 后端:http://localhost:3001/api/health

# 6. 以桌面应用方式启动(Electron + Vite + 后端,热更新)
npm run electron:dev

# 7. 打包当前平台
npm run electron:build:dir   # 生成未压缩的 .app
npm run electron:build       # 生成可分发的 .dmg

🧪 测试

# Vitest 单元测试(阅读器工具、生僻字回退、路由等)
npm test

# Electron 相关的 Node 测试
npm run test:electron

# 同时类型检查 Electron 与渲染进程工程
npm run typecheck

🗂️ 目录结构

shidian/
├── frontend/          # React 19 + Vite 客户端
├── backend/           # Express 接口 + 抓取 + Python OCR 脚本
├── shared/            # 渲染端与后端共享的代码
├── electron/          # Electron 主进程 / preload(旧入口)
├── src/               # electron-vite 源码(main / preload / renderer)
├── build/             # 图标与构建资源
├── docs/              # 设计与规格说明
├── electron.vite.config.js
├── electron-builder.yml
└── package.json       # 根工作区,统一管理 dev / build 脚本

⚙️ 环境变量

变量 默认值 用途
PORT 3001 后端 HTTP 端口
DATABASE_URL 可选:Postgres 持久化缓存地址
PYTHON_BIN python3 OCR 脚本使用的 Python 解释器
OPENAI_API_KEY 启用 OpenAI 翻译
OPENAI_MODEL / OPENAI_TRANSLATE_MODEL gpt-5-mini OpenAI 模型
GEMINI_API_KEY 启用 Gemini 翻译
GEMINI_TRANSLATE_MODEL gemini-2.5-flash Gemini 模型
REMOTE_LIBRARY_CACHE_TTL_MS 24h 远端书库缓存时长
REMOTE_LIBRARY_CATEGORIES_CACHE_TTL_MS 365d 分类树缓存时长

完整列表见 backend/index.ts

🤝 参与贡献

  1. Fork 仓库后新建分支:git checkout -b feature/my-change
  2. 跑一遍 npm run lint && npm test
  3. 提交 Pull Request,附上变更说明、截图与新增的环境变量。

📄 许可证

MIT — 免费开源。


Made with 📚 by digwis
