mambabhi/indic-learn

title: Indic-Learn App
emoji: 📚
colorFrom: purple
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: false

indic-learn is a modular AI-powered toolkit designed to support culturally grounded educational content for the Gurukula learning platform. Its first major module, quiz, is a fully automated pipeline for generating high-quality single and multiple-choice quizzes (SCQ and MCQ) from Indian story texts. Built for scale and precision, it leverages concurrent LLM agents (via Groq and Agno) to produce rich, classroom-ready assessments with minimal manual intervention.

Future modules coming soon include:

  • storyboard: frame-by-frame visual generation from story chapters for use in illustrated books and slide decks.
  • animation: culturally customized animated storytelling with rich textures and Indic design sensibilities.

📘 Features of the quiz module

  • Parallel SCQ and MCQ generation using dedicated Groq-hosted ChatGPT OSS agents via Agno.
  • Intelligent fallback strategy: if MCQ generation produces too few valid questions, SCQs are added to maintain the desired count.
  • Optional extra questions included for manual curation by Gurukula admins.
  • Automatic filtering of invalid or low-quality questions using format checks and semantic similarity deduplication.
  • Points and timers vary based on question difficulty and type.
  • All content written to a Google Sheet with per-chapter tabs for easy review and classroom use.
  • Correct answers are highlighted automatically via the Google Sheets API.

🛠 Setup

You'll need a .env file and an app_config.yaml file with the following settings (simplified for this README).

# app_config.yaml

# NEW: Default mode configuration (recommended)
source_documents:
  input_link: https://docs.google.com/document/d/YOUR_DOC_ID/edit
  output_link: https://docs.google.com/spreadsheets/d/YOUR_SHEET_ID/edit
  num_questions: 15

  batch:
    - input_link: https://docs.google.com/document/d/DOC_ID_1/edit
      output_link: https://docs.google.com/spreadsheets/d/SHEET_ID_1/edit
      num_questions: 15

# Legacy mode configuration (for backward compatibility)
spreadsheets:
  input_name: gurukula-story-master
  output_name: gurukula-quiz-master

documents:
  link: https://docs.google.com/document/d/YOUR_DOC_ID/edit

chapter_question_counts:
  chapter23: 2

# .env (on Hugging Face Spaces, set these environment variables as Space secrets)
SERVICE_ACCOUNT_FILE=path/to/google-service-key.json
GOOGLE_SCOPES=https://www.googleapis.com/auth/spreadsheets,https://www.googleapis.com/auth/drive,https://www.googleapis.com/auth/documents
GOOGLE_SERVICE_ACCOUNT_KEY_BASE64=placeholder_for_base64_encoded_key
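
As a rough illustration of how these variables might be consumed, here is a minimal sketch that uses only the standard library. It assumes that GOOGLE_SERVICE_ACCOUNT_KEY_BASE64 holds the base64-encoded JSON key and that SERVICE_ACCOUNT_FILE is the on-disk fallback. The function names `load_service_account_info` and `load_scopes` are hypothetical and are not taken from the repository.

```python
import base64
import json
import os


def load_service_account_info(env=os.environ):
    """Return the service-account credentials as a dict.

    Assumption: the base64 variable, when set, takes priority over the
    key file on disk. The real project may resolve these differently.
    """
    encoded = env.get("GOOGLE_SERVICE_ACCOUNT_KEY_BASE64")
    if encoded:
        return json.loads(base64.b64decode(encoded).decode("utf-8"))
    with open(env["SERVICE_ACCOUNT_FILE"], encoding="utf-8") as handle:
        return json.load(handle)


def load_scopes(env=os.environ):
    """Split the comma-separated GOOGLE_SCOPES value into a clean list."""
    raw = env.get("GOOGLE_SCOPES", "")
    return [scope.strip() for scope in raw.split(",") if scope.strip()]
```

Passing a plain dict instead of `os.environ` makes the helpers easy to exercise locally without touching real secrets.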

🚀 How to Generate Quizzes

Using the main script on the command line (Recommended)

For all invocations use: quiz/backend/gurukula_quizgen.py

Default Mode (Recommended) - PRIMARY

The default mode reads configuration directly from app_config.yaml, picking up inputs, outputs, and question counts automatically. This is the simplest and recommended way to test.

  1. Run with the default configuration (no arguments needed):
python -m quiz.backend.gurukula_quizgen
  2. Run with a custom question count:
python -m quiz.backend.gurukula_quizgen --num_questions 20
  3. Run batch mode (process multiple doc/sheet pairs):
python -m quiz.backend.gurukula_quizgen --batch

Configure in app_config.yaml:

source_documents:
  input_link: https://docs.google.com/document/d/DOC_ID/edit
  output_link: https://docs.google.com/spreadsheets/d/SPREADSHEET_ID/edit
  num_questions: 15

  batch:
    - input_link: https://docs.google.com/document/d/DOC_ID_1/edit
      output_link: https://docs.google.com/spreadsheets/d/SHEET_ID_1/edit
      num_questions: 15
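
The input_link and output_link values are full URLs, while Google API calls generally need the bare file ID. The sketch below shows one way to pull the ID out of such a link. The helper name `extract_google_id` is hypothetical, and the pattern is an assumption based on the link shapes shown in this README.

```python
import re

# Matches the ".../document/d/<ID>/..." and ".../spreadsheets/d/<ID>/..." shapes
# used by the links in app_config.yaml.
_ID_PATTERN = re.compile(r"/(document|spreadsheets)/d/([A-Za-z0-9_-]+)")


def extract_google_id(link):
    """Return (kind, file_id) for a Docs or Sheets URL, or raise ValueError."""
    match = _ID_PATTERN.search(link)
    if match is None:
        raise ValueError(f"Not a Google Docs/Sheets link: {link!r}")
    return match.group(1), match.group(2)


if __name__ == "__main__":
    print(extract_google_id("https://docs.google.com/document/d/DOC_ID_1/edit"))
```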

Legacy Modes (Backward Compatible)

  1. Run a single chapter from file:
python -m quiz.backend.gurukula_quizgen --mode file --chapter chapter23
  2. Run all chapters in batch from file:
python -m quiz.backend.gurukula_quizgen --mode file
  3. Run a single chapter from Google Sheets:
python -m quiz.backend.gurukula_quizgen --mode spreadsheet --chapter chapter23
  4. Run from Google Sheets (all tabs):
python -m quiz.backend.gurukula_quizgen --mode spreadsheet
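
For orientation, the flags used in the commands above could be declared roughly as follows. This is only a sketch built from the flags shown in this README (--mode, --chapter, --num_questions, --batch). It is not the script's actual parser; the defaults and help strings are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags documented in this README."""
    parser = argparse.ArgumentParser(prog="gurukula_quizgen")
    parser.add_argument(
        "--mode",
        choices=["file", "spreadsheet"],
        default=None,  # assumption: no --mode means the default (app_config.yaml) mode
        help="Legacy input source; omit to use the default mode.",
    )
    parser.add_argument("--chapter", default=None, help="Single chapter, e.g. chapter23.")
    parser.add_argument("--num_questions", type=int, default=None, help="Override the question count.")
    parser.add_argument("--batch", action="store_true", help="Process every doc/sheet pair under batch.")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args(["--mode", "file", "--chapter", "chapter23"]))
```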

📖 For detailed testing instructions, see: quiz/backend/TESTING_DEFAULT_MODE.md
📚 For the complete developer guide, see: quiz/backend/CLAUDE.md


Using the Hugging Face Spaces Web UI


🧾 Input Doc Format (gurukula-story-master)

The input can be any Google Doc with a proper chapter title. The title is used to name the tab created in the output quiz-master spreadsheet described below.


🧾 Input Sheet Format (gurukula-story-master)

Each chapter should be a separate tab. The layout within a tab should look like:

| A | B |
|---|---|
| NumQuestions | 15 |
| Content | Once upon a time in a village... (full story content goes here) |

📌 Chapters without valid content or missing tabs are skipped automatically.
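
The tab layout and the skip rule above can be sketched as a small parser. It assumes each tab arrives as a list of [key, value] rows read from columns A and B. The function name `parse_chapter_tab` and the returned keys are hypothetical.

```python
def parse_chapter_tab(rows):
    """Turn the A/B rows of one chapter tab into a dict, or None to skip it.

    Assumption: a tab is skipped when Content is empty or NumQuestions
    is not a positive integer.
    """
    fields = {
        str(row[0]).strip(): str(row[1])
        for row in rows
        if len(row) >= 2 and str(row[0]).strip()
    }
    content = fields.get("Content", "").strip()
    if not content:
        return None
    try:
        num_questions = int(fields.get("NumQuestions", ""))
    except ValueError:
        return None
    if num_questions <= 0:
        return None
    return {"num_questions": num_questions, "content": content}


if __name__ == "__main__":
    tab = [["NumQuestions", "15"], ["Content", "Once upon a time in a village..."]]
    print(parse_chapter_tab(tab))
```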


📤 Output Sheet Format (gurukula-quiz-master)

  • Each chapter gets a tab with the generated questions.
  • SCQs and MCQs are mixed and randomized.
  • Correct options are highlighted in green.
  • Currently, in GDoc mode the output spreadsheet is fixed (hardcoded).

🧪 Screenshots

✅ Story GDoc

Image
Figure 1: Input GDoc with story content

✅ Story Spreadsheet Tab (gurukula-story-master)

Image
Figure 2: Input spreadsheet with story content and number of questions.

✅ Generated Quiz Tab (gurukula-quiz-master)

Image
Figure 3: Generated quiz questions with highlights in output sheet.

✅ Hugging Face Spaces UI for Quiz Generation

Image
Figure 4: Hugging Face Spaces UI for interactive quiz generation and management.

✅ Final Quiz in Gurukula App

Image
Figure 5: Gurukula app displaying quiz in interactive mode.

📋 Features

⚡ Parallel Quiz Generation

  • Uses ChatGPT OSS model in concurrent threads:

    • One generates SCQs (Single Correct Questions)
    • Another generates MCQs (Multiple Correct Questions)
  • Reduces generation time and avoids rate limits

✅ Intelligent Question Filtering

  • Validates MCQs to ensure they contain multiple correct options
  • De-duplicates questions using text similarity checks
  • Adds extra SCQs for optionality if MCQ count is low
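
To make the de-duplication step concrete, here is a minimal sketch that uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure. The project's actual similarity check is not shown in this README, so the 0.85 threshold, the "question" key, and the `deduplicate` name are all assumptions.

```python
from difflib import SequenceMatcher


def _normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def deduplicate(questions, threshold=0.85):
    """Keep the first occurrence of each question; drop near-duplicates.

    threshold is a hypothetical cut-off on SequenceMatcher.ratio().
    """
    kept = []
    for candidate in questions:
        text = _normalize(candidate["question"])
        is_duplicate = any(
            SequenceMatcher(None, text, _normalize(existing["question"])).ratio() >= threshold
            for existing in kept
        )
        if not is_duplicate:
            kept.append(candidate)
    return kept
```

The pairwise comparison is quadratic in the number of questions, which is usually acceptable for quizzes of a few dozen items.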

🧠 Quiz Metadata & UX

  • Supports difficulty-based Points and Timer settings
  • Highlights correct options in green using Google Sheets formatting
  • Allows flexible quiz size per chapter (configurable)

📤 Google Doc & Sheets Integration

  • Reads chapters directly from Google Docs or Google Sheets tabs
  • Writes quizzes to a separate output Google Sheet
  • Enables easy use by Gurukula admins via docs & spreadsheets

🛠 Tech Stack

  • 🧠 ChatGPT OSS via Groq API (served with Markdown using Agno agent)
  • 📊 Google Docs & Sheets API (via gspread, googleapiclient)
  • 🐍 Python 3.11
  • ✅ Configurable with YAML-based settings

🧩 Key Functionality

1. Quiz Generation Pipeline

  • Parallel Generation: The ChatGPT OSS model (via the Groq API and an Agno agent) runs in two parallel threads: one for SCQs (single correct answer) and one for MCQs (multiple correct answers).
  • Prompt Engineering: Carefully crafted prompts instruct LLMs to generate questions in strict JSON format, with clear rules for options, correct answers, and structure.
  • Validation & Filtering: MCQs are validated to ensure multiple correct options. Questions are de-duplicated using text similarity checks.
  • Fallback Logic: If not enough valid MCQs are generated, extra SCQs are added to maintain the desired quiz size.
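
The parallel generation and fallback steps above can be sketched with `concurrent.futures`. The two generator functions are stubs standing in for the LLM agents, and `build_quiz`, `mcq_share`, and the question dict keys are hypothetical names chosen for illustration. The real pipeline may split and fill the quiz differently.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def generate_scqs(story, count):
    """Stub for the SCQ agent: returns `count` single-answer questions."""
    return [{"type": "SCQ", "question": f"SCQ {i + 1}", "answer": "a"} for i in range(count)]


def generate_mcqs(story, count):
    """Stub for the MCQ agent: every third item is deliberately invalid."""
    answers = ["ac", "b", "bd"]  # "b" has a single letter, so it fails MCQ validation
    return [{"type": "MCQ", "question": f"MCQ {i + 1}", "answer": answers[i % 3]} for i in range(count)]


def build_quiz(story, num_questions, mcq_share=0.5, seed=None):
    """Generate SCQs and MCQs concurrently, then backfill with SCQs."""
    mcq_target = int(num_questions * mcq_share)
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Over-generate SCQs so there are spares for the fallback.
        scq_future = pool.submit(generate_scqs, story, num_questions)
        mcq_future = pool.submit(generate_mcqs, story, mcq_target)
        scqs, mcqs = scq_future.result(), mcq_future.result()

    # Keep only MCQs with at least two distinct correct letters.
    valid_mcqs = [q for q in mcqs if len(set(q["answer"])) >= 2][:mcq_target]

    # Fallback: fill whatever the MCQs could not cover with SCQs.
    quiz = valid_mcqs + scqs[: num_questions - len(valid_mcqs)]
    random.Random(seed).shuffle(quiz)  # mix SCQs and MCQs
    return quiz
```

With the stubs above, a request for 10 questions yields 3 valid MCQs and 7 SCQs, so the quiz size is preserved even though two MCQs were rejected.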

2. Google Docs & Sheets Integration

  • Input: Reads story chapters from a Google Sheet (each chapter is a tab with NumQuestions and Content fields) OR from a Google Doc (each doc is a Chapter).
  • Output: Writes generated quizzes to a separate output Google Sheet, with each chapter as a tab. Correct answers are highlighted using conditional formatting.
  • Configurable: When run from the CLI, sheet names, question counts, and other settings are loaded from the sheets themselves in spreadsheet mode, or from the YAML and .env files in GDoc and file modes.
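
As an illustration of the highlighting step, the sketch below builds a Google Sheets API `batchUpdate` body that paints the correct option cells green. Note the substitution: it uses direct cell formatting through a `repeatCell` request, not a conditional-format rule, because that is the simplest form to show. The column layout (options starting in column B, questions starting on row 2) and the function names are assumptions.

```python
def highlight_request(sheet_id, row_index, column_index):
    """Build one repeatCell request that gives a single cell a green background."""
    return {
        "repeatCell": {
            "range": {
                "sheetId": sheet_id,
                "startRowIndex": row_index,        # zero-based, inclusive
                "endRowIndex": row_index + 1,      # exclusive
                "startColumnIndex": column_index,
                "endColumnIndex": column_index + 1,
            },
            "cell": {
                "userEnteredFormat": {
                    "backgroundColor": {"red": 0.8, "green": 1.0, "blue": 0.8}
                }
            },
            "fields": "userEnteredFormat.backgroundColor",
        }
    }


def highlight_correct_options(sheet_id, answers, first_option_column=1, first_row=1):
    """Return a batchUpdate body covering every correct letter of every question.

    answers is a list of answer keys such as ["a", "bd"], one per question row.
    first_option_column and first_row are hypothetical layout parameters.
    """
    requests = []
    for offset, answer in enumerate(answers):
        for letter in answer:
            column = first_option_column + "abcd".index(letter)
            requests.append(highlight_request(sheet_id, first_row + offset, column))
    return {"requests": requests}
```

The returned dict has the shape accepted as the `body` argument of `spreadsheets().batchUpdate(...)` in googleapiclient; authentication and the spreadsheet ID are left out of this sketch.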

3. Quiz Structure

  • SCQ: Single correct answer, labeled as a single letter (e.g., "a").
  • MCQ: Multiple correct answers, labeled as a string of letters (e.g., "ac"), with validation to ensure at least two correct options.
  • Question Format: Each question includes the text, options (labeled "a." to "d."), correct answer(s), points, and timer.
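
The answer-key rules above translate into a short validation check. This is a sketch of those rules only: the function name `is_valid_answer` and the "SCQ"/"MCQ" type strings are assumptions, and the real validator may enforce more.

```python
VALID_LETTERS = set("abcd")


def is_valid_answer(question_type, answer):
    """Check an answer key against the SCQ/MCQ rules described above."""
    letters = answer.strip().lower()
    if not letters or not set(letters) <= VALID_LETTERS:
        return False  # empty, or contains something other than a-d
    if len(set(letters)) != len(letters):
        return False  # repeated letter such as "aa"
    if question_type == "SCQ":
        return len(letters) == 1
    if question_type == "MCQ":
        return len(letters) >= 2
    return False


if __name__ == "__main__":
    for kind, key in [("SCQ", "a"), ("MCQ", "ac"), ("MCQ", "a"), ("MCQ", "aa")]:
        print(kind, key, is_valid_answer(kind, key))
```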

4. User Interfaces

  • CLI: Main script can be run for a single chapter or batch mode, from file or spreadsheet or gdoc.
  • Gradio UI: Web interface for admins to trigger quiz generation, select chapters, and view logs.

5. Extensibility

  • Pluggable Pipeline: Modular quiz generation logic allows for future expansion (e.g., storyboard, animation modules).
  • Logging: All steps are logged for debugging and traceability.

📝 Key Definitions

  • SCQ (Single Choice Question): Only one correct answer.
  • MCQ (Multiple Choice Question): Two or more correct answers. The answer key must be a string of unique letters (e.g., "bd").
  • QuizParser: Parses and normalizes the LLM output into the expected JSON structure.
  • Conditional Formatting: Correct options are highlighted in green in the output Google Sheet.
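
To show the kind of normalization a component like QuizParser performs, here is a minimal sketch that strips an optional code fence from the LLM reply and loads the JSON. It is not the project's QuizParser: the function name `parse_llm_questions` and the keys "questions", "question", "options", and "answer" are assumptions about the JSON shape.

```python
import json

FENCE = "`" * 3  # the three-backtick marker LLMs often wrap JSON in


def parse_llm_questions(raw):
    """Normalize raw LLM text into a list of question dicts."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (with its optional "json" tag)...
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # ...and everything from the closing fence onward.
        text = text.rsplit(FENCE, 1)[0]
    data = json.loads(text)
    items = data["questions"] if isinstance(data, dict) else data
    return [
        {
            "question": item["question"].strip(),
            "options": [option.strip() for option in item["options"]],
            # Lowercase, de-duplicate, and sort the letters: "CA" becomes "ac".
            "answer": "".join(sorted(set(item["answer"].lower()))),
        }
        for item in items
    ]
```

Malformed JSON raises `json.JSONDecodeError`, which a caller could catch in order to retry the generation or skip the chapter.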

📁 Project Structure

indic-learn/
├── backend/
│   ├── gurukula_quizgen.py              # Main CLI script for quiz generation
│   ├── indic_quiz_generator_pipeline.py # Quiz generation logic, prompt building, validation, parsing
│   ├── test_pipeline.py                 # Backend test script for pipeline
│   ├── tests/                           # (Optional) Test scripts
│   └── utils/
│       ├── gsheets.py                   # Google Sheets & Docs utilities
│       └── logging_utils.py             # Logging utilities
├── config/
│   ├── app_config.yaml                  # App configuration (sheet names, doc links, etc.)
│   └── ...                              # Other config files
├── data/                                # Text chapters for file-based mode
├── app.py                               # Gradio app for Hugging Face Spaces
├── test_app.py                          # Local/test Gradio app
├── requirements.txt                     # Main requirements for Hugging Face Spaces
├── requirements-local.txt               # (Optional) Local-only requirements
└── README.md                            # You're here

🤝 About

This project supports Indic education by making quiz creation for Indian stories effortless, scalable, and teacher-friendly.

Questions? Want to contribute? Reach out to @mambabhi on GitHub.


✅ Coming Soon

Quiz module

  • UI to configure and run quizzes without CLI
  • Analytics on question types per chapter
  • Export to CSV/HTML formats

Storyboard module

  • Stay tuned...

Animation module

  • Stay tuned...

✨ Built with ❤️ for Indic education.

© 2025 Gurukula. All rights reserved.
This project is maintained in partnership with the Gurukula learning platform.
