Canvas to Local Storage Sync

Sync content from Canvas LMS to local storage with incremental updates.

This project pulls course content (assignments, pages, files, discussions, and optional JSON reports) from Canvas and stores it in a local folder structure. It is local-storage-only and optimized to skip unchanged resources on repeat runs.

Features

Interactive course selection (all, specific numbers, or last selection)
Incremental sync with timestamp-based change detection
Assignment export to Markdown (including rubric/details)
Page export to Markdown (including page body)
Discussion export to Markdown plus optional course-level discussion JSON
Linked file discovery from assignments/pages/discussions (/files/{id} links)
PDF handling:
- Saves original PDF files
- Extracts PDF content to *_pdf.md using opendataloader_pdf
Optional course reports (JSON): announcements, quizzes, enrollments, calendar events, groups, analytics, gradebook history, submissions summary
Optional global inbox conversations export (Conversations/conversations.json)
Endpoint auto-disable for unavailable Canvas APIs (HTTP 403/404), persisted to config
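
The linked-file discovery listed above can be pictured as a scan of each HTML body for /files/{id} links. The sketch below is illustrative only: the pattern and the function name find_linked_file_ids are assumptions, not the project's actual code.

```python
import re

# Hypothetical pattern: matches "/files/<digits>" anywhere in an HTML body,
# e.g. "/courses/1/files/42/download" or a bare "/files/42".
FILE_LINK_RE = re.compile(r"/files/(\d+)")


def find_linked_file_ids(html: str) -> list[int]:
    """Return Canvas file IDs referenced in html, first-seen order, no duplicates."""
    seen: set[int] = set()
    ordered: list[int] = []
    for match in FILE_LINK_RE.finditer(html):
        file_id = int(match.group(1))
        if file_id not in seen:  # the same file may be linked several times
            seen.add(file_id)
            ordered.append(file_id)
    return ordered
```

Keying on the numeric file ID rather than the full URL means the same file linked through different URL shapes is only collected once.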

Requirements

Python 3.10+
Java 11+ (required at runtime for PDF extraction flow)
Canvas API token with access to the courses you want to sync

If Java is missing, the app exits before sync starts.
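
A startup check of this kind might look like the sketch below. It only tests whether a java executable is on PATH and does not verify the version; the function names are hypothetical and the real check may differ.

```python
import shutil
import sys


def java_available(executable: str = "java") -> bool:
    """Return True if the given executable can be found on PATH."""
    return shutil.which(executable) is not None


def require_java() -> None:
    """Exit with a message before any sync work starts if Java is missing."""
    if not java_available():
        sys.exit("Java 11+ is required for PDF extraction but was not found on PATH.")
```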

Installation

Create/activate a virtual environment (recommended)
Install dependencies:

pip install -r requirements.txt

Create your config file from the example and update values:

copy config.ini.example config.ini      (Windows)
cp config.ini.example config.ini        (macOS/Linux)

Configuration

Edit config.ini. The sections and keys are described below.

`[CANVAS]`

API_URL: Canvas base URL (example: https://yourschool.instructure.com)
API_KEY: Canvas API token

`[STORAGE]`

STORAGE_TYPE: must be local in this project
LOCAL_ROOT_DIR: root directory for synced output (example: ./canvas_sync)
FORCE_REGENERATE_ASSIGNMENTS: true/false; when true, assignment Markdown is regenerated even if unchanged

`[LAST_SELECTION]`

COURSE_IDS: comma-separated course IDs; managed automatically by the app

`[PERFORMANCE]` (optional)

REQUEST_TIMEOUT (default 20)
MAX_RETRIES (default 3)
BACKOFF_FACTOR (default 0.5)
CANVAS_PER_PAGE (default 100)
HTTP_POOL_MAXSIZE (default 20)
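
Because the [PERFORMANCE] section is optional, each key needs a fallback. The sketch below shows one way to read these values with the standard configparser module, using the defaults listed above. The function name load_performance and the choice of float for REQUEST_TIMEOUT are assumptions.

```python
import configparser


def load_performance(config: configparser.ConfigParser) -> dict:
    """Read [PERFORMANCE] values, falling back to the documented defaults.

    The fallback applies when either the key or the whole section is missing.
    """
    section = "PERFORMANCE"
    return {
        "REQUEST_TIMEOUT": config.getfloat(section, "REQUEST_TIMEOUT", fallback=20.0),
        "MAX_RETRIES": config.getint(section, "MAX_RETRIES", fallback=3),
        "BACKOFF_FACTOR": config.getfloat(section, "BACKOFF_FACTOR", fallback=0.5),
        "CANVAS_PER_PAGE": config.getint(section, "CANVAS_PER_PAGE", fallback=100),
        "HTTP_POOL_MAXSIZE": config.getint(section, "HTTP_POOL_MAXSIZE", fallback=20),
    }
```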

`[EXPORTS]`

Toggle optional exports with true/false:

EXPORT_ANNOUNCEMENTS (default true)
EXPORT_DISCUSSIONS (default true)
EXPORT_QUIZZES (default true)
EXPORT_ENROLLMENTS (default true)
EXPORT_CALENDAR_EVENTS (default true)
EXPORT_GROUPS (default true)
EXPORT_ANALYTICS_ACTIVITY (default true)
EXPORT_GRADEBOOK_HISTORY (default true)
EXPORT_SUBMISSIONS_SUMMARY (default false)
EXPORT_INBOX_CONVERSATIONS (default false)

If the quizzes, analytics, or gradebook endpoints return 403 or 404, the corresponding export may be disabled automatically, and the change is written back to config.ini.
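
As a rough illustration of how such an auto-disable could be persisted, the sketch below flips one [EXPORTS] key to false when a 403 or 404 is seen. The function names are hypothetical and this is not the project's implementation. Note that configparser drops comments when it rewrites a file.

```python
import configparser
from pathlib import Path


def disable_export(config_path: Path, option: str) -> None:
    """Set option to false under [EXPORTS] and write the file back."""
    config = configparser.ConfigParser()
    config.optionxform = str  # keep upper-case key names instead of lower-casing them
    config.read(config_path, encoding="utf-8")
    if not config.has_section("EXPORTS"):
        config.add_section("EXPORTS")
    config.set("EXPORTS", option, "false")
    with config_path.open("w", encoding="utf-8") as handle:
        config.write(handle)


def record_endpoint_status(config_path: Path, option: str, status_code: int) -> bool:
    """Disable the export if the endpoint looks unavailable; return True if disabled."""
    if status_code in (403, 404):
        disable_export(config_path, option)
        return True
    return False
```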

Usage

Run:

python main.py

You will be prompted to choose courses:

Enter numbers like 1,3,5
Enter all
Enter last to reuse previous selection
Enter quit to exit
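
The four kinds of input above could be handled roughly as follows. This is a sketch under assumptions: the function name, its parameters, and the choice to return course IDs are illustrative, and the real prompt may validate input differently.

```python
from typing import Optional


def parse_selection(raw: str, course_ids: list[int], last_ids: list[int]) -> Optional[list[int]]:
    """Map user input to course IDs; return None when the user wants to quit.

    course_ids is the list shown to the user, in display order (1-based numbering).
    last_ids is the previously saved selection (COURSE_IDS in config.ini).
    """
    text = raw.strip().lower()
    if text == "quit":
        return None
    if text == "all":
        return list(course_ids)
    if text == "last":
        return list(last_ids)
    selected: list[int] = []
    for part in text.split(","):
        number = int(part.strip())  # raises ValueError for non-numeric input
        if not 1 <= number <= len(course_ids):
            raise ValueError(f"Course number out of range: {number}")
        course_id = course_ids[number - 1]
        if course_id not in selected:  # ignore repeated numbers
            selected.append(course_id)
    return selected
```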

At the end of the run, the script prints a summary and waits for Enter before exiting.

Output Structure

Under LOCAL_ROOT_DIR, each course gets its own folder. Typical layout:

canvas_sync/
  Course Name/
    Assignments/
      Assignment A/
        Assignment A.md
        linked_file.ext
    Discussions/
      Topic Title/
        Topic Title.md
        linked_file.ext
    Reports/
      announcements.json
      discussion_topics.json
      quizzes.json
      enrollments.json
      calendar_events.json
      groups.json
      analytics_activity.json
      gradebook_history.json
      submissions_summary.json
    Page Title/
      Page Title.md
      linked_file.ext
    SomeFile.pdf
    SomeFile_pdf.md
  Conversations/
    conversations.json

Exact files depend on what exists in Canvas and which exports are enabled.

Incremental Sync Behavior

The sync is designed to avoid unnecessary writes/downloads:

Existing local metadata is checked before saving resources
Change detection is primarily timestamp-driven: the Canvas updated_at value is compared with the local file's modification time (mtime)
Linked files discovered multiple times in one run are deduplicated by Canvas file ID
If a PDF is unchanged but its extracted *_pdf.md is missing, extraction is attempted

Run the tool twice in a row to verify unchanged resources are skipped.
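
The timestamp comparison and the missing-extraction case described above can be sketched as below. The helper names are hypothetical, and the sketch assumes updated_at is an ISO 8601 string such as 2024-01-15T10:30:00Z; the actual logic may include further checks.

```python
from datetime import datetime, timezone
from pathlib import Path


def needs_update(local_path: Path, updated_at: str) -> bool:
    """Return True if the local copy is missing or older than the Canvas timestamp."""
    if not local_path.exists():
        return True
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    remote = datetime.fromisoformat(updated_at.replace("Z", "+00:00"))
    if remote.tzinfo is None:
        remote = remote.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    local = datetime.fromtimestamp(local_path.stat().st_mtime, tz=timezone.utc)
    return remote > local


def needs_extraction(pdf_path: Path) -> bool:
    """Return True if SomeFile.pdf has no sibling SomeFile_pdf.md yet."""
    return not pdf_path.with_name(pdf_path.stem + "_pdf.md").exists()
```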

Troubleshooting

401 Unauthorized
- Verify API_KEY and API_URL in config.ini
No courses listed
- Check token permissions and whether courses are date-restricted
403/404 on optional reports
- Some institutions disable specific endpoints; auto-disable may be applied for that export
Java check failure at startup
- Install Java 11+ and ensure it is on PATH
Slow sync
- Adjust [PERFORMANCE] values (CANVAS_PER_PAGE, HTTP_POOL_MAXSIZE, retries/timeouts)

Notes

Storage backends other than local filesystem are not supported in this codebase.
A temporary download folder is used during sync and cleaned up at the end.
The script starts a background hybrid server process for PDF extraction and stops it on completion.
