4064 commits
f7b186d
Apply suggestion from @cubic-dev-ai[bot]
pirate Dec 31, 2025
e26a0f6
Fix hook file overwrites in plugin directory (#1732)
pirate Dec 31, 2025
3ae9410
Update TODO_process_tracking.md
pirate Dec 31, 2025
4285a05
Fix getEnvArray to parse JSON when '[' present, CSV otherwise
github-actions[bot] Dec 31, 2025
dfe6841
Merge branch 'dev' into claude/refactor-process-management-WcQyZ
pirate Dec 31, 2025
ca3f8f0
Process.launch()/kill()/.pidfile/.wait()/etc. centralize process hand…
pirate Dec 31, 2025
84a4fb0
fix cubic comments
pirate Dec 31, 2025
cead22a
archivebox <modelname> create|list|update|delete | ... piping support…
pirate Dec 31, 2025
fd9ba86
Reduce Chrome-related code duplication across JS and Python
claude Dec 31, 2025
04c23ba
Fix output path structure for 0.9.x data directory
claude Dec 31, 2025
987f4fb
Review output file paths and data directory structure (#1736)
pirate Dec 31, 2025
4394ce5
Reduce code duplication between Chrome utilities (#1737)
pirate Dec 31, 2025
65b93d5
tweak comment
pirate Dec 31, 2025
29eb628
tweak comment
pirate Dec 31, 2025
65c8390
Consolidate Chrome test helpers across all plugin tests
claude Dec 31, 2025
ef92a99
Refactor test_chrome.py to use shared helpers
claude Dec 31, 2025
7d74dd9
Add Chrome CDP integration tests for singlefile
claude Dec 31, 2025
d72ab7c
Add simpler Chrome test helpers and update test files
claude Dec 31, 2025
b73199b
Refactor background hook cleanup to use graceful termination
claude Dec 31, 2025
adeffb4
Add JS-Python path delegation to reduce Chrome-related duplication
claude Dec 31, 2025
0f46d8a
Add real-world use cases to CLI pipeline plan
claude Dec 31, 2025
524e8e9
Capture exit codes and stderr from background hooks
claude Dec 31, 2025
1cfb77a
Rename Python helpers to match JS function names in snake_case
claude Dec 31, 2025
8dab296
Consolidate Chrome test helpers across all plugin tests (#1738)
pirate Dec 31, 2025
1c85b4d
Refine use cases: 8 examples with efficient patterns
claude Dec 31, 2025
1bbb9b4
Change hook timeout enforcement strategy (#1739)
pirate Dec 31, 2025
3d8c62f
fix extensions dir paths add personas migration
pirate Dec 31, 2025
1d15901
fix process health stats
pirate Dec 31, 2025
95d61b0
fix migrations
pirate Dec 31, 2025
28a4f99
Add real-world use cases to CLI pipeline plan (#1740)
pirate Dec 31, 2025
f3e11b6
Implement JSONL CLI pipeline architecture (Phases 1-4, 6)
claude Dec 31, 2025
2d3a2fe
Add terminate, kill_tree, and query methods to Process model
claude Dec 31, 2025
b822352
Delete pid_utils.py and migrate to Process model
claude Dec 31, 2025
672ccf9
Add pluginmap management command
claude Dec 31, 2025
bb52b59
Add unit tests for JSONL CLI pipeline commands (Phase 5 & 6)
claude Dec 31, 2025
575a595
Add unit tests for JSONL CLI pipeline commands (Phase 5 & 6) (#1743)
pirate Dec 31, 2025
7dd2d65
Add pluginmap management command (#1742)
pirate Dec 31, 2025
b87bbbb
Fix CLI tests to use subprocess and remove mocks
claude Dec 31, 2025
ee201a0
Fix code review issues in process management refactor
github-actions[bot] Dec 31, 2025
2e6dcb2
Improve admin snapshot list/grid views with better UX
claude Dec 31, 2025
5121b0e
Merge branch 'dev' into claude/refactor-process-management-WcQyZ
github-actions[bot] Dec 31, 2025
0cb5f07
Add comprehensive tests for machine/process models, orchestrator, and…
claude Dec 31, 2025
b2132d1
Fix cubic review issues: process_type detection, cmd storage, PID cle…
github-actions[bot] Dec 31, 2025
bdb3d94
Delete pid_utils.py and migrate to Process model (#1741)
pirate Dec 31, 2025
a063d8c
Merge remote-tracking branch 'origin/dev' into claude/analyze-test-co…
claude Dec 31, 2025
bbbfffd
Improve admin snapshot list/grid views with better UX (#1744)
pirate Dec 31, 2025
9bf7a52
Update tests for new Process model-based architecture
claude Dec 31, 2025
8a0acde
Add SSL, redirects, SEO plugin tests and fix fake test issues
claude Dec 31, 2025
13148fd
Add DNS traffic recorder plugin
claude Dec 31, 2025
f2c20f1
Refactor dns plugin to use chrome_utils.js
claude Dec 31, 2025
73425fa
Add persona CLI command with browser cookie import
claude Dec 31, 2025
5d8c93e
Consolidate CDP connection logic into chrome_utils.js
claude Dec 31, 2025
08383c4
Fix tautological assertion in SEO test
github-actions[bot] Dec 31, 2025
20690fa
Fix CLI tests to use subprocess and remove mocks (#1746)
pirate Dec 31, 2025
cd0394c
Add comprehensive tests for machine/process models, orchestrator, and…
pirate Dec 31, 2025
47d9874
Merge remote-tracking branch 'origin/dev' into claude/dns-traffic-rec…
claude Dec 31, 2025
cfa5edb
Add tests for accessibility, parse_dom_outlinks, and consolelog plugins
claude Dec 31, 2025
9703a8e
Add tests for responses, staticfile, and env provider plugins
claude Dec 31, 2025
263335d
Add tests for merkletree and custom binary provider plugins
claude Dec 31, 2025
3659ade
Fix path traversal vulnerabilities in persona management
github-actions[bot] Dec 31, 2025
2a68248
Update all Chrome plugins to use shared chrome_utils.js
claude Dec 31, 2025
edc83bf
Add persona CLI command with browser cookie import (#1747)
pirate Dec 31, 2025
4839293
Fix test assertions to fail properly and add NXDOMAIN deduplication
github-actions[bot] Dec 31, 2025
1f84d1b
Fix test assertions to fail when data is missing
github-actions[bot] Dec 31, 2025
60a4581
Add tests for accessibility, parse_dom_outlinks, and consolelog plugi…
pirate Dec 31, 2025
cb97f66
Add DNS traffic recorder plugin (#1748)
pirate Dec 31, 2025
d5c0c64
fix progress bars
pirate Dec 31, 2025
72f6a91
more progress bar and migrations fixes
pirate Dec 31, 2025
469932b
more
pirate Dec 31, 2025
73fde81
more migrations tweaks
pirate Dec 31, 2025
bd75718
keep stripping healthstats from iface and other things
pirate Dec 31, 2025
f12c3b4
less healthstats
pirate Dec 31, 2025
17029ba
Add thumbnail strip to live progress monitor
claude Dec 31, 2025
4c77949
Clean up on_Crawl hooks: remove duplicates and standardize naming
claude Dec 31, 2025
0930911
Clean up on_Crawl hooks and remove dead code (#1751)
pirate Dec 31, 2025
a04e4a7
cleanup migrations, json, jsonl
pirate Dec 31, 2025
8f51850
comments
pirate Dec 31, 2025
4d33084
Remove redundant chrome_validate hook, rename wget_validate to wget_i…
claude Dec 31, 2025
6521e7d
more migrations fixes
pirate Jan 1, 2026
1c7b0cb
working migrations again
pirate Jan 1, 2026
09a1ca3
Fix hook priority conflicts and standardize on_Binary naming
claude Jan 1, 2026
2e2bc31
Remove redundant chrome_validate hook, rename wget_validate to wget_i…
pirate Jan 1, 2026
b08f60a
Add thumbnail previews to live progress header (#1753)
pirate Jan 1, 2026
f7457b1
more migrations fixes attempts
pirate Jan 1, 2026
e903fa1
Fix: Make SingleFile use SINGLEFILE_CHROME_ARGS with fallback to CHRO…
pirate Jan 1, 2026
6fadcf5
remove model health stats from models that dont need it
pirate Jan 1, 2026
876feac
actually working migration path from 0.7.2 and 0.8.6 + renames and te…
pirate Jan 1, 2026
60422ad
fix orchestrator statemachine and Process from archiveresult migrations
pirate Jan 2, 2026
9008cef
codecov, migrations, orchestrator fixes
pirate Jan 2, 2026
c2afb40
fix lib bin dir and archivebox add hanging
pirate Jan 2, 2026
65ee09c
move tests into subfolder, add missing install hooks
pirate Jan 2, 2026
3672174
fix transition mid transition
pirate Jan 2, 2026
dd77511
unified Process source of truth and better screenshot tests
pirate Jan 2, 2026
3da523f
more consistent crawl, snapshot, hook cleanup and Process tracking
pirate Jan 2, 2026
5449971
better kill tree
pirate Jan 2, 2026
839ae74
simplify entrypoints for orchestrator and workers
pirate Jan 4, 2026
456aaee
more migration id/uuid and config propagation fixes
pirate Jan 5, 2026
7ceaeae
rename archive_org to archivedotorg, add BinaryWorker, fix config pas…
pirate Jan 5, 2026
b80e804
more binary fixes
pirate Jan 5, 2026
0a2ac11
more binary fixes
pirate Jan 5, 2026
352e1ba
remove debug lines
pirate Jan 5, 2026
28b980a
higher timeout
pirate Jan 5, 2026
c2bb4b2
Implement native LDAP authentication support
github-actions[bot] Jan 5, 2026
eaf7256
Implement native LDAP authentication (#1756)
pirate Jan 6, 2026
c7b2217
tons of fixes with codex
pirate Jan 19, 2026
1cb2d50
bump version
pirate Jan 19, 2026
b5bbc3b
better tui
pirate Jan 19, 2026
bef6776
working singlefile
pirate Jan 19, 2026
86e7973
cleanup tui, startup, card templtes, and more
pirate Jan 19, 2026
f3f55d3
perfect snapshot detail cards
pirate Jan 19, 2026
ec4b270
wip
pirate Jan 21, 2026
36008fd
FIX: docker build
pellaeon Jan 30, 2026
1ca5452
FIX: uuid_compat
pellaeon Jan 31, 2026
9aa4f0d
FIX: The docker entrypoint doesn't have --quick-init
pellaeon Jan 31, 2026
dcfad7d
FIX: docker build (#1760)
pirate Jan 31, 2026
0d05fd8
Tag current maintainer of AUR package
jasongodev Feb 8, 2026
17e26ae
Delete TEST_RESULTS.md
pirate Feb 10, 2026
a0be8fe
Tag current maintainer of AUR package (#1761)
pirate Feb 11, 2026
08b0dfa
Fix #1139: Return tags as a JSON list in Snapshot.to_dict() for LLM/R…
Feb 21, 2026
c1b3e73
Fix #1139: Feature Request: Add AI-assisted summarization, tagging, s…
pirate Feb 24, 2026
fdef1f9
Update README with venv activation command
pirate Mar 14, 2026
5e6ba0b
Update Dockerfile, docker-compose.yml, and README for v0.9.0 plugin s…
claude Mar 15, 2026
f3fcc15
Restore Homebrew and Debian package manager support
claude Mar 15, 2026
d841be1
Update Dockerfile
pirate Mar 15, 2026
5b8e562
Update docker-compose.yml
pirate Mar 15, 2026
2f200f6
Fix review feedback: restore archivebox.localhost subdomain routing, …
claude Mar 15, 2026
37b8a01
Fix Dockerfile: restore \ continuation and run archivebox version as …
claude Mar 15, 2026
e532ffe
Update documentation and dependencies for v0.9.0 release (#1770)
pirate Mar 15, 2026
c8f562e
Wire up GitHub Actions for deb/brew build, test, and release
claude Mar 15, 2026
fa11bee
CI: Full brew install + deb install tested on every push
claude Mar 15, 2026
4c113f8
Fix CI: create tests/out dir, fix archivebox add cmd, revert setup.sh
claude Mar 15, 2026
1609094
Update .github/workflows/homebrew.yml
pirate Mar 15, 2026
7c7a9ee
Fix PR review comments: service flags, DATA_DIR, version pinning, upg…
claude Mar 15, 2026
496b54a
Fix remaining PR review comments: release ordering, verification, README
claude Mar 15, 2026
4db4c36
Add arm64 to .deb test matrix using GitHub's arm64 runners
claude Mar 15, 2026
6e77d11
Restore cache-apt-pkgs-action for test job build dependencies
claude Mar 15, 2026
2845e43
Fix .deb download URL in README to include version component
claude Mar 15, 2026
68fea71
Address remaining PR review comments
claude Mar 15, 2026
0fac8a7
Fix remaining PR review comments from all review rounds
claude Mar 15, 2026
8293281
Simplify deb/brew packages to thin wrappers around pip install
claude Mar 15, 2026
3501b39
Deduplicate homebrew.yml release job by reusing build_brew.sh
claude Mar 15, 2026
7300892
Fix PR review comments: version check, runtime deps, CI test deps, re…
claude Mar 15, 2026
c319b41
Fix Docker workflow missing semver/latest tags when called from relea…
claude Mar 15, 2026
fbde2de
Fix remaining PR review comments across packaging files
claude Mar 15, 2026
2eee9d9
Fix postinstall.sh: restart service after upgrade
claude Mar 15, 2026
36b4055
Add caveats block to Homebrew formula showing data directory
claude Mar 15, 2026
c64e14c
Merge branch 'dev' into claude/restore-package-managers-GlgBJ
pirate Mar 15, 2026
4cd50db
Add Debian and Homebrew package build automation (#1771)
pirate Mar 15, 2026
07dc880
Harden AddView config overrides to admin-only
pirate Feb 26, 2026
ecb1764
switch to external plugins
pirate Mar 15, 2026
4fa701f
Update abx dependencies and plugin test harness
pirate Mar 15, 2026
ea94029
Update deprecated GitHub Pages actions
pirate Mar 15, 2026
58f801c
Fix update orphan import and host-aware tests
pirate Mar 15, 2026
cc3e72b
Preserve tags for index-only adds
pirate Mar 15, 2026
c4d30a8
Restore index-only snapshot output links
pirate Mar 15, 2026
6b482c6
Restore top-level list command compatibility
pirate Mar 15, 2026
1f792d7
Restore CLI compat and plugin dependency handling
pirate Mar 15, 2026
760cf9d
Stabilize CI against expanded plugin surface
pirate Mar 15, 2026
68b9f75
Stabilize recursive crawl CI coverage
pirate Mar 15, 2026
5fb3709
Run recursive crawl tests to completion
pirate Mar 15, 2026
b62064f
Avoid recursive crawl timeout regressions
pirate Mar 15, 2026
bfc1e76
Update extractor tests for plugin output dirs
pirate Mar 15, 2026
31e883e
Stabilize plugin and crawl integration tests
pirate Mar 15, 2026
9ef8a1b
Stabilize secret-backed plugin CI
pirate Mar 15, 2026
50901e5
Align worker config propagation expectations
pirate Mar 15, 2026
941135d
Bound URL fixture archive wait
pirate Mar 15, 2026
82bfd7e
Filter binary hooks by allowed providers
pirate Mar 15, 2026
d4be507
Keep provider plugins enabled under whitelists
pirate Mar 15, 2026
47f540c
Resolve crawl provider dependencies lazily
pirate Mar 15, 2026
86fdc3b
Refresh worker config from resolved plugin installs
pirate Mar 15, 2026
7c55259
Update title HTML test for search export
pirate Mar 15, 2026
f92ca93
Skip puppeteer browser download during package install
pirate Mar 15, 2026
1fc860e
Remove legacy binary override coercion
pirate Mar 15, 2026
957387f
Fix plugin hook env and extractor retries
pirate Mar 15, 2026
2585ef5
Use npm package for readability extractor installs
pirate Mar 15, 2026
1d16038
Relax archive output readiness check
pirate Mar 15, 2026
0ac83c8
Wait for crawl hook records before advancing
pirate Mar 15, 2026
f0b2559
bump dep versions
pirate Mar 15, 2026
002de81
bump dep versions
pirate Mar 15, 2026
21a0a27
Remove 7 dead functions and 4 unused imports from hooks.py
pirate Mar 15, 2026
e598614
Avoid filesystem lookups in snapshot admin list
pirate Mar 16, 2026
7d42c6c
bump versions and fix docs
pirate Mar 16, 2026
70c9358
Improve scheduling, runtime paths, and API behavior
pirate Mar 16, 2026
f97725d
Mark version as 0.9.10rc0 (pre-release) per PEP 440
pirate Mar 16, 2026
934e026
fix lint
pirate Mar 16, 2026
5f0cfe5
add new persona tests
pirate Mar 16, 2026
311e434
Fix add CLI input handling and lint regressions
pirate Mar 16, 2026
f932054
add stricter locking around stage machine models
pirate Mar 16, 2026
95a105f
small fixes
pirate Mar 16, 2026
5381f75
Tighten API typing and add return values
pirate Mar 16, 2026
49436af
Tighten CLI and admin typing
pirate Mar 16, 2026
4756697
Use ruff pyright and ty for linting
pirate Mar 16, 2026
44cabac
fix typing
pirate Mar 16, 2026
3889eb4
Tighten config and admin typing
pirate Mar 16, 2026
bc21d4b
type and test fixes
pirate Mar 16, 2026
9de084d
bump package versions
pirate Mar 16, 2026
57e1187
cleanup archivebox tests
pirate Mar 16, 2026
6b0cfbc
revert docker to use pip again
pirate Mar 16, 2026
26f6d68
bump dep version
pirate Mar 16, 2026
ad41b15
Add configurable server security modes
pirate Mar 16, 2026
ee9ed44
bump dependencies
pirate Mar 21, 2026
c87079a
Refactor ArchiveBox onto abx-dl bus runner
pirate Mar 21, 2026
a6548df
Add configurable server security modes (#1773)
pirate Mar 23, 2026
f400a2c
WIP: checkpoint working tree before rebasing onto dev
pirate Mar 23, 2026
268856b
Preserve common config console handling after rebase
pirate Mar 23, 2026
b749b26
wip
pirate Mar 23, 2026
1d94645
test fixes
pirate Mar 23, 2026
8a25704
add harness tests
pirate Mar 23, 2026
f2c8114
tweak release script
pirate Mar 23, 2026
25f935b
split CrawlSetup into Install phase with new Binary + BinaryRequest e…
pirate Mar 23, 2026
e1eb569
split CrawlSetup into Install phase with new Binary + BinaryRequest e…
pirate Mar 23, 2026
3945011
Update CI uv handling and runner changes
pirate Mar 23, 2026
50286d3
Reuse cached binaries in archivebox runtime
pirate Mar 24, 2026
ed1ddbc
Fix CI workflows and migration tests
pirate Mar 24, 2026
68d9e30
Fix pytest basetemp handling in test harness
pirate Mar 24, 2026
80243ac
Fix archivebox CI regressions
pirate Mar 24, 2026
f3622d8
update working changes
pirate Mar 25, 2026
b40b5b8
chore: bump abx dependency minimums
pirate Apr 2, 2026
c8221d5
Remove local uv sources override
pirate Apr 2, 2026
3e7b83a
bump versions
pirate Apr 5, 2026
1c6b782
ignore outfiles
pirate Apr 5, 2026
4d66996
small fixes
pirate Apr 7, 2026
f126c6e
symlink lock_pkgs to setup monorepo script
pirate Apr 8, 2026
0b9b3b7
split tag editor issue
pirate Apr 8, 2026
2cc5a11
update dev instructions
pirate Apr 8, 2026
f128751
rename abxpkg
pirate Apr 17, 2026
ee56853
rename abx-pkg to abxpkg
pirate Apr 17, 2026
7d8c468
rename abx-pkg to abxpkg
pirate Apr 17, 2026
b68ff3e
fix monorepo script
pirate Apr 19, 2026
ec9c7c8
Drop Tag slug column and use URL-encoded names
claude Apr 20, 2026
b83e2de
Add TODO on tag export filename encoding
claude Apr 21, 2026
0041a2d
Sanitize tag export filenames via django.utils.text.slugify
claude Apr 21, 2026
2ea66d0
Move tag slug logic onto Tag.slug @property
claude Apr 21, 2026
7c3a3e0
Put tag slug back in JS download filename
claude Apr 21, 2026
42dc87f
Remove slug field from Tag model (#1789)
pirate Apr 21, 2026
2c1700a
Add static ArchiveBox landing page
pirate Apr 24, 2026
2b71c47
[codex] Add static ArchiveBox landing page (#1791)
pirate Apr 24, 2026
e013817
public site tweaks
pirate Apr 24, 2026
163e9bd
Rename publicsite Pages workflow
pirate Apr 24, 2026
ca7eeb7
Update publicsite configuration header
pirate Apr 24, 2026
abc987c
Update publicsite intro header
pirate Apr 24, 2026
35d630b
Tighten publicsite hero header
pirate Apr 24, 2026
fc3682a
Fix publicsite hero typo
pirate Apr 24, 2026
4804ad3
Update publicsite source header
pirate Apr 24, 2026
4fef401
Refine publicsite hero heading
pirate Apr 24, 2026
166a161
Align publicsite hero and nav with design system
pirate Apr 24, 2026
9c71acc
Add README shields to publicsite hero
pirate Apr 24, 2026
166dcd5
tweaks
pirate Apr 24, 2026
9b8f00f
more tweaks
pirate Apr 24, 2026
caba6e4
Link publicsite capability chips
pirate Apr 24, 2026
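
Commit 4285a05 in the list above describes `getEnvArray` as parsing JSON when a `[` is present and comma-separated values otherwise. A minimal Python sketch of that rule follows. The name `get_env_array`, its signature, the whitespace trimming, and the error raised for non-list JSON are assumptions made for illustration; they are not taken from the code in this PR.

```python
import json
import os
from typing import List, Optional


def get_env_array(name: str, default: Optional[List[str]] = None) -> List[str]:
    """Read a list-valued environment variable (hypothetical helper).

    Values containing '[' are treated as JSON arrays; anything else is
    split on commas. Unset or empty variables yield the default.
    """
    raw = os.environ.get(name, "").strip()
    if not raw:
        return list(default or [])
    if "[" in raw:
        parsed = json.loads(raw)  # raises json.JSONDecodeError on bad JSON
        if not isinstance(parsed, list):
            raise ValueError(f"{name} must be a JSON array, got {type(parsed).__name__}")
        return [str(item) for item in parsed]
    # CSV fallback: trim whitespace and drop empty entries
    return [part.strip() for part in raw.split(",") if part.strip()]
```

The JSON form is what lets a single entry contain a comma (for example one argument like `--flag=1,2`), which plain comma splitting cannot express.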
5 changes: 5 additions & 0 deletions .claude/settings.json
@@ -0,0 +1,5 @@
{
  "enabledPlugins": {
    "pyright-lsp@claude-plugins-official": true
  }
}
51 changes: 51 additions & 0 deletions .claude/settings.local.json
@@ -0,0 +1,51 @@
{
  "permissions": {
    "allow": [
      "Read(**)",
      "Glob(**)",
      "Grep(**)",
      "Bash(python -m archivebox:*)",
      "Bash(ls:*)",
      "Bash(xargs:*)",
      "Bash(python -c:*)",
      "Bash(printf:*)",
      "Bash(pkill:*)",
      "Bash(python3:*)",
      "Bash(sqlite3:*)",
      "WebFetch(domain:github.com)",
      "Bash(uv add:*)",
      "Bash(mkdir:*)",
      "Bash(chmod:*)",
      "Bash(python -m forum_dl:*)",
      "Bash(archivebox manage migrate:*)",
      "Bash(cat:*)",
      "Bash(python archivebox/plugins/pip/on_Dependency__install_using_pip_provider.py:*)",
      "Bash(forum-dl:*)",
      "Bash(pip uninstall:*)",
      "Bash(python:*)",
      "Bash(source .venv/bin/activate)",
      "Bash(mv:*)",
      "Bash(echo:*)",
      "Bash(grep:*)",
      "WebFetch(domain:python-statemachine.readthedocs.io)",
      "Bash(./bin/run_plugin_tests.sh:*)",
      "Bash(done)",
      "Bash(coverage erase:*)",
      "Bash(gh api:*)"
    ]
  },
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "REPO_ROOT=$(git rev-parse --show-toplevel 2>/dev/null); if [ -n \"$REPO_ROOT\" ] && [ \"$PWD\" != \"$REPO_ROOT\" ]; then echo \"ERROR: Not in repo root ($REPO_ROOT). Current dir: $PWD\" >&2; exit 1; fi",
            "statusMessage": "Checking working directory..."
          }
        ]
      }
    ]
  }
}
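
The `PreToolUse` hook above packs its whole check into a one-line shell command: look up the git top-level directory, and exit with status 1 if one is found and the current directory differs from it. The same logic is restated below in Python purely as a reading aid. The function names are hypothetical, and this is not how the hook is implemented in the PR (the hook itself stays a shell command).

```python
import os
import subprocess
import sys
from typing import Optional


def find_repo_root() -> Optional[str]:
    """Return the git top-level directory, or None when it cannot be determined."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--show-toplevel"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:  # git is not installed
        return None
    return result.stdout.strip() or None


def working_dir_error(cwd: str, repo_root: Optional[str]) -> Optional[str]:
    """Mirror the hook's condition: complain only if a root is known and cwd differs."""
    if repo_root and cwd != repo_root:
        return f"ERROR: Not in repo root ({repo_root}). Current dir: {cwd}"
    return None


def main() -> int:
    """Print the error to stderr and return 1, matching the hook's `exit 1`."""
    error = working_dir_error(os.getcwd(), find_repo_root())
    if error:
        print(error, file=sys.stderr)
        return 1
    return 0
```

Note that, as in the shell version, the check passes silently when no repo root can be found, so it only blocks commands run from a subdirectory of a repository.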
49 changes: 44 additions & 5 deletions .dockerignore
@@ -1,6 +1,45 @@
output
__pycache__
.DS_Store
venv
.venv
data
._*
*.pyc
__pycache__/
.mypy_cache/
.pytest_cache/
.github/
.pdm-build/
.pdm-python
.eggs/
.git/
.vscode/
!.git/HEAD
!.git/refs/heads/*

venv/
.venv/
.venv-old/
.docker_venv/
.docker-venv/
node_modules/
chrome/
chromeprofile/
chrome_profile/

pdm.dev.lock
pdm.lock

docs/
build/
dist/
brew_dist/
deb_dist/
pip_dist/
assets/
docker/
website/
typings/

tmp/
data/
data*/
output/
index.sqlite3
index.sqlite3-wal
26 changes: 26 additions & 0 deletions .github/.readthedocs.yaml
@@ -0,0 +1,26 @@
# Read the Docs config for https://docs.archivebox.io
# https://docs.readthedocs.io/en/stable/config-file/v2.html

version: 2

submodules:
  include: all
  recursive: true

build:
  os: ubuntu-22.04
  tools:
    python: "3.12"
    #nodejs: "20" # not needed unless we need the full archivebox to run while building docs for some reason

sphinx:
  configuration: docs/conf.py

formats:
  - pdf
  - epub

# https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
python:
  install:
    - requirements: docs/requirements.txt
39 changes: 38 additions & 1 deletion .github/CONTRIBUTING.md
@@ -1 +1,38 @@
Make sure check in with me first or confirm your desired features line up with our roadmap: https://github.com/pirate/ArchiveBox#roadmap
# Contribution Process

1. Confirm that your desired features fit into our bigger project goals on the [Roadmap](https://github.com/pirate/ArchiveBox/wiki/Roadmap).
2. Open an issue with your planned implementation to discuss.
3. Check in with me before starting development to make sure your work won't conflict with or duplicate existing work.
4. Set up your dev environment, make some changes, and test using the test input files.
5. Commit, push, and submit a PR, then wait for review feedback.
6. Have patience, don't abandon your PR! We love contributors but we all have day jobs and don't always have time to respond to notifications instantly. If you want a faster response, ping @theSquashSH on Twitter or Patreon.

**Useful links:**

- https://github.com/ArchiveBox/ArchiveBox/issues
- https://github.com/ArchiveBox/ArchiveBox/pulls
- https://github.com/ArchiveBox/ArchiveBox/wiki/Roadmap
- https://github.com/ArchiveBox/ArchiveBox/wiki/Install#manual-setup

### Development Setup

```bash
git clone https://github.com/ArchiveBox/ArchiveBox
cd ArchiveBox
# Ideally do this in a virtualenv
pip install -e '.[dev]' # or use: pipenv install --dev
```

### Running Tests

```bash
./bin/lint.sh
./bin/test.sh
./bin/build.sh
```

For more common tasks see the `Development` section at the bottom of the README.

### Getting Help

Open an issue on GitHub or message me at https://sweeting.me/#contact.
2 changes: 2 additions & 0 deletions .github/FUNDING.yml
@@ -0,0 +1,2 @@
github: ["ArchiveBox", "pirate"]
custom: ["https://donate.archivebox.io", "https://swag.archivebox.io"]
198 changes: 198 additions & 0 deletions .github/ISSUE_TEMPLATE/1-bug_report.yml
@@ -0,0 +1,198 @@
name: 🐞 Bug report
description: Report a bug or error you encountered in ArchiveBox
title: "Bug: ..."
assignees:
  - pirate
type: 'Bug'
body:
  - type: markdown
    attributes:
      value: |
        *Please note:* it is normal to see errors occasionally for some extractors on some URLs (not every extractor will work on every type of page).
        Please report archiving errors if you are seeing them *consistently across many URLs* or if they are *preventing you from using ArchiveBox*.

  - type: textarea
    id: description
    attributes:
      label: Provide a screenshot and describe the bug
      description: |
        Attach a screenshot and describe what the issue is, what you expected to happen, and if relevant, the *URLs you were trying to archive*.
      placeholder: |
        Got a bunch of 'singlefile was unable to archive this page' errors when trying to archive URLs from this site: https://example.com/xyz ...
        I also tried to archive the same URLs using `singlefile` directly and some of them worked but not all of them. etc. ...
    validations:
      required: true

  - type: textarea
    id: steps_to_reproduce
    attributes:
      label: Steps to reproduce
      description: Please provide the exact steps you took to trigger the issue (including any shell commands run, URLs visited, buttons clicked, etc.).
      render: markdown
      placeholder: |
        1. Started ArchiveBox by running: `docker run -v $PWD:/data -p 8000:8000 archivebox/archivebox` in iTerm2
        2. Went to the https://127.0.0.1:8000/add/ page in Google Chrome
        3. Typed 'https://example.com/xyz' into the 'Add URL' input field
        4. Clicked the 'Add+' button
        5. Got a 500 error and saw the errors below in terminal
    validations:
      required: true

  - type: textarea
    id: logs
    attributes:
      label: Logs or errors
      description: "Paste any terminal output, logs, or errors (check `data/logs/errors.log` as well)."
      placeholder: |
        ╭─────────────────────────────────────────────────────────────────────────────────────────────────────────╮
        │ [2024-11-02 19:54:28] ArchiveBox v0.8.6rc0: archivebox add https://example.com#1234567 │
        ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────╯

        [+] [2024-11-02 19:54:29] Adding 1 links to index (crawl depth=0)...
        > Saved verbatim input to sources/1730577269-import.txt
        > Parsed 1 URLs from input (Generic TXT)
        ...
      render: shell
    validations:
      required: false

  - type: textarea
    id: version
    attributes:
      label: ArchiveBox Version
      description: |
        **REQUIRED:** Run the `archivebox version` command inside your collection dir and paste the *full output* here (*not just the version number*).
        For Docker Compose run: `docker compose run archivebox version`
        For plain Docker run: `docker run -v $PWD:/data archivebox/archivebox version`
      render: shell
      placeholder: |
        0.8.6
        ArchiveBox v0.8.6rc0 COMMIT_HASH=721427a BUILD_TIME=2024-10-21 12:57:02 1729515422
        IN_DOCKER=False IN_QEMU=False ARCH=arm64 OS=Darwin PLATFORM=macOS-15.1-arm64-arm-64bit PYTHON=Cpython (venv)
        EUID=502:20 UID=502:20 PUID=502:20 FS_UID=502:20 FS_PERMS=644 FS_ATOMIC=True FS_REMOTE=False
        DEBUG=False IS_TTY=True SUDO=False ID=dfa11485:aa78ad45 SEARCH_BACKEND=ripgrep LDAP=False

        Binary Dependencies:
        √ python 3.14.0 venv_pip ~/.venv/bin/python
        √ django 6.0 venv_pip ~/.venv/lib/python3.14/site-packages/django/__init__.py
        √ sqlite 2.6.0 venv_pip ~/.venv/lib/python3.14/site-packages/django/db/backends/sqlite3/base.py
        √ pip 24.3.1 venv_pip ~/.venv/bin/pip
        ...
    validations:
      required: true

  - type: dropdown
    id: install_method
    validations:
      required: true
    attributes:
      label: How did you install the version of ArchiveBox you are using?
      multiple: false
      options:
        - pip
        - apt
        - brew
        - nix
        - Docker (or Podman/LXC/K8s/TrueNAS/Proxmox/etc)
        - Other

  - type: dropdown
    id: operating_system
    validations:
      required: true
    attributes:
      label: What operating system are you running on?
      description: |
        Please note we are *unable to provide support for Windows users* unless you are using [Docker on Windows](https://github.com/ArchiveBox/archivebox#:~:text=windows%20without%20docker).
      multiple: false
      options:
        - Linux (Ubuntu/Debian/Arch/Alpine/etc.)
        - macOS (including Docker on macOS)
        - BSD (FreeBSD/OpenBSD/NetBSD/etc.)
        - Windows (including WSL, WSL2, Docker Desktop on Windows)
        - Other

  - type: checkboxes
    id: filesystem
    attributes:
      label: What type of drive are you using to store your ArchiveBox data?
      description: Are you using a [remote filesystem](https://github.com/ArchiveBox/ArchiveBox/wiki/Setting-Up-Storage#supported-remote-filesystems) or FUSE mount for `data/` OR `data/archive`?
      options:
        - label: "some of `data/` is on a local SSD or NVMe drive"
          required: false
        - label: "some of `data/` is on a spinning hard drive or external USB drive"
          required: false
        - label: "some of `data/` is on a network mount (e.g. NFS/SMB/Ceph/GlusterFS/etc.)"
          required: false
        - label: "some of `data/` is on a FUSE mount (e.g. SSHFS/RClone/S3/B2/Google Drive/Dropbox/etc.)"
          required: false


  - type: textarea
    id: docker_compose_yml
    attributes:
      label: Docker Compose Configuration
      description: "If using Docker Compose, please share your full `docker-compose.yml` file. If using plain Docker, paste the `docker run ...` command you use."
      placeholder: |
        services:
          archivebox:
            image: archivebox/archivebox:latest
            ports:
              - 8000:8000
            volumes:
              - ./data:/data
            environment:
              - ADMIN_USERNAME=admin
              - ADMIN_PASSWORD=****<redact any passwords>****
              - ALLOWED_HOSTS=*
              - CSRF_TRUSTED_ORIGINS=https://archivebox.example.com
              - PUBLIC_INDEX=True
              - PUBLIC_SNAPSHOTS=True
              - PUBLIC_ADD_VIEW=False
              ...

          archivebox_scheduler:
            image: archivebox/archivebox:latest
            command: schedule --foreground --update --every=day
            environment:
              ...

          ...
      render: shell
    validations:
      required: false

  - type: textarea
    id: configuration
    attributes:
      label: ArchiveBox Configuration
      description: "Please share your full `data/ArchiveBox.conf` file here."
      render: shell
      placeholder: |
        [SERVER_CONFIG]
        SECRET_KEY = "*********<redact any secrets/passwords>************"

        WGET_RESTRICT_FILE_NAMES=windows
        USE_SYSTEM_WGET=true
        CHECK_SSL_VALIDITY=false
        ...
    validations:
      required: false


  - type: markdown
    attributes:
      value: |
        ---

        We strive to answer issues as quickly as possible, it usually takes us *about a ~week* to respond.
        Make sure your `data/` is [**fully backed up**](https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#disk-layout) before trying anything suggested here, **we are not responsible for data loss**.

        In the meantime please consider:

        - 💰 [Donating to support ArchiveBox open-source](https://github.com/ArchiveBox/ArchiveBox/wiki/Donations)
        - 👨‍✈️ [Hiring us for corporate deployments](https://docs.monadical.com/s/archivebox-consulting-services) with professional support, custom feature development, and help with CAPTCHAs/rate-limits
        - 🔍 [Searching the Documentation](https://docs.archivebox.io/) for answers to common questions
        - 📚 Reading the [Troubleshooting Guide](https://github.com/ArchiveBox/ArchiveBox/wiki)
        - ✨ Testing out a newer [`BETA` release](https://github.com/ArchiveBox/ArchiveBox/releases) (issues are often already fixed in our latest `BETA` releases)
