bundle-adjuster/deep-lfm

deep-lfm

A deep line-feature matcher with a novel Expectation–Maximization (EM) assignment head, built on a frozen DeepLSD detector front-end and designed to feed a Structure-from-Motion (SfM) pipeline.

Most learned matchers (SuperGlue, LightGlue, GlueStick) end in an optimal-transport / dual-softmax assignment layer driven by appearance alone, deferring geometry to a downstream RANSAC. deep-lfm replaces that head with a differentiable EM module that alternates between estimating soft match responsibilities (E-step) and fitting a two-view geometric model to points sampled along the lines (M-step). The converged responsibilities give calibrated, geometrically consistent confidences, which is what an SfM pipeline needs, and the estimated geometry comes for free as a two-view initializer.
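
One way to write the alternation down, as a sketch rather than the repository's exact formulation: let s_ij be the appearance score between line i in image 1 and line j in image 2, e_ij(θ) a geometric residual under the two-view model θ, σ an assumed noise scale, and β an assumed no-match logit. All of these symbols are illustrative and are not taken from the code; the inner Sinkhorn normalization for the one-to-one constraint is omitted.

```latex
\begin{aligned}
\text{E-step:}\quad
  & r_{ij}^{(t)} =
    \frac{\exp\!\big(s_{ij} - e_{ij}(\theta^{(t)})^{2} / 2\sigma^{2}\big)}
         {\exp(\beta) + \sum_{k} \exp\!\big(s_{ik} - e_{ik}(\theta^{(t)})^{2} / 2\sigma^{2}\big)} \\[4pt]
\text{M-step:}\quad
  & \theta^{(t+1)} = \arg\min_{\theta} \sum_{i,j} r_{ij}^{(t)} \, e_{ij}(\theta)^{2}
\end{aligned}
```

In this form the mass left over in each row, exp(β) divided by the same denominator, is the responsibility of the no-match state, and an appearance-only head corresponds to dropping the residual term from the E-step.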

This is a research repository, and the model is trained from scratch. See docs/DESIGN.md (design), docs/ROADMAP.md (phased status), docs/RESULTS.md (validated results), and docs/SETUP.md (real-data onboarding).

Status

Phases 0–5 implemented and tested (38 tests). The full pipeline — detector → encoder → attention backbone → EM head → eval — runs end-to-end with a mock detector and synthetic data. Real-data training needs MegaDepth/ScanNet + DeepLSD weights wired in — follow docs/SETUP.md.

Headline (training-free ablation, isolating the head): under match-ambiguous appearance, OT F1 0.41 → EM 1.00, and the EM head returns a relative pose for free. See docs/RESULTS.md.

Quickstart

```shell
pip install -e ".[dev,train,viz]"
python -m pytest -q                       # 38 tests
python scripts/train.py --config dev      # synthetic + mock detector
python scripts/eval.py --mode ablation    # OT-vs-EM table
```

Key ideas

  • DeepLSD as a frozen line detector (optionally fine-tuned later). It provides geometry only; descriptors are learned here.
  • Lines represented by K points sampled along the segment, pooled from a learned dense feature map — robust to fragmentation and the substrate for epipolar geometric scoring.
  • Attention backbone (GlueStick-style self/cross attention) → context-aware line descriptors.
  • EM assignment head: E-step responsibilities mix appearance + epipolar likelihood + an outlier/no-match state (with inner Sinkhorn for the one-to-one constraint); M-step re-estimates geometry by responsibility-weighted, differentiable solving. Unrolled and end-to-end trainable.
  • SfM-aware outputs: confidence, multi-view tracks, two-view geometry, COLMAP-compatible export.
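
The point-sampling and E-step ideas in the list above can be sketched in plain Python. This is a minimal illustration, not the repository's implementation: the function names (`sample_points`, `epipolar_residual`, `responsibilities`), the Gaussian noise scale `sigma`, the fixed `outlier_logit`, and the nearest-sample residual are all assumptions made here. Learned descriptors and the inner Sinkhorn normalization are left out.

```python
import math

def sample_points(segment, k):
    """Return k evenly spaced points on a segment ((x0, y0), (x1, y1)), endpoints included."""
    if k < 2:
        raise ValueError("k must be at least 2")
    (x0, y0), (x1, y1) = segment
    return [(x0 + (x1 - x0) * i / (k - 1), y0 + (y1 - y0) * i / (k - 1))
            for i in range(k)]

def epipolar_residual(F, seg_a, seg_b, k=5):
    """Hypothetical line-to-line residual under a fundamental matrix F (3x3 nested lists).

    For each point sampled on seg_a (image 1), take its epipolar line
    l = F @ [x, y, 1] in image 2 and measure the distance from l to the closest
    point sampled on seg_b. The mean is small when the epipolar lines cross seg_b.
    Assumes no sample coincides with the epipole (a and b are not both zero).
    """
    pts_b = sample_points(seg_b, k)
    total = 0.0
    for x, y in sample_points(seg_a, k):
        a, b, c = (F[r][0] * x + F[r][1] * y + F[r][2] for r in range(3))
        total += min(abs(a * u + b * v + c) for u, v in pts_b) / math.hypot(a, b)
    return total / k

def responsibilities(appearance, residuals, sigma=1.0, outlier_logit=-3.0):
    """E-step sketch: per-row softmax over candidates plus a trailing no-match state."""
    rows = []
    for app_row, res_row in zip(appearance, residuals):
        # Appearance log-score plus Gaussian log-likelihood of the geometric residual.
        logits = [s - 0.5 * (e / sigma) ** 2 for s, e in zip(app_row, res_row)]
        logits.append(outlier_logit)
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in logits]
        norm = sum(exps)
        rows.append([v / norm for v in exps])
    return rows

# Toy scene: pure x-translation with identity intrinsics, so epipolar lines are horizontal.
F = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
seg_a = ((0, 0), (0, 4))
candidates = [((3, 0), (3, 4)), ((3, 10), (3, 14))]
residuals = [[epipolar_residual(F, seg_a, cand) for cand in candidates]]
resp = responsibilities([[0.0, 0.0]], residuals)  # identical appearance scores
```

With identical appearance scores, only the geometric term separates the two candidates, which is the match-ambiguous setting the ablation targets.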

Follow-ups / TODO

Ordered by expected payoff for closing the held-out gap to GlueStick (ETH3D AP 72.6; we are at 33.6 after homography pretraining). See docs/RESULTS.md for the data behind each.

  • Close the domain gap (biggest lever). Held-out ETH3D accuracy peaks at ~10k steps and then degrades (overfitting), because training is ScanNet-indoor only. Train or fine-tune on MegaDepth (outdoor, matches ETH3D): fill in the stubbed build_manifest in data/megadepth.py (COLMAP poses + .h5 depth); ~199 GB download.
  • Harder homography pretext. The current pretext saturates by ~1k steps (F1 0.99) yet still gave +2–3 AP. Stronger warps, photometric augmentation, and a larger, more diverse image set (COCO train, Oxford-Paris) should yield a larger gain.
  • Pretrained descriptors. Bootstrap the line encoder from a pretrained dense backbone (SuperPoint/DISK) instead of learning the CNN from scratch.
  • Fix overfitting directly. Augmentation, weight-decay/dropout sweep, early-stop at the held-out peak (~10k on full ScanNet).
  • Matched-protocol GlueStick comparison. Run GlueStick through our ETH3D harness (or ours under glue-factory's protocol) for a true head-to-head — current AP is indicative only (different detector/GT pipeline).
  • Robust M-step (RANSAC/IRLS). Pose AUC collapses under heavy mismatching (no robust estimator today) — see Limitations in docs/RESULTS.md.
  • Orchestration resilience. The overnight multi-stage loop's wakeup chain broke after the fine-tune; prefer a single driver script for long pipelines.
  • (Stretch) Joint points+lines. GlueStick's core strength; large scope and partially dilutes the EM-on-lines thesis — evaluate before committing.
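
To make the robust M-step item above concrete, here is a toy IRLS sketch with Huber weights. It is an illustration under stated assumptions, not planned repository code: a 2D translation stands in for the two-view model (a real M-step would fit epipolar geometry), and `huber_weight`, `irls_translation`, and `delta` are hypothetical names. The E-step responsibilities enter as fixed prior weights that the Huber weights further scale down for large residuals.

```python
import math

def huber_weight(residual, delta):
    """IRLS weight for the Huber loss: 1 inside delta, delta / residual outside."""
    return 1.0 if residual <= delta else delta / residual

def irls_translation(src, dst, resp, delta=1.0, iters=20):
    """Fit a 2D translation t to correspondences src[i] -> dst[i].

    Each iteration solves a weighted least-squares problem (a weighted mean of
    the displacements), then recomputes the weights as
    responsibility * Huber weight of the current residual.
    """
    if iters < 1:
        raise ValueError("iters must be at least 1")
    weights = list(resp)  # first pass: responsibility-weighted least squares
    for _ in range(iters):
        total = sum(weights)
        if total <= 0.0:
            raise ValueError("all correspondences have zero weight")
        tx = sum(w * (d[0] - s[0]) for w, s, d in zip(weights, src, dst)) / total
        ty = sum(w * (d[1] - s[1]) for w, s, d in zip(weights, src, dst)) / total
        weights = [
            r * huber_weight(math.hypot(d[0] - s[0] - tx, d[1] - s[1] - ty), delta)
            for r, s, d in zip(resp, src, dst)
        ]
    return tx, ty

# Eight inliers displaced by (2, -1) and two gross mismatches displaced by (50, 40).
src = [(float(i), 0.0) for i in range(10)]
dst = [(x + 2.0, y - 1.0) for x, y in src[:8]] + [(x + 50.0, y + 40.0) for x, y in src[8:]]
tx, ty = irls_translation(src, dst, [1.0] * 10)
```

On this toy data a plain mean of the displacements is pulled far toward the two mismatches, while the reweighted estimate stays close to the inlier displacement. Huber weights bound the influence of outliers rather than removing it, so a small bias remains.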

License

MIT (this repo). DeepLSD and any pretrained weights are governed by their own licenses — see docs/RELATED.md.
