feat: Megatron reranker backend, unified eval, and decoder-LM SPLADE#221

Open
zhichaoxu-shufe wants to merge 1 commit into texttron:main from zhichaoxu-shufe:feat/megatron

@zhichaoxu-shufe

Adds three independent workstreams on top of Tevatron-v2. All changes are additive, make minimal modifications to existing code, and are backward compatible:

  • Megatron-Core training backend (TP/PP/EP parallelism, ZeRO-1 distributed optimizer, HF<->Megatron weight bridge, MoE-aware LoRA target groups) for dense and MoE rerankers; HF-format checkpoints.
  • Unified BEIR-15 evaluation with two schemas (local scoring + HTTP serving pool over HF/vLLM), plus offline teacher annotation for distillation.
  • Decoder-LM SPLADE (LACONIC): in-place bidirectional conversion via transformers 5.x create_bidirectional_mask (drops the llm2vec dependency), FLOPS-regularized SpladeTrainer, and the add_special_tokens BOS-sink fix.

See docs/CHANGES.md for the full per-file summary and docs/environments.md for the pinned dependency manifests.
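
For readers unfamiliar with the FLOPS regularizer named in the third item, the following is a minimal sketch of the standard formulation from the SPLADE literature: the penalty is the sum over vocabulary terms of the squared mean absolute weight across the batch. It is written in plain Python for illustration only. The function names `splade_activation` and `flops_regularizer` are hypothetical, the max-pooling variant of the SPLADE activation is an assumption, and none of this is taken from the PR's `SpladeTrainer`, which may differ.

```python
import math


def splade_activation(token_logits):
    """Pool per-token vocabulary logits into one sparse term-weight vector.

    token_logits: list of rows (one per input token), each a list of floats
    over the vocabulary. Applies log(1 + relu(x)) and max-pools over tokens.
    (Max pooling is an assumption; sum pooling is another known variant.)
    """
    vocab_size = len(token_logits[0])
    return [
        max(math.log1p(max(row[j], 0.0)) for row in token_logits)
        for j in range(vocab_size)
    ]


def flops_regularizer(batch_vectors):
    """FLOPS penalty: sum over vocabulary of (mean absolute weight in batch)^2."""
    batch_size = len(batch_vectors)
    vocab_size = len(batch_vectors[0])
    total = 0.0
    for j in range(vocab_size):
        mean_abs = sum(abs(vec[j]) for vec in batch_vectors) / batch_size
        total += mean_abs ** 2
    return total


# Toy batch: two examples, two tokens each, vocabulary of three terms.
batch_logits = [
    [[2.0, -1.0, 0.0], [0.5, -3.0, 0.0]],
    [[1.0, -2.0, 0.0], [0.0, -1.0, 4.0]],
]
vectors = [splade_activation(example) for example in batch_logits]
penalty = flops_regularizer(vectors)
```

Because the mean is squared per term, the penalty grows fastest for terms that are active across many examples in the batch, which pushes the model toward sparser and more evenly distributed term weights.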

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>