AI · 4 min read · April 28, 2026
Efficient Rationale Retrieval via Student-Teacher Distillation
Rabtriever reduces the computational cost of LLM-based document ranking by distilling cross-encoder knowledge into independent query-document encoders.
Source: arxiv/cs.LG · Teng Chen, Sheng Xu, Feixiang Guo, Xiaoyu Wang, Qingqing Gu, Hongyan Li, Luo Ji · open original ↗
Rabtriever distills expensive cross-encoder rerankers into efficient dual-encoder retrievers using JEPA, cutting complexity from quadratic to linear.
- Traditional rationale-based retrieval requires cross-encoding every query-document pair, creating high computational overhead.
- Rabtriever trains a generative reranker as the teacher, then distills its contextual knowledge into a student dual-encoder.
- A JEPA framework inserts a lightweight predictor between frozen LLM layers to project query embeddings into a teacher-aligned space (first sketch after this list).
- An auxiliary reverse-KL loss on the logits improves on-policy sampling efficiency during distillation (second sketch after this list).
- Reduces document-length complexity from quadratic to linear while maintaining comparable relevance judgments.
- Tested on rationale tasks (empathetic conversation, robotic manipulation) and standard benchmarks (MS MARCO, BEIR).
- The student model generalizes across diverse retrieval domains with minor accuracy loss versus the teacher.
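The predictor step can be made concrete. Below is a minimal sketch assuming a PyTorch setup; `JEPAPredictor`, its two-layer MLP shape, and the MSE objective are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JEPAPredictor(nn.Module):
    """Lightweight predictor inserted between frozen LLM layers.

    Maps the student's independently computed query embedding into the
    space of the teacher's cross-encoder representations, so the frozen
    backbone itself never needs gradient updates.
    """
    def __init__(self, dim: int, hidden: int = 1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.GELU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, query_emb: torch.Tensor) -> torch.Tensor:
        return self.net(query_emb)

def jepa_loss(predictor: JEPAPredictor,
              query_emb: torch.Tensor,      # (B, D) from the student encoder
              teacher_emb: torch.Tensor):   # (B, D) from the cross-encoder teacher
    # Pull the projected query embedding toward the teacher's (detached)
    # contextual embedding for the same query-document pair.
    return F.mse_loss(predictor(query_emb), teacher_emb.detach())
```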
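The auxiliary logit-level term is a reverse KL divergence, which takes the expectation under the student's own distribution rather than the teacher's. A short sketch, assuming student and teacher share a vocabulary; the combining weight `alpha` is an assumption:

```python
import torch.nn.functional as F

def reverse_kl(student_logits, teacher_logits):
    """Reverse KL D_KL(student || teacher) over the vocabulary axis.

    Unlike forward KL, the expectation is taken under the student's own
    distribution (mode-seeking), which is what makes it a natural fit
    for on-policy sampling during distillation.
    """
    log_p_s = F.log_softmax(student_logits, dim=-1)
    log_p_t = F.log_softmax(teacher_logits.detach(), dim=-1)
    return (log_p_s.exp() * (log_p_s - log_p_t)).sum(dim=-1).mean()

# Hypothetical combined objective (alpha is an assumed hyperparameter):
# total = jepa_loss(predictor, q_emb, t_emb) + alpha * reverse_kl(s_logits, t_logits)
```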
Frequently asked
- Why is the dual-encoder student cheaper than the cross-encoder teacher? Cross-encoders process each query-document pair jointly, creating quadratic complexity in document length. Rabtriever encodes queries and documents independently (dual-encoder), reducing complexity to linear. The student learns to approximate the teacher's cross-encoder reasoning without the computational overhead of joint encoding (sketched below).
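To make the complexity contrast concrete, here is an illustrative retrieval loop; `encode_query` and `encode_doc` are hypothetical stand-ins for the student's independent encoders, not the paper's API:

```python
import torch

def dual_encoder_search(encode_query, encode_doc, query, docs, k=10):
    # Documents are encoded once, independently of any query; these
    # embeddings can be precomputed and cached offline.
    doc_embs = torch.stack([encode_doc(d) for d in docs])  # (N, D)
    q_emb = encode_query(query)                            # (D,)
    # Per query: one encoder pass plus a dot product per document,
    # instead of a full joint transformer pass per (query, doc) pair.
    scores = doc_embs @ q_emb                              # (N,)
    top = torch.topk(scores, k=min(k, len(docs)))
    return [(docs[i], scores[i].item()) for i in top.indices]
```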