AI · 4 min read · April 28, 2026

Efficient Rationale Retrieval via Student-Teacher Distillation

Rabtriever reduces computational cost of LLM-based document ranking by distilling cross-encoder knowledge into independent query-document encoders.

Source: arxiv/cs.LG · Teng Chen, Sheng Xu, Feixiang Guo, Xiaoyu Wang, Qingqing Gu, Hongyan Li, Luo Ji

Rabtriever distills expensive cross-encoder rerankers into efficient dual-encoder retrievers using JEPA (Joint-Embedding Predictive Architecture), cutting complexity in document length from quadratic to linear.

  • Traditional rationale-based retrieval requires cross-encoding query-document pairs, creating high computational overhead.
  • Rabtriever trains a generative reranker as teacher, then distills its contextual knowledge into a student dual-encoder.
  • The JEPA framework inserts a lightweight predictor between frozen LLM layers to project query embeddings into a teacher-aligned space.
  • Auxiliary reverse-KL loss on logits improves on-policy sampling efficiency during distillation.
  • The student reduces complexity in document length from quadratic to linear while keeping relevance judgments comparable to the teacher's.
  • Evaluated on rationale tasks (empathetic conversation, robotic manipulation) and standard benchmarks (MS MARCO, BEIR).
  • Student model generalizes across diverse retrieval domains with minor accuracy loss versus teacher.
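
The reverse-KL term in the list above can be illustrated with a toy computation. This is a minimal sketch, assuming the loss compares student and teacher logits over the same small set of candidates; the function names and example logits are hypothetical and are not taken from the paper.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def reverse_kl(student_logits, teacher_logits):
    """KL(student || teacher): the expectation is taken under the student."""
    log_p_s = log_softmax(student_logits)
    log_p_t = log_softmax(teacher_logits)
    return sum(math.exp(ls) * (ls - lt) for ls, lt in zip(log_p_s, log_p_t))

# Hypothetical logits over three candidates (illustration only).
teacher = [2.0, 0.0, -1.0]
student = [0.5, 0.5, 0.0]

loss = reverse_kl(student, teacher)      # auxiliary term to minimise
swapped = reverse_kl(teacher, student)   # forward KL, shown for contrast
```

Because the expectation is taken under the student's own distribution, the term can be estimated from samples the student itself produces, which is one common reason to pair reverse KL with on-policy sampling. The divergence is zero only when the two distributions match, and it is not symmetric in its arguments.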

Frequently asked

  • Cross-encoders process each query-document pair together, creating quadratic complexity in document length. Rabtriever encodes queries and documents independently (dual-encoder), reducing complexity to linear. The student learns to approximate the teacher's cross-encoder reasoning without the computational overhead of joint encoding.
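
The independent-encoding idea in the answer above can be sketched as follows. This is only an illustration, assuming a toy bag-of-words encoder as a stand-in for the LLM-based student; `toy_encode`, `build_index`, `rank`, and the example documents are all hypothetical.

```python
import math
from collections import Counter

def toy_encode(text):
    """Hypothetical stand-in for an encoder: L2-normalised bag of words."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}

def dot(u, v):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(x * v.get(word, 0.0) for word, x in u.items())

def build_index(docs):
    # Offline step: every document is encoded once, without seeing any query.
    return [toy_encode(d) for d in docs]

def rank(query, index):
    # Online step: one encoder call for the query, then cheap dot products.
    q = toy_encode(query)
    scores = [dot(q, d) for d in index]
    return sorted(range(len(index)), key=lambda i: scores[i], reverse=True)

docs = [
    "the robot grasps the red cup",
    "a kind reply to a sad friend",
    "stock prices fell on monday",
]
index = build_index(docs)
best = rank("robot picks up the cup", index)[0]
```

A cross-encoder would instead run the model once per query-document pair at query time. In this sketch the document side is computed ahead of time and reused for every query.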
