Symbolic Mixture-of-Experts: Adaptive Skill-Based Routing for Heterogeneous Reasoning

A New Approach to Heterogeneous Reasoning: Symbolic Mixture-of-Experts
Combining pre-trained, specialized large language models (LLMs) is a promising way to handle complex and diverse tasks efficiently. Prior methods typically select experts at the task level, which is often too coarse, since heterogeneous tasks may require different expertise for each individual instance. To enable adaptive, instance-level mixing of pre-trained LLM experts, the authors propose Symbolic-MoE, a symbolic, text-based, and gradient-free Mixture-of-Experts framework.
Symbolic-MoE takes a fine-grained approach to selection that emphasizes skills, such as algebra in mathematics or molecular biology in biomedical reasoning. Instead of selecting experts for an entire task, it dynamically recruits the most relevant expert LLMs for each individual instance based on their respective strengths. Each selected expert then generates its own reasoning, yielding k outputs from k experts, which an aggregator synthesizes into a final, high-quality answer. The aggregator is chosen based on its ability to integrate diverse reasoning outputs.
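To make this workflow concrete, here is a minimal Python sketch of instance-level recruitment and aggregation. The skill profiles, the keyword-overlap scoring, and the generate/aggregate helpers are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of instance-level recruitment and aggregation.
# EXPERT_SKILLS, generate, and aggregate are hypothetical placeholders;
# the paper infers skills and selects the aggregator with LLM-based profiling.

EXPERT_SKILLS = {
    "math-expert": {"algebra", "geometry", "number theory"},
    "bio-expert":  {"molecular biology", "genetics"},
    "code-expert": {"programming", "algorithms"},
}

def recruit_experts(instance_skills, k=2):
    """Rank experts by overlap between the skills this instance needs
    and each expert's skill profile, then keep the top-k."""
    scores = {
        name: len(instance_skills & skills)
        for name, skills in EXPERT_SKILLS.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

def solve(question, instance_skills, generate, aggregate):
    """`generate(expert, question)` and `aggregate(answers)` stand in for
    calls to the underlying expert LLMs and the aggregator LLM."""
    experts = recruit_experts(instance_skills, k=2)
    answers = [generate(e, question) for e in experts]  # k outputs from k experts
    return aggregate(answers)                            # synthesized final answer
```

The key design choice this sketch illustrates is that recruitment happens per question, not per task: two questions from the same benchmark can end up with entirely different expert sets.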
Efficient Use of Resources through Batch Inference
Instance-level expert selection yields significant performance gains, but a naive implementation incurs high computational overhead because models are constantly loaded and unloaded. To address this, Symbolic-MoE uses a batch inference strategy: instances are grouped by their assigned experts so that each model only needs to be loaded once. This allows 16 expert models to run on a single GPU at a time cost comparable to, or better than, previous multi-agent baselines that require four GPUs.
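The grouping idea can be illustrated with a short Python sketch. The instance format and the load_model/run_batch helpers are assumptions for illustration, not the paper's code.

```python
from collections import defaultdict

def batched_inference(instances, load_model, run_batch):
    """Group instances by recruited expert so that every expert model is
    loaded exactly once, instead of reloading models instance by instance.

    `instances` is assumed to be a list of dicts with "id", "question",
    and "experts" keys; `load_model` and `run_batch` are hypothetical
    hooks for the actual serving backend."""
    per_expert = defaultdict(list)
    for inst in instances:
        for expert in inst["experts"]:          # experts recruited per instance
            per_expert[expert].append(inst)

    outputs = defaultdict(list)                 # instance id -> list of expert answers
    for expert, batch in per_expert.items():
        model = load_model(expert)              # single load per expert model
        answers = run_batch(model, [inst["question"] for inst in batch])
        for inst, answer in zip(batch, answers):
            outputs[inst["id"]].append(answer)
    return outputs                              # answers are then aggregated per instance
```

Because each expert sees all of its assigned instances in one pass, the number of model loads equals the number of distinct experts rather than the number of instance-expert pairs.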
Convincing Results in Various Benchmarks
Comprehensive evaluations across diverse benchmarks (MMLU-Pro, GPQA, AIME, and MedMCQA) show that Symbolic-MoE outperforms strong LLMs such as GPT-4o-mini as well as multi-agent approaches, with an average absolute improvement of 8.15% over the best multi-agent baseline. Moreover, Symbolic-MoE removes the need for expensive multi-round discussions, outperforming discussion-based baselines at lower computational cost.
A Look into the Future
Symbolic-MoE represents a promising step towards scalable and adaptive AI systems. The ability to dynamically combine expert LLMs at the instance level opens up new possibilities for complex reasoning across a wide range of application areas. Future research could focus on optimizing the skill-based recruitment strategy and on developing even more efficient aggregation methods.
Bibliography:
- https://arxiv.org/abs/2503.05641
- https://deeplearn.org/arxiv/584453/symbolic-mixture-of-experts:-adaptive-skill-based-routing-for-heterogeneous-reasoning
- https://arxiv.org/pdf/2503.05641
- http://paperreading.club/page?id=289966
- https://huggingface.co/papers
- https://chatpaper.com/chatpaper/?id=2&date=1741536000&page=1
- https://iclr.cc/virtual/2025/papers.html
- https://neurips.cc/virtual/2024/events/datasets-benchmarks-2024
- https://jmlr.org/tmlr/papers/
- https://github.com/AGI-Edgerunners/LLM-Agents-Papers