Improving Tool Retrieval Capabilities of Large Language Models

Large Language Models (LLMs) have made impressive progress in recent years: they can generate text, translate, and answer questions. A promising approach to expanding their capabilities further is the integration of external tools. This turns LLMs into agents that can solve complex tasks by accessing specialized functions such as database queries, code execution, or calls to specific APIs.
However, integrating tools also presents challenges. One of them is selecting the right tool for a given task: with a large number of available tools, it is crucial to identify the appropriate one efficiently. This is where Information Retrieval (IR) comes into play. IR models are designed to select the right tool from a large set of candidates. This matters in particular because the context length of LLMs is limited, so not all tools can be placed in the prompt at once.
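To make this concrete, the following sketch shows one common way such a retrieval step can be implemented: tool descriptions are embedded once, and for each task only the most similar tools are selected before the prompt is assembled. The model name and the tiny tool corpus are illustrative assumptions, not details from the cited work.

```python
# Minimal sketch of embedding-based tool retrieval (illustrative, not the paper's code).
# Assumes the `sentence-transformers` package; the model name is an arbitrary example.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# A tiny, hypothetical tool corpus: tool name -> natural-language description.
tools = {
    "sql_query": "Run a read-only SQL query against the analytics database.",
    "run_python": "Execute a short Python snippet and return stdout.",
    "weather_api": "Look up the current weather for a given city.",
}

names = list(tools.keys())
tool_embeddings = model.encode(list(tools.values()), convert_to_tensor=True)

def retrieve_tools(task: str, k: int = 2) -> list[str]:
    """Return the k tool names whose descriptions are most similar to the task."""
    query_embedding = model.encode(task, convert_to_tensor=True)
    scores = util.cos_sim(query_embedding, tool_embeddings)[0]
    top = scores.topk(k).indices.tolist()
    return [names[i] for i in top]

print(retrieve_tools("How warm is it in Berlin right now?"))
# Only the retrieved tool descriptions are then placed into the LLM prompt,
# keeping the context window small even when thousands of tools exist.
```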
Previous benchmarks for tool retrieval have often simplified this step by manually providing a small selection of relevant tools for each task. This, however, does not reflect real-world settings in which LLMs face a multitude of tools. To close this gap, ToolRet was developed: a benchmark for heterogeneous tool retrieval comprising 7,600 retrieval tasks and a corpus of 43,000 tools compiled from existing datasets.
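A benchmark of this kind is typically scored by checking how often the relevant tools appear among the top results of a retriever. The sketch below computes Recall@k over a list of tasks; the task schema shown here is an assumption for illustration, not ToolRet's exact format.

```python
# Illustrative scoring of a tool retriever with macro-averaged Recall@k.
# Each task is assumed to look like {"query": str, "relevant_tools": set of tool names}.
from typing import Callable

def recall_at_k(tasks: list[dict], retrieve: Callable[[str, int], list[str]], k: int = 10) -> float:
    total = 0.0
    for task in tasks:
        retrieved = set(retrieve(task["query"], k))
        relevant = task["relevant_tools"]
        total += len(retrieved & relevant) / len(relevant)
    return total / len(tasks)

# Example with a dummy retriever that always returns the same two tools.
sample_tasks = [
    {"query": "Convert 100 USD to EUR", "relevant_tools": {"currency_api"}},
    {"query": "Plot monthly revenue", "relevant_tools": {"run_python", "sql_query"}},
]
print(recall_at_k(sample_tasks, lambda q, k: ["currency_api", "weather_api"], k=2))
```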
A study with ToolRet has yielded surprising results: even IR models that perform well on conventional benchmarks show clear weaknesses on ToolRet, and the low retrieval quality in turn reduces the success rate of LLMs at completing tasks. This finding underscores the need to improve the tool retrieval capabilities of LLMs.
To optimize the performance of IR models for tool retrieval, a comprehensive training dataset with over 200,000 instances was created. It allows IR models to be trained specifically for the requirements of tool retrieval and significantly improves their ability to select the correct tool.
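One standard recipe for using such data is to fine-tune a dense retriever on (task, tool description) pairs with an in-batch contrastive loss. The sketch below follows this recipe with the `sentence-transformers` library; the pairs and the model name are placeholders, and this is not claimed to be the exact training setup of the cited work.

```python
# Hedged sketch: fine-tuning a dense retriever on (task, tool description) pairs.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("all-MiniLM-L6-v2")

# Each training instance pairs a task with the description of the correct tool.
train_examples = [
    InputExample(texts=["Convert 100 USD to EUR", "Convert an amount between two currencies."]),
    InputExample(texts=["Plot monthly revenue", "Execute a short Python snippet and return stdout."]),
    # In practice, the full training set would contain the 200,000+ instances mentioned above.
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)
# Other examples in the same batch serve as negatives (in-batch negatives).
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=10)
model.save("tool-retriever-finetuned")
```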
These developments are particularly relevant for companies like Mindverse that specialize in the development of AI-powered content tools. Integrating tools into LLMs is an important step in expanding the functionality and utility of these models. With better tool retrieval, LLMs can solve complex tasks even more effectively, increasing their value for users. Mindverse can leverage these advances to develop innovative solutions for chatbots, voicebots, AI search engines, and knowledge systems, and thereby offer its customers even more capable products. The research results highlight the importance of specialized benchmarks and training data for the further development of LLMs in the field of tool retrieval.
Bibliography:
https://arxiv.org/abs/2503.01763
https://arxiv.org/html/2503.01763v1
https://chatpaper.com/chatpaper/?id=3&date=1741017600&page=1
https://papers.cool/arxiv/cs.IR
http://paperreading.club/page?id=288524
https://www.chatpaper.com/chatpaper/pt/paper/116521
https://x.com/gm8xx8/status/1896800633292022035
https://x.com/_reachsumit?lang=de
https://twitter.com/gm8xx8/status/1896799732561055994
https://www.linkedin.com/posts/zhaochun-ren-460491296_250301763-activity-7302809681989623810-wpDI