Welcome to TAIL!
Features
Easy to customize
TAIL helps you generate benchmarks on your own documents (Patents, Papers, Financial Reports, anything you are interested in). It allows you to create test examples of any context length and questions at any depth you desire.
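To make "any context length" and "any depth" concrete, here is a minimal sketch of how a test context could be cut from a document so that a chosen source passage starts at a given relative depth. This is not TAIL's actual implementation: the function name build_test_context, its parameters, and the use of whitespace-separated words as a stand-in for tokens are all assumptions made for illustration.

```python
def build_test_context(words, anchor_start, anchor_len, target_len, depth):
    """Hypothetical helper: cut a window of about `target_len` words from
    `words` so that the anchor passage (the text a question is based on)
    starts at relative position `depth` (0.0 = beginning, 1.0 = end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    if anchor_len > target_len:
        raise ValueError("anchor passage is longer than the target length")

    # Words left over for surrounding text once the anchor is placed.
    budget = target_len - anchor_len
    # Split the budget before/after the anchor, clipped to what the document has.
    n_before = min(int(budget * depth), anchor_start)
    n_after = min(budget - n_before, len(words) - (anchor_start + anchor_len))

    start = anchor_start - n_before
    end = anchor_start + anchor_len + n_after
    context = words[start:end]

    # Depth actually achieved (may differ if the document was too short).
    surrounding = len(context) - anchor_len
    actual_depth = n_before / surrounding if surrounding else 0.0
    return " ".join(context), actual_depth


# Toy document of 1000 "words"; the anchor passage is words 500..509.
document = [f"w{i}" for i in range(1000)]
context, achieved = build_test_context(
    document, anchor_start=500, anchor_len=10, target_len=110, depth=0.5
)
print(len(context.split()), achieved)
```

Because the surrounding text comes from the document itself, the passage stays in its natural position relative to its neighbors; only the window around it changes. A real pipeline would count tokens with the target model's tokenizer rather than words.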
Realistic and natural
Unlike the needle-in-a-haystack test, TAIL generates questions from information already present in your own documents instead of inserting a new piece of information, which makes the benchmark more realistic and natural.
Quality assured
TAIL applies multiple quality assurance measures, including RAG-based filtering and rigorous quality checks, to eliminate subpar question-answer pairs and deliver a high-quality benchmark.
Ready-to-use
TAIL integrates an out-of-the-box evaluation module that enables users to easily evaluate commercial LLMs via API calls and open-source LLMs via vLLM on the generated benchmarks.