Evaluate your LLM's response with Prometheus and GPT-4
An Efficient Web-enhanced Question Answering System
The open-source post-building layer for agents
Open-source evaluation toolkit of large multi-modality models (LMMs)
On the Structural Pruning of Large Language Models
Uncertainty Quantification for Language Models, a Python package
Leaderboard Comparing LLM Performance at Producing Hallucinations
Code for the paper "Language models can explain neurons in language models"
Beyond the Imitation Game collaborative benchmark for measuring