PaSa is an open-source “paper search agent” built on large language models (LLMs) that automates academic literature retrieval with human-like decision making. Rather than translating a query into keywords and returning a flat list of matches, PaSa uses a dual-agent architecture (Crawler + Selector) that iteratively searches, reads, analyzes, and filters academic publications, much as a researcher follows citation networks, expands references, and judges relevance from both metadata and content.

Given a complex scholarly question (for example, “Which works focus on non-stationary reinforcement learning with UCB-based value methods?”), PaSa decomposes the task. The Crawler generates search queries, retrieves candidate papers (via search tools and citation expansion), and adds them to a “paper queue.” The Selector then reads each queued paper’s abstract or full text (depending on what’s available) and decides whether it is relevant.
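
The Crawler and Selector loop described above can be illustrated with a toy sketch. This is not PaSa's actual code or API: the in-memory `CORPUS`, the `Paper` class, and the functions `generate_queries`, `search`, `is_relevant`, and `run_agent` are all hypothetical stand-ins, and simple keyword rules replace the LLM calls. It also assumes, as a simplification, that the references of every queued paper are expanded.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Paper:
    paper_id: str
    title: str
    abstract: str
    references: list = field(default_factory=list)


# Hypothetical in-memory corpus standing in for a real search tool.
CORPUS = {
    "p1": Paper("p1", "UCB value methods for non-stationary RL",
                "We study non-stationary reinforcement learning with UCB bonuses.",
                ["p2", "p3"]),
    "p2": Paper("p2", "Sliding-window UCB in changing MDPs",
                "A UCB-based value iteration for non-stationary reinforcement learning.",
                []),
    "p3": Paper("p3", "Image segmentation survey",
                "A survey of convolutional segmentation models.",
                []),
}


def generate_queries(question):
    # Assumption: a real Crawler would ask an LLM; a fixed query is used here.
    return ["non-stationary RL"]


def search(query):
    # Toy search: a paper matches if every query word occurs in its title.
    words = query.lower().split()
    return [pid for pid, p in CORPUS.items()
            if all(w in p.title.lower() for w in words)]


def is_relevant(paper, required_terms=("non-stationary", "ucb")):
    # Toy Selector: keyword check on the abstract instead of an LLM judgment.
    text = paper.abstract.lower()
    return all(term in text for term in required_terms)


def run_agent(question, max_steps=50):
    queue, seen, selected = deque(), set(), []

    def enqueue(pid):
        # Add each known paper to the paper queue at most once.
        if pid in CORPUS and pid not in seen:
            seen.add(pid)
            queue.append(pid)

    # Crawler step 1: generate queries and collect the initial hits.
    for query in generate_queries(question):
        for pid in search(query):
            enqueue(pid)

    steps = 0
    while queue and steps < max_steps:
        steps += 1
        paper = CORPUS[queue.popleft()]
        # Selector: decide whether the queued paper is relevant.
        if is_relevant(paper):
            selected.append(paper.paper_id)
        # Crawler step 2: citation expansion adds unseen references.
        for ref in paper.references:
            enqueue(ref)
    return selected


print(run_agent("Which works focus on non-stationary RL with UCB-based value methods?"))
```

In this toy corpus the title search finds only `p1`; `p2` enters the queue through citation expansion, which is the behavior a flat keyword search would miss.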

Features

  • Dual-agent architecture (Crawler + Selector) that enables iterative search, citation expansion, and content-based selection rather than simple keyword matching
  • Reinforcement-learning-trained workflows (on synthetic and real query datasets) that optimize recall and precision for complex, nuanced academic queries
  • Automatic citation network traversal: starting from the initial hits, the agent can expand references to discover relevant related works beyond the first set of search results
  • End-to-end pipeline (query → search → paper retrieval → reading & evaluation → filtered results) that minimizes manual intervention
  • Public datasets (AutoScholarQuery for training; RealScholarQuery for evaluation), open-source code, and pretrained models, enabling reproducible research and custom fine-tuning
  • Benchmarked recall on real-world academic queries showing strong improvements over standard search engines and naive LLM-based search
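
For readers unfamiliar with the recall and precision metrics mentioned in the feature list, the following is a minimal sketch of how they are commonly computed for a single query. The function name `precision_recall` and the example paper IDs are illustrative assumptions, not part of PaSa or its benchmarks.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, treating inputs as sets of paper IDs."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Precision: fraction of returned papers that are relevant.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    # Recall: fraction of relevant papers that were returned.
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Made-up IDs: 4 papers returned, 3 papers truly relevant, 2 in common.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "b", "e"])
print(f"precision={p:.2f} recall={r:.2f}")
```

Citation expansion mainly targets recall (finding relevant papers the first search missed), while the Selector's filtering mainly protects precision.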

License

Apache License V2.0

Additional Project Details

Programming Language

Python

Related Categories

Python Artificial Intelligence Software

Registered

2025-12-01