59 projects for "paper" with 2 filters applied:

  • 1
    Feynman

    The open source AI research agent

    ...The system is built around a multi-agent architecture that includes roles such as researcher, reviewer, writer, and verifier, each responsible for a specific stage of the research pipeline. It supports advanced workflows like deep research investigations, paper replication, peer review simulation, and autonomous experimentation, enabling users to go beyond simple question answering into full research automation.
    Downloads: 8 This Week
  • 2
    PaSa

    An advanced paper search agent powered by large language models

    ...Given a complex scholarly question (for example, “Which works focus on non-stationary reinforcement learning with UCB-based value methods?”), PaSa decomposes the task: the Crawler generates search queries, retrieves candidate papers (via search tools and citation expansion), then adds them to a “paper queue.” The Selector then reads abstracts or full text (depending on what’s available) and decides which papers are relevant.
    Downloads: 0 This Week
  • 3
    Paperclip

    Open-source orchestration for zero-human companies

    Paperclip is an open-source tool designed to help AI systems and developer tools access academic research papers through a standardized interface. The project implements a server based on the Model Context Protocol (MCP), a framework that allows large language models and AI agents to connect to external data sources and tools in a consistent way. By acting as a middleware layer, Paperclip aggregates multiple academic databases and exposes them through a single interface, allowing AI...
    Downloads: 5 This Week
  • 4
    Writing AI Conference Papers

    Writing AI Conference Papers: A Handbook for Beginners

    ...The project provides structured guidance on how to transform research ideas into complete manuscripts, covering topics such as defining the core contribution, organizing the paper structure, and refining technical details. It emphasizes practical advice and common pitfalls, making it especially useful for students or early-career researchers who may struggle with academic writing conventions. The guide breaks down the process into manageable stages, from identifying novelty and contributions to drafting, revising, and preparing submissions for conferences. ...
    Downloads: 0 This Week
  • 5
    PPTAgent

    PPTAgent: Generating and Evaluating Presentations

    ...The project includes both the generation agent and an evaluation framework, PPTEval, to score content quality, design, and coherence. The repository highlights the EMNLP 2025 paper and provides links to resources for replication and study. The approach reflects human presentation practice—plan, draft, then refine with edits—yielding more coherent decks than direct one-shot generation. Community interest and stars suggest strong uptake for research and tooling around presentation automation.
    Downloads: 1 This Week
  • 6
    CVPR 2025

    Collection of CVPR 2025 papers and open source projects

    ...It organizes entries by topic areas such as detection, segmentation, generative models, 3D vision, multi-modal learning, and efficiency, so you can navigate the year’s output efficiently. Each paper entry typically includes a title, author list, and links to the paper PDF and official or third-party code repositories. The list frequently highlights benchmarks, leaderboards, or notable results so readers can assess impact at a glance. Because conference content evolves rapidly, the repository is updated as authors release code or refine readme instructions, keeping the collection timely. ...
    Downloads: 0 This Week
  • 7
    ML Ferret

    Refer and Ground Anything Anywhere at Any Granularity

    ...The core idea is a hybrid region representation that mixes discrete coordinates with continuous visual features, so the model can fluidly handle “any-form” referring while maintaining precise spatial localization. The repo presents the vision-language pipeline, model assets, and paper resources that show how Ferret answers questions, follows instructions, and returns grounded outputs rather than just text. In practice, this enables tasks like “find that small red icon next to the chart and describe it” where both the linguistic reference and the visual region are ambiguous without fine spatial reasoning.
    Downloads: 0 This Week
  • 8
    JiT

    PyTorch implementation of JiT

    JiT is an open-source PyTorch implementation of a state-of-the-art image diffusion model designed around a minimalist yet powerful architecture for pixel-level generative modeling, based on the paper Back to Basics: Let Denoising Generative Models Denoise. Rather than predicting noise, JiT models directly predict clean image data, which the research suggests aligns better with the manifold structure of natural images and leads to stronger generative performance at high resolution. This implementation supports training on large datasets like ImageNet with configurable model variants, and practical scripts for setup, training, and evaluation on GPUs are included, leveraging PyTorch’s ecosystem for real-world experimentation. ...
    Downloads: 4 This Week
  • 9
    MathModelAgent

    An Agent Designed for Mathematical Modeling

    ...The project uses a multi-agent architecture where different specialized agents handle tasks such as problem interpretation, modeling design, programming implementation, and paper writing. Through integration with multiple large language models, the system can coordinate these components to generate structured modeling solutions and formatted research papers suitable for submission. The platform also includes a code execution environment that allows generated programs to be tested, corrected, and refined during the modeling workflow.
    Downloads: 3 This Week
  • 10
    AudioLM - Pytorch

    Implementation of AudioLM audio generation model in Pytorch

    Implementation of AudioLM, a Language Modeling Approach to Audio Generation out of Google Research, in Pytorch. It also extends the work for conditioning with classifier-free guidance with T5. This allows one to do text-to-audio or TTS, which the paper does not offer. Yes, this means VALL-E can be trained from this repository. It is essentially the same. This repository now also contains an MIT-licensed version of SoundStream. It is also compatible with EnCodec; be aware, however, that EnCodec has a more restrictive non-commercial license if you choose to use it.
    Downloads: 0 This Week
  • 11
    DeepSeek Prover V2

    Advancing Formal Mathematical Reasoning via Reinforcement Learning

    ...The repo releases two model sizes (7B and 671B) and provides evaluation results (e.g. pass rates on MiniF2F, results on ProverBench) as well as prompt and usage examples for proof generation in Lean 4. It also includes a PDF of the paper or project overview and sample formalization datasets. Because theorem proving is a cutting-edge area in LLM research, Prover-V2 is positioned as an effort to push formal reasoning in LLMs forward.
    Downloads: 3 This Week
  • 12
    RAG-Survey

    Collecting awesome papers of RAG for AIGC

    ...Retrieval-augmented generation combines large language models with external knowledge retrieval systems to improve factual accuracy and contextual understanding. The repository functions as a curated catalog of research papers categorized according to a taxonomy proposed in a related survey paper on RAG methods. It organizes literature into multiple areas including foundational RAG models, architectural improvements, and application-specific implementations. Because the field is evolving rapidly, the repository is continuously updated with newly published research and emerging techniques. The resource is intended to help researchers and practitioners quickly explore the RAG ecosystem and understand the relationships between different approaches.
    Downloads: 0 This Week
  • 13
    NExT-GPT

    Code and models for ICML 2024 paper, NExT-GPT

    NExT-GPT is an open-source research framework that implements an advanced multimodal large language model capable of understanding and generating content across multiple modalities. Unlike traditional models that primarily handle text, NExT-GPT supports input and output combinations involving text, images, video, and audio in a unified architecture. The system connects a large language model with multimodal encoders and diffusion-based decoders so it can interpret information from different...
    Downloads: 1 This Week
  • 14
    Paper2Slides

    From Paper to Presentation in One Click

    Paper2Slides is an automation tool that converts research papers, reports, and other documents into polished slide decks and posters with minimal manual effort. It is designed to replace the repetitive work of turning dense technical documents into presentation-friendly structure by extracting key points, figures, and data into a coherent visual narrative. The system supports multiple input formats, so you can process PDFs and common office documents rather than being locked to a single file...
    Downloads: 1 This Week
  • 15
    AI Deadlines

    AI conference deadline countdowns

    ...The project maintains a curated dataset of conferences that includes metadata such as submission deadlines, abstract deadlines, event dates, conference locations, and related information. Researchers and students use the platform to plan their paper submissions and manage academic schedules without manually tracking multiple conference announcements. The repository includes configuration files and data sources that allow contributors to add or update conferences through pull requests, enabling community-driven maintenance.
    Downloads: 0 This Week
  • 16
    CVPR 2026

    Collection of CVPR 2026 Papers and Open Source Projects

    ...The repository acts as a continuously updated catalog of cutting-edge research across a wide range of topics including computer vision, multimodal AI, generative models, diffusion systems, autonomous driving, medical imaging, and remote sensing. Each entry typically links to the research paper as well as the public code repository associated with the work, allowing researchers and developers to quickly access reproducible implementations. The project serves as a centralized index that makes it easier for practitioners to explore the latest advances presented at major computer vision conferences. In addition to the current CVPR cycle, the repository also references related lists covering earlier conferences such as ECCV and ICCV, creating a broader archive of vision research.
    Downloads: 0 This Week
  • 17
    StreamSpeech

    StreamSpeech is a seamless model for offline speech recognition

    StreamSpeech is an “all-in-one” speech model designed to perform offline and simultaneous speech recognition, speech translation, and speech synthesis within a single unified architecture. Developed as part of an ACL 2024 paper, it targets streaming and low-latency scenarios where intermediate results and final translations or synthetic speech must be produced continuously as audio is being received. The model supports eight tasks: offline ASR, speech-to-text translation, speech-to-speech translation, and TTS, as well as their streaming or simultaneous counterparts, all handled by the same underlying system. ...
    Downloads: 0 This Week
  • 18
    ArXiv MCP Server

    A Model Context Protocol server for searching and analyzing arXiv

    arxiv-mcp-server bridges AI assistants and the arXiv repository through a clean MCP interface, enabling search, metadata retrieval, and content access without bespoke scraping. With simple tools like “search” and “fetch,” an agent can find papers, pull abstracts, and download PDFs for downstream summarization or analysis. The project includes packaging and CI to publish to PyPI, plus tests and linting for reliability. Issue threads show feature requests such as extracting embedded LaTeX and...
    Downloads: 0 This Week
  • 19
    VGGT

    [CVPR 2025 Best Paper Award] VGGT

    VGGT is a transformer-based framework aimed at unifying classic visual geometry tasks—such as depth estimation, camera pose recovery, point tracking, and correspondence—under a single model. Rather than training separate networks per task, it shares an encoder and leverages geometric heads/decoders to infer structure and motion from images or short clips. The design emphasizes consistent geometric reasoning: outputs from one head (e.g., correspondences or tracks) reinforce others (e.g., pose...
    Downloads: 0 This Week
  • 20
    Make-A-Video - Pytorch (wip)

    Implementation of Make-A-Video, new SOTA text to video generator

    ...Pseudo-3D convolutions aren't a new concept. They have been explored before in other contexts, say for protein contact prediction as "dimensional hybrid residual networks". The gist of the paper comes down to this: take a SOTA text-to-image model (here they use DALL-E2, but the same learning points would easily apply to Imagen), make a few minor modifications for attention across time and other ways to skimp on the compute cost, do frame interpolation correctly, and get a great video model out. When passing in images (if one were to pretrain on images first), both temporal convolution and attention will be automatically skipped. ...
    Downloads: 0 This Week
  • 21
    code-act

    Official Repo for ICML 2024 paper

    code-act is a research framework for building intelligent language-model agents that interact with their environment through executable code actions. The system proposes a unified action representation where language models produce Python code that can be executed directly, allowing the model to interact with external tools and environments in a structured way. By integrating a Python interpreter with the agent architecture, the system enables the agent to execute code, observe the results,...
    Downloads: 0 This Week
  • 22
    Merchant of Venice
    Venice is a stock market trading programme that supports portfolio management, charting, technical analysis, paper trading and genetic programming. Venice runs in a graphical user interface with online help and has full documentation.
    Downloads: 1 This Week
  • 23
    Automated Interpretability

    Code for the paper "Language models can explain neurons in language models"

    The automated-interpretability repository implements tools and pipelines for automatically generating, simulating, and scoring explanations of neuron (or latent feature) behavior in neural networks. Instead of relying purely on manual, ad hoc interpretability probing, this repo aims to scale interpretability by using algorithmic methods that produce candidate explanations and assess their quality. It includes a “neuron explainer” component that, given a target neuron or latent feature,...
    Downloads: 0 This Week
  • 24
    GPT-2

    Code for the paper Language Models are Unsupervised Multitask Learners

    This repository contains the code and model weights for GPT-2, a large-scale unsupervised language model described in the OpenAI paper “Language Models are Unsupervised Multitask Learners.” The intent is to provide a starting point for researchers and engineers to experiment with GPT-2: generate text, fine‐tune on custom datasets, explore model behavior, or study its internal phenomena. The repository includes scripts for sampling, training, downloading pre-trained models, and utilities for tokenization and model handling. ...
    Downloads: 10 This Week
  • 25
    MGIE

    Guiding Instruction-based Image Editing via Multimodal Large Language

    MGIE—Guiding Instruction-based Image Editing—demonstrates how a multimodal LLM can parse natural-language editing instructions and then drive image transformations accordingly. The project focuses on making edits explainable and controllable: the model interprets text guidance, reasons over image content, and outputs edits aligned with user intent. It’s positioned as an ICLR 2024 Spotlight work, with code and references that show how to connect language planning to concrete image operations....
    Downloads: 0 This Week