Join/Login
Business Software
Open Source Software
For Vendors
Blog
About
More

For Vendors Help Create Join Login

Business Software

Open Source Software

SourceForge Podcast

Resources

Articles
Case Studies
Blog

Menu

Help
Create
Join
Login

Home
Open Source Software
Artificial Intelligence Software
Search Results

Search Results for "input-output model" - Page 2

x

Sort By:

Relevance

Clear All Filters

OS

Linux 142
Windows 135
Mac 129
More...
BSD 59
ChromeOS 59
Mobile Operating Systems 3

Category

Artificial Intelligence 151
Software Development 9
Multimedia 4
Scientific/Engineering 3
Business 2
Education 1
System 1

License

OSI-Approved Open Source 138
Creative Commons Attribution License 2
GNU Free Documentation License 1

Translations

English 5

Programming Language

Python 151
Unix Shell 4
TypeScript 3
C++ 2
Prolog 1
More...
Rust 1

Status

Production/Stable 3
Beta 1

Showing 151 open source projects for "input-output model"

View related business solutions

Artificial Intelligence Python Clear Filters & Widen Search

$300 in Free Credit Across 150+ Cloud Services
VMs, containers, AI, databases, storage | build anything. No commitment to start.

Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale with Google Cloud.

Start Building Free
Fully Managed MySQL, PostgreSQL, and SQL Server
Automatic backups, patching, replication, and failover. Focus on your app, not your database.

Cloud SQL handles your database ops end to end. Migrate from on-prem or other clouds with free migration tools.

Try Free
1

Upsonic

The most reliable AI agent framework that supports MCP

Upsonic is a reliability-focused AI agent framework designed for real-world applications. It enables the development of trusted agent workflows within organizations by incorporating advanced reliability features, such as verification layers and output evaluation systems. The framework supports the Model Context Protocol (MCP), facilitating integration with various tools and enhancing agent capabilities.

Downloads: 4 This Week

Last Update: 2026-02-23
See Project
2

Qwen3-TTS

Qwen3-TTS is an open-source series of TTS models

Qwen3-TTS is an open-source text-to-speech (TTS) project built around the Qwen3 large language model family, focused on generating high-quality, natural-sounding speech from plain text input. It provides researchers and developers with tools to transform text into expressive, intelligible audio, supporting multiple languages and voice characteristics tuned for clarity and fluidity. The project includes pre-trained models and inference scripts that let users synthesize speech locally or integrate TTS into larger pipelines such as voice assistants, accessibility tools, or multimedia generation workflows. ...

Downloads: 28 This Week

Last Update: 2026-02-06
See Project
3

LiteLLM

lightweight package to simplify LLM API calls

Call all LLM APIs using the OpenAI format [Anthropic, Huggingface, Cohere, Azure OpenAI etc.] liteLLM supports streaming the model response back, pass stream=True to get a streaming iterator in response. Streaming is supported for OpenAI, Azure, Anthropic, and Huggingface models.

Downloads: 17 This Week

Last Update: 2 days ago
See Project
4

Stable Virtual Camera

Stable Virtual Camera: Generative View Synthesis with Diffusion Models

Stable Virtual Camera is a multi-view diffusion model developed by Stability AI that transforms 2D images into immersive 3D videos with realistic depth and perspective. Unlike traditional methods that require complex reconstruction or scene-specific optimization, this model allows users to generate novel views from any number of input images and define custom camera trajectories, enabling dynamic exploration of scenes.

Downloads: 1 This Week

Last Update: 2025-03-20
See Project
AI-generated apps that pass security review
Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.

Try Retool free
5

Mesh R-CNN

code for Mesh R-CNN, ICCV 2019

...The system combines 2D detection from Mask R-CNN with 3D reasoning modules that output full mesh reconstructions aligned with the input image. It has been evaluated on datasets such as Pix3D, where it demonstrates state-of-the-art performance in reconstructing real-world object geometry.

Downloads: 3 This Week

Last Update: 2 days ago
See Project
6

Oasis

Inference script for Oasis 500M

Open-Oasis provides inference code and released weights for Oasis 500M, an interactive world model that generates gameplay frames conditioned on user keyboard input. Instead of rendering a pre-built game world, the system produces the next visual state via a diffusion-transformer approach, effectively “imagining” the world response to your actions in real time. The project focuses on enabling action-conditional frame generation so developers can experiment with interactive, model-generated environments rather than static video generation alone. ...

Downloads: 1 This Week

Last Update: 2026-01-06
See Project
7

Step-Video-T2V

State-of-the-art (SoTA) text-to-video pre-trained model

...The model handles bilingual input (e.g. English and Chinese) thanks to dual encoders, and supports end-to-end text-to-video generation without requiring external assets. Its training and generation pipeline includes techniques like flow-matching, full 3D attention for temporal consistency, and fine-tuning approaches (e.g. video-based DPO) to improve fidelity and reduce artifacts.

Downloads: 2 This Week

Last Update: 2025-12-02
See Project
8

AIMET

AIMET is a library that provides advanced quantization and compression

Qualcomm Innovation Center (QuIC) is at the forefront of enabling low-power inference at the edge through its pioneering model-efficiency research. QuIC has a mission to help migrate the ecosystem toward fixed-point inference. With this goal, QuIC presents the AI Model Efficiency Toolkit (AIMET) - a library that provides advanced quantization and compression techniques for trained neural network models. AIMET enables neural networks to run more efficiently on fixed-point AI hardware...

Downloads: 2 This Week

Last Update: 5 days ago
See Project
9

Transformer Debugger

Tool for exploring and debugging transformer model behaviors

...It combines automated interpretability methods with sparse autoencoders, enabling researchers to analyze how specific neurons, attention heads, and latent features contribute to a model’s outputs. TDB allows users to intervene directly in the forward pass of a model and observe how such interventions change predictions, making it possible to answer questions like why a token was selected or why an attention head focused on a certain input. It automatically identifies and explains the most influential components, highlights activation patterns, and maps relationships across circuits within the model. ...

Downloads: 3 This Week

Last Update: 3 days ago
See Project
Go From Idea to Deployed AI App Fast
One platform to build, fine-tune, and deploy. No MLOps team required.

Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.

Try Free
10

Qwen3 Embedding

Designed for text embedding and ranking tasks

Qwen3-Embedding is a model series from the Qwen family designed specifically for text embedding and ranking tasks. It builds upon the Qwen3 base/dense models and offers several sizes (0.6B, 4B, 8B parameters), for both embedding and reranking, with high multilingual capability, long‐context understanding, and reasoning. It achieves state-of-the-art performance on benchmarks like MTEB (Multilingual Text Embedding Benchmark) and supports instruction-aware embedding (i.e. embedding task...

Downloads: 1 This Week

Last Update: 2025-09-30
See Project
11

HunyuanWorld-Voyager

RGBD video generation model conditioned on camera input

HunyuanWorld-Voyager is a next-generation video diffusion framework developed by Tencent-Hunyuan for generating world-consistent 3D scene videos from a single input image. By leveraging user-defined camera paths, it enables immersive scene exploration and supports controllable video synthesis with high realism. The system jointly produces aligned RGB and depth video sequences, making it directly applicable to 3D reconstruction tasks. At its core, Voyager integrates a world-consistent video diffusion model with an efficient long-range world exploration engine powered by auto-regressive inference. ...

Downloads: 17 This Week

Last Update: 2025-12-17
See Project
12

SimpleLLM

950 line, minimal, extensible LLM inference engine built from scratch

...Designed to run efficiently on high-end GPUs like NVIDIA H100 with support for models such as OpenAI/gpt-oss-120b, Simple-LLM implements continuous batching and event-driven inference loops to maximize hardware utilization and throughput. Its straightforward code structure allows anyone experimenting with custom kernels, new batching strategies, or inference optimizations to trace execution from input to output with minimal cognitive overhead.

Downloads: 0 This Week

Last Update: 2026-01-28
See Project
13

DeepSeek VL2

Mixture-of-Experts Vision-Language Models for Advanced Multimodal

DeepSeek-VL2 is DeepSeek’s vision + language multimodal model—essentially the next-gen successor to their first vision-language models. It combines image and text inputs into a unified embedding / reasoning space so that you can query with text and image jointly (e.g. “What’s going on in this scene?” or “Generate a caption appropriate to context”). The model supports both image understanding (vision tasks) and multimodal reasoning, and is likely used as a component in agent systems to...

Downloads: 5 This Week

Last Update: 2025-10-03
See Project
14

GELab-Zero

GUI Exploration Lab. One of the best GUI agent solutions

GELab-Zero is an open-source “GUI Agent” framework aiming to automate interactions with graphical user interfaces (GUIs), combining both the agent model and all supporting infrastructure — including inference, input orchestration, and GUI automation logic — in a plug-and-play package that runs locally, without cloud dependencies. The idea is to let developers or users harness an AI agent that can simulate clicking, typing, reading UI elements, and interacting with apps in a human-like way via the GUI, which can enable tasks like automated testing, scriptable workflows, or even autonomous usage of GUI-based applications. ...

Downloads: 1 This Week

Last Update: 2026-01-23
See Project
15

HunyuanVideo-Foley

Multimodal Diffusion with Representation Alignment

HunyuanVideo-Foley is a multimodal diffusion model from Tencent Hunyuan for high-fidelity Foley (sound effects) audio generation synchronized to video scenes. It is designed to generate audio that matches both visual content and textual semantic cues, for use in video production, film, advertising, games, etc. The model architecture aligns audio, video, and text representations to produce realistic synchronized soundtracks.

Downloads: 1 This Week

Last Update: 2025-09-28
See Project
16

nanocode

Minimal Claude Code alternative. Single Python file, zero dependencies

nanocode is a minimalist coding agent implementation designed as a compact alternative to Claude Code, packaged in a single Python file with no external dependencies and totaling around 250 lines of code. It implements a full agentic loop where the model can reason, decide when to use tools, execute those tools, and iterate until producing a final answer, making it useful for simple AI-assisted coding workflows. It includes a set of integrated tools such as read, write, edit, glob, grep, and bash that let the agent interact with the file system and shell commands directly from the terminal, and it keeps a conversation history with colored terminal output for readability. ...

Downloads: 0 This Week

Last Update: 2026-01-28
See Project
17

AWS MCP Servers

Helping you get the most out of AWS, wherever you use MCP

AWS MCP Servers are a collection of remotely hosted, fully-managed Model Context Protocol (MCP) servers by AWS, providing AI applications with real-time access to AWS documentation, API references, best practices, and infrastructure-management capabilities via natural-language workflows. An MCP Server is a lightweight program that exposes specific capabilities through the standardized Model Context Protocol. Host applications (such as chatbots, IDEs, and other AI tools) have MCP clients that...

Downloads: 8 This Week

Last Update: 6 days ago
See Project
18

LuxTTS

A high-quality rapid TTS voice cloning model

LuxTTS is an open-source text-to-speech (TTS) system focused on delivering high-quality, rapid voice synthesis and voice cloning that runs extremely fast and efficiently on consumer hardware. It implements a lightweight architecture based on ZipVoice and optimized sampling techniques so that it can generate speech at speeds up to roughly 150 times real-time on a single GPU and faster than real-time on CPU, all while producing audio at high fidelity with 48 kHz quality. The project supports...

Downloads: 8 This Week

Last Update: 2026-02-14
See Project
19

GLM-4.6V

GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

GLM-4.6V represents the latest generation of the GLM-V family and marks a major step forward in multimodal AI by combining advanced vision-language understanding with native “tool-call” capabilities, long-context reasoning, and strong generalization across domains. Unlike many vision-language models that treat images and text separately or require intermediate conversions, GLM-4.6V allows inputs such as images, screenshots or document pages directly as part of its reasoning pipeline — and...

Downloads: 4 This Week

Last Update: 2026-01-27
See Project
20

HunyuanOCR

OCR expert VLM powered by Hunyuan's native multimodal architecture

HunyuanOCR is an open-source, end-to-end OCR (optical character recognition) Vision-Language Model (VLM) developed by Tencent‑Hunyuan. It’s designed to unify the entire OCR pipeline, detection, recognition, layout parsing, information extraction, translation, and even subtitle or structured output generation, into a single model inference instead of a cascade of separate tools. Despite being fairly lightweight (about 1 billion parameters), it delivers state-of-the-art performance across a wide variety of OCR tasks, outperforming many traditional OCR systems and even other multimodal models on benchmark suites. ...

Downloads: 2 This Week

Last Update: 2026-01-13
See Project
21

gpt-oss

gpt-oss-120b and gpt-oss-20b are two open-weight language models

gpt-oss is OpenAI’s open-weight family of large language models designed for powerful reasoning, agentic workflows, and versatile developer use cases. The series includes two main models: gpt-oss-120b, a 117-billion parameter model optimized for general-purpose, high-reasoning tasks that can run on a single H100 GPU, and gpt-oss-20b, a lighter 21-billion parameter model ideal for low-latency or specialized applications on smaller hardware. Both models use a native MXFP4 quantization for...

1 Review

Downloads: 10 This Week

Last Update: 2026-01-13
See Project
22

MetaVoice-1B

Foundational model for human-like, expressive TTS

MetaVoice — in the form of its source repository “metavoice-src” — is a large-scale text-to-speech (TTS) model. Specifically, the base model (MetaVoice-1B) uses around 1.2 billion parameters and has been trained on a massive dataset — reportedly around 100,000 hours of speech data. The goal is to provide human-like, expressive, and flexible TTS: able to generate natural-sounding speech that can handle diverse inputs and likely generalize over voice styles, intonation, prosody, and perhaps...

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
23

Gemini-API

Reverse-engineered Python API for Google Gemini web app

Gemini-API is a community-created asynchronous Python wrapper for the web interface of Google’s Gemini models (formerly Bard). It is the result of reverse-engineering the Gemini web app and exposing its functionality through a programmatic API. This enables developers to incorporate Gemini into Python applications, scripts, bots, or tools without relying solely on official SDKs. The wrapper supports streaming responses, model selection, and handling of the web-based authentication/session...

Downloads: 6 This Week

Last Update: 2026-02-14
See Project
24

Ring

Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI

Ring is a reasoning Mixture-of-Experts (MoE) large language model (LLM) developed by inclusionAI. It is built from or derived from Ling. Its design emphasizes reasoning, efficiency, and modular expert activation. In its “flash” variant (Ring-flash-2.0), it optimizes inference by activating only a subset of experts. It applies reinforcement learning/reasoning optimization techniques. Its architectures and training approaches are tuned to enable efficient and capable reasoning performance....

Downloads: 0 This Week

Last Update: 2025-09-30
See Project
25

UNO

A Universal Customization Method for Single and Multi Conditioning

UNO is a project by ByteDance introduced in 2025, titled “A Universal Customization Method for Both Single and Multi-Subject Conditioning.” It suggests a framework for image (or more general generative) modeling where the model can be conditioned either on a single subject or multiple subjects — which may correspond to generating or customizing images featuring specific people, styles, or objects, possibly with fine-grained control over subject identity or composition. Because the project is...

Downloads: 0 This Week

Last Update: 2025-12-02
See Project

Previous
1
You're on page 2
3
4
5
6
7
Next

Related Searches

gemini

tts

video ai

ocr

gpt-oss

chat gpt

mbrola voice for windows

zip

remove background

offline artificial intelligence\

Related Categories

Artificial Intelligence

Software Development

Multimedia

Scientific/Engineering

Business

SourceForge

Create a Project
Open Source Software
Business Software
Top Downloaded Projects

Company

About
Team
SourceForge Headquarters
1320 Columbia Street Suite 310
San Diego, CA 92101
+1 (858) 422-6466

Resources

Support
Site Documentation
Site Status
SourceForge Reviews

© 2026 Slashdot Media. All Rights Reserved.

Terms Privacy Opt Out Advertise

Thanks for helping keep SourceForge clean.

X

You seem to have CSS turned off. Please don't fill out this field.

You seem to have CSS turned off. Please don't fill out this field.

Briefly describe the problem (required):

Upload screenshot of ad (required):

Select a file, or drag & drop file here.

✔

✘

Screenshot instructions:

Click URL instructions:
Right-click on the ad, choose "Copy Link", then paste here →
(This may not be possible with some types of ads)

More information about our ad policies

Ad destination/click URL: