Page 2 | input free download

Showing 506 open source projects for "input"

1

gensim

Topic Modelling for Humans

Gensim is a Python library for topic modeling, document indexing, and similarity retrieval with large corpora. The target audience is the natural language processing (NLP) and information retrieval (IR) community.
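To make "similarity retrieval" concrete, here is a minimal standard-library sketch of ranking documents against a query by bag-of-words cosine similarity. It does not use Gensim's API; the function names (`bow`, `cosine`, `most_similar`) and the tiny corpus are illustrative assumptions, and a real library would add weighting schemes and scalable indexing on top of this idea.

```python
import math
from collections import Counter


def bow(text):
    """Turn a text into a bag-of-words vector (token -> count)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, corpus):
    """Return the index of the corpus document closest to the query."""
    q = bow(query)
    scores = [(cosine(q, bow(doc)), i) for i, doc in enumerate(corpus)]
    return max(scores)[1]


# Hypothetical three-document corpus, for illustration only.
corpus = [
    "human computer interaction",
    "graph of trees",
    "computer system response time",
]
best = most_similar("computer interaction", corpus)
print(corpus[best])
```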

Downloads: 1 This Week

Last Update: 2025-10-16
See Project
2

MoneyPrinterTurbo

Generate short videos with one click using AI LLM

MoneyPrinterTurbo is an AI-driven tool that enables users to generate high-definition short videos with minimal input. By providing a topic or keyword, the system automatically creates video scripts, sources relevant media assets, adds subtitles, and incorporates background music, resulting in a polished video ready for distribution.

Downloads: 22 This Week

Last Update: 2025-05-10
See Project
3

Kivy

Innovative user interfaces made easy

Kivy is an open source, cross-platform UI framework that lets you develop applications that make use of innovative, multi-touch user interfaces. Written in Python with a graphics engine built over OpenGL ES 2, Kivy supports various input devices and protocols, and gives you access to over 20 widgets that are all highly extensible and have built-in multi-touch support. You can run the same codebase on Mac, Windows, Linux, Android and iOS. Kivy is 100% free and open source with a professionally developed and used toolkit, as well as a stable framework and well-documented API, so you can be confident in using it in a commercial product.

Downloads: 55 This Week

Last Update: 2024-12-26
See Project
4

Pixelization

Stable-diffusion-webui-pixelization

This is a specialized extension for the popular Stable Diffusion Web UI (AUTOMATIC1111) that focuses on converting or “pixelizing” images into a pixel-art aesthetic. It's designed as a plugin you install into the Web UI so that in the “Extras” or “Pixelization” tab you can drag in an input image and produce a stylized, block-based version with control over cell size, color depth, and segmentation. The extension uses pre-trained models and optionally can co-operate with the Web UI’s other features (image-to-image, prompt-based generation) so you can combine pixelization with generative workflows. For digital art, game assets, or retro aesthetic workflows, this offers a fast path from photo or high-res asset to stylized tiles or sprites. ...

Downloads: 1 This Week

Last Update: 2025-10-21
See Project
5

Qwen-Audio

Chat & pretrained large audio language model proposed by Alibaba Cloud

Qwen-Audio is a large audio-language model developed by Alibaba Cloud. It accepts various types of audio input (speech, natural sounds, music, singing) along with text input, and outputs text. An instruction-tuned version, Qwen-Audio-Chat, supports multi-round conversational interaction, combined audio and text input, creative tasks, and reasoning over audio. The model is trained on more than 30 different audio tasks and achieves strong performance across multiple benchmarks without task-specific fine-tuning. ...

Downloads: 0 This Week

Last Update: 2025-09-23
See Project
6

Omnara

Talk to Your AI Agents from Anywhere

Omnara is an open-source agent control platform that empowers developers to turn autonomous AI tools (e.g., Claude Code, Cursor, GitHub Copilot) into collaborative teammates by offering real-time dashboards, push notifications, and remote guidance across terminals, web, and mobile. Omnara transforms your AI agents (Claude Code, Codex CLI, n8n, and more) from silent workers into communicative teammates. Get real-time visibility into what your agents are doing, and respond to their questions...

Downloads: 4 This Week

Last Update: 2025-11-09
See Project
7

Stable Virtual Camera

Stable Virtual Camera: Generative View Synthesis with Diffusion Models

Stable Virtual Camera is a multi-view diffusion model developed by Stability AI that transforms 2D images into immersive 3D videos with realistic depth and perspective. Unlike traditional methods that require complex reconstruction or scene-specific optimization, this model allows users to generate novel views from any number of input images and define custom camera trajectories, enabling dynamic exploration of scenes. It supports various aspect ratios and can produce 3D-consistent videos up to 1,000 frames, making it a versatile tool for creators seeking to enhance visual storytelling.

Downloads: 4 This Week

Last Update: 2025-03-20
See Project
8

jsondiff

Diff JSON and JSON-like structures in Python

Diff JSON and JSON-like structures in Python.
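As an illustration of what diffing JSON-like structures involves, the following is a minimal recursive sketch in plain Python. It is not jsondiff's API or output format; the `diff` function, the `MISSING` sentinel, and the sample data are assumptions made for this example. Only nested dicts are descended into; lists and scalars are compared as whole values.

```python
# Sentinel marking "key absent on this side" (hypothetical, not part of jsondiff).
MISSING = object()


def diff(old, new, path=()):
    """Return a list of (path, old_value, new_value) tuples for every difference."""
    if isinstance(old, dict) and isinstance(new, dict):
        changes = []
        # Visit the union of keys so additions and deletions are both reported.
        for key in sorted(old.keys() | new.keys(), key=str):
            changes += diff(
                old.get(key, MISSING),
                new.get(key, MISSING),
                path + (key,),
            )
        return changes
    if old == new:
        return []
    return [(path, old, new)]


before = {"a": 1, "b": {"c": 2}}
after = {"a": 1, "b": {"c": 3}, "d": 4}
changes = diff(before, after)
for where, was, now in changes:
    print(where, "added" if was is MISSING else "changed")
```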

Downloads: 4 This Week

Last Update: 2024-08-29
See Project
9

MetaGPT

The Multi-Agent Framework

The Multi-Agent Framework: given a one-line requirement, it returns a PRD, design, tasks, and repo. It assigns different roles to GPTs to form a collaborative software entity for complex tasks. MetaGPT takes a one-line requirement as input and outputs user stories, competitive analysis, requirements, data structures, APIs, documents, etc. Internally, MetaGPT includes product managers, architects, project managers, and engineers. It provides the entire process of a software company along with carefully orchestrated SOPs.

Downloads: 5 This Week

Last Update: 2025-03-02
See Project
10

WhisperLive

A nearly-live implementation of OpenAI's Whisper

...The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and network streams such as RTSP and HLS, making it flexible for live events, monitoring, or accessibility workflows. Configuration options let you control the number of clients, maximum connection time, and threading behavior so the server can be tuned for different deployment environments. On the client side, you can set the language, whether to translate into English, model size, voice activity detection, and output recording behavior.

Downloads: 6 This Week

Last Update: 2025-11-28
See Project
11

PDFStitcher

Code repository for PDFStitcher, a utility to stitch together PDFs

...Since version 0.4, it is also possible to select layers for inclusion/exclusion in the final output. Additionally, line properties can be modified for each layer if the input PDF is compatible.

Downloads: 34 This Week

Last Update: 2025-06-26
See Project
12

MeloTTS

High-quality multi-lingual text-to-speech library by MyShell.ai

MeloTTS is an open-source text-to-speech (TTS) system that generates natural-sounding speech from text input. It utilizes advanced machine-learning models to produce high-quality audio outputs.

Downloads: 5 This Week

Last Update: 2025-01-06
See Project
13

Barfi

A Python visual Flow Based Programming library

...A schema is built using barfi.Blocks. Then the schema is executed with barfi.ComputeEngine. Each barfi.Block has some properties that enable the FBP and schema building. Firstly, each Block has Input and Output interfaces that link to other Blocks. Each Block can carry a user-specified executable function. This function can get data from the Input interface, perform computations, and set the Output interface. In general, Barfi is an abstraction of Graphical Programming, Flow-Based Programming, or Node programming. ...
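The block-and-interface idea described above can be sketched in plain Python. This is not Barfi's API: the `Block` class, the `execute` function, and the link tuples below are hypothetical names chosen for illustration, and blocks are assumed to be listed in a valid execution order.

```python
class Block:
    """A node with named input/output slots and an optional compute function."""

    def __init__(self, name, inputs=(), outputs=(), compute=None):
        self.name = name
        self.inputs = {key: None for key in inputs}
        self.outputs = {key: None for key in outputs}
        self.compute = compute

    def run(self):
        # The user-supplied function reads self.inputs and writes self.outputs.
        if self.compute is not None:
            self.compute(self)


def execute(blocks, links):
    """Run blocks in the given order, copying linked outputs into inputs first.

    links is a list of (source_block, output_name, target_block, input_name).
    """
    for block in blocks:
        for src, out_name, dst, in_name in links:
            if dst is block:
                dst.inputs[in_name] = src.outputs[out_name]
        block.run()


def emit_three(block):
    block.outputs["value"] = 3


def double(block):
    block.outputs["result"] = block.inputs["x"] * 2


source = Block("source", outputs=["value"], compute=emit_three)
doubler = Block("doubler", inputs=["x"], outputs=["result"], compute=double)

execute([source, doubler], [(source, "value", doubler, "x")])
print(doubler.outputs["result"])
```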

Downloads: 2 This Week

Last Update: 2025-01-06
See Project
14

Fun Audio Chat

Large Audio Language Model built for natural interactions

...It combines speech recognition, audio processing, and AI generation so users can speak simply and receive spoken replies, enabling applications such as virtual assistants, voice bots, and hands-free chat interfaces. The system supports dynamic audio input and output, meaning it can handle different voices, tones, and conversational contexts without forcing users into typed interactions. With real-time streaming, it minimizes latency and delivers responses quickly, making it suitable for applications where responsiveness matters, such as interactive demos, accessibility tools, and conversational games.

Downloads: 3 This Week

Last Update: 2026-02-03
See Project
15

Unredact

A simple tool for reading in poorly redacted documents

...Unlike traditional optical character recognition (OCR), which only reads visible text, Unredact focuses on inferring missing content where redaction has been applied by analyzing surrounding context, font characteristics, and linguistic patterns to produce candidate reconstructions. It accepts a variety of input formats, automatically identifies redacted regions, and then generates text suggestions that are presented alongside visual overlays so users can choose or refine outputs.

Downloads: 34 This Week

Last Update: 2026-02-03
See Project
16

SSRFmap

Automatic SSRF fuzzer and exploitation tool

SSRFmap is a specialized security tool designed to automate the detection and exploitation of Server Side Request Forgery (SSRF) vulnerabilities. It takes as input a Burp request file and a user-specified parameter to fuzz, enabling you to fast-track the identification of SSRF attack surfaces. It includes multiple exploitation “modules” for common SSRF-based attacks or pivoting techniques, such as DNS zone transfers, MySQL/Postgres command execution, Docker API info leaks, and network scans. ...

Downloads: 2 This Week

Last Update: 2025-11-04
See Project
17

RenderCV

LaTeX CV generator from a YAML/JSON input file

RenderCV is a LaTeX CV/resume framework. It allows you to create a high-quality CV as a PDF from a YAML file with full Markdown syntax support and complete control over the LaTeX code. RenderCV offers built-in LaTeX and Markdown templates ready to produce high-quality CVs. However, the templates are entirely arbitrary and can easily be updated to leverage RenderCV's capabilities with your custom CV themes.

Downloads: 18 This Week

Last Update: 2025-12-23
See Project
18

Weights and Biases

Tool for visualizing and tracking your machine learning experiments

...Capture dataset versions with W&B Artifacts to identify how changing data affects your resulting models. Reproduce any model, with saved code, hyperparameters, launch commands, input data, and resulting model weights. Set wandb.config once at the beginning of your script to save your hyperparameters, input settings (like dataset name or model type), and any other independent variables for your experiments. This is useful for analyzing your experiments and reproducing your work in the future. Setting configs also allows you to visualize the relationships between features of your model architecture or data pipeline and model performance.

Downloads: 3 This Week

Last Update: 7 days ago
See Project
19

Step-Video-T2V

State-of-the-art (SoTA) text-to-video pre-trained model

Step-Video-T2V is a state-of-the-art text-to-video foundation model developed to generate videos from natural-language prompts; its 30B-parameter architecture is designed to produce coherent, temporally extended video sequences — up to around 204 frames — based on input text. Under the hood it uses a compressed latent representation (a Video-VAE) to reduce spatial and temporal redundancy, and a denoising diffusion (or similar) process over that latent space to generate smooth, plausible motion and visuals. The model handles bilingual input (e.g. English and Chinese) thanks to dual encoders, and supports end-to-end text-to-video generation without requiring external assets. ...

Downloads: 4 This Week

Last Update: 2025-12-02
See Project
20

FastVLM

This repository contains the official implementation of FastVLM

FastVLM is an efficiency-focused vision-language modeling stack that introduces FastViTHD, a hybrid vision encoder engineered to emit fewer visual tokens and slash encoding time, especially for high-resolution images. Instead of elaborate pruning stages, the design trades off resolution and token count through input scaling, simplifying the pipeline while maintaining strong accuracy. Reported results highlight dramatic speedups in time-to-first-token and competitive quality versus contemporary open VLMs, including comparisons across small and larger variants. The repository documents model variants, showcases head-to-head numbers against known baselines, and explains how the encoder integrates with common LLM backbones. ...

Downloads: 1 This Week

Last Update: 2025-10-08
See Project
21

Wfuzz

Web application fuzzer

...Wfuzz is based on a simple concept: it replaces any reference to the FUZZ keyword with the value of a given payload. A payload in Wfuzz is a source of data. This simple concept allows any input to be injected into any field of an HTTP request, making it possible to perform complex web security attacks against different web application components such as parameters, authentication, forms, directories/files, headers, etc.
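The keyword-substitution concept can be shown with a few lines of plain Python. This sketch does not use Wfuzz or send any requests; the `expand_fuzz` function, the URL template, and the payload list are assumptions for illustration. It only shows how one template yields one concrete string per payload.

```python
def expand_fuzz(template, payloads, keyword="FUZZ"):
    """Yield the template once per payload, with every keyword occurrence replaced."""
    for payload in payloads:
        yield template.replace(keyword, payload)


# Hypothetical template and payloads; a real run would read payloads from a wordlist.
template = "http://example.test/FUZZ/index.php?id=FUZZ"
candidates = list(expand_fuzz(template, ["admin", "backup"]))

for url in candidates:
    print(url)
```

The same substitution applies to any request field held as text (a header value, a form body), which is why a single keyword is enough to target different parts of a request.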

Downloads: 18 This Week

Last Update: 2026-01-21
See Project
22

DeepSeek-V3.2-Exp

An experimental version of DeepSeek model

DeepSeek-V3.2-Exp is an experimental release of the DeepSeek model family, intended as a stepping stone toward the next generation architecture. The key innovation in this version is DeepSeek Sparse Attention (DSA), a sparse attention mechanism that aims to optimize training and inference efficiency in long-context settings without degrading output quality. According to the authors, they aligned the training setup of V3.2-Exp with V3.1-Terminus so that benchmark results remain largely...

Downloads: 30 This Week

Last Update: 2025-11-18
See Project
23

CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B

...Built on Meta-Llama-3-8B-Instruct, CogVLM2 significantly improves over its predecessor by providing stronger performance across multimodal benchmarks such as TextVQA, DocVQA, and ChartQA, while introducing extended context length support of up to 8K tokens and high-resolution image input up to 1344×1344. The series includes models for both image understanding and video understanding, with CogVLM2-Video supporting up to 1-minute videos by analyzing keyframes. It supports bilingual interaction (Chinese and English) and has open-source versions optimized for dialogue and video comprehension. Notably, the Int4 quantized version allows efficient inference on GPUs with only 16GB of memory. ...

Downloads: 1 This Week

Last Update: 2026-02-12
See Project
24

CogAgent

An open-source end-to-end VLM-based GUI Agent

CogAgent is a 9B-parameter bilingual vision-language GUI agent model based on GLM-4V-9B, trained with staged data curation, optimization, and strategy upgrades to improve perception, action prediction, and generalization across tasks. It focuses on operating real user interfaces from screenshots plus text, and follows a strict input–output format that returns structured actions, grounded operations, and optional sensitivity annotations. The model is designed for agent-style execution rather than freeform chat, maintaining a continuous execution history across steps while requiring a fresh session for each new task. Inference supports BF16 on NVIDIA GPUs, with optional INT8 and INT4 modes available but with noted performance loss at INT4; example CLIs and a web demo illustrate bounding-box outputs and operation categories.

Downloads: 1 This Week

Last Update: 2026-02-12
See Project
25

Qwen3 Embedding

Designed for text embedding and ranking tasks

Qwen3-Embedding is a model series from the Qwen family designed specifically for text embedding and ranking tasks. It builds upon the Qwen3 base/dense models and offers several sizes (0.6B, 4B, 8B parameters), for both embedding and reranking, with high multilingual capability, long-context understanding, and reasoning. It achieves state-of-the-art performance on benchmarks like MTEB (Multilingual Text Embedding Benchmark) and supports instruction-aware embedding (i.e. embedding task...

Downloads: 1 This Week

Last Update: 2025-09-30
See Project