Search Results for "text tools" - Page 3

Showing 282 open source projects for "text tools"

1

PaperBanana

Extension of Google Research’s PaperBanana

PaperBanana is an open-source agentic framework designed to automatically generate publication-quality academic diagrams and statistical plots directly from text descriptions. The project focuses on helping researchers, educators, and data scientists transform conceptual descriptions of figures into structured visual outputs suitable for research papers, presentations, and technical reports. Instead of manually designing charts or diagrams using traditional visualization tools, users can describe the desired figure in natural language and allow the system to generate the visual representation automatically. ...

Downloads: 4 This Week

Last Update: 2026-03-09
2

Lagent

A lightweight framework for building LLM-based agents

Lagent is a lightweight open-source framework designed to help developers build autonomous agents powered by large language models. The framework provides tools and abstractions that allow language models to interact with external tools, execute tasks, and perform multi-step reasoning processes. Instead of using LLMs only for text generation, Lagent enables developers to transform models into agents capable of performing actions such as retrieving data, executing code, or interacting with APIs. ...

Downloads: 4 This Week

Last Update: 2026-03-06
3

Qwen3-ASR

Qwen3-ASR is an open-source series of ASR models

...This makes Qwen3-ASR suitable for voice-driven applications like AI assistants, dictation tools, speech analytics pipelines, and accessibility features, where accurate and fluid transcription is critical.

Downloads: 2 This Week

Last Update: 2026-02-09
4

Podcastfy.ai

Transforming Multimodal Content into Captivating Multilingual Audio

Podcastfy is an open-source Python package that transforms multi-modal content (text, images) into engaging, multi-lingual audio conversations using GenAI. Input content includes websites, PDFs, YouTube videos, and images. Unlike UI-based tools focused primarily on note-taking or research synthesis (e.g. NotebookLM), Podcastfy focuses on the programmatic and bespoke generation of engaging, conversational transcripts and audio from a multitude of multi-modal sources, enabling customization and scale.

Downloads: 10 This Week

Last Update: 2024-11-16
5

RAG Anything

RAG-Anything: All-in-One RAG Framework

RAG-Anything is an open-source unified framework that extends the Retrieval-Augmented Generation (RAG) paradigm to fully multimodal document and knowledge retrieval, enabling systems to ingest, parse, represent, and query rich content that includes text, images, tables, formulas, and other structured or visual elements. Traditional RAG systems are typically limited to text and cannot effectively work across heterogeneous document layouts, but RAG-Anything addresses this by modeling...

Downloads: 4 This Week

Last Update: 2026-03-24
6

Hugging Face - Speech To Speech

Open speech-to-speech models and pipelines by Hugging Face

This project from Hugging Face focuses on enabling direct speech-to-speech processing using modern machine learning models. It provides tools and reference implementations that allow audio input to be transformed into audio output without requiring an intermediate text representation. Hugging Face - Speech To Speech builds on recent advances in speech modeling, combining components such as speech recognition, translation, and synthesis into unified pipelines. It is designed to help researchers and developers experiment with multilingual and cross-lingual voice applications. ...

Downloads: 3 This Week

Last Update: 2026-03-18
7

Qwen-Image

Qwen-Image is a powerful image generation foundation model

Qwen-Image is a powerful 20-billion parameter foundation model designed for advanced image generation and precise editing, with a particular strength in complex text rendering across diverse languages, especially Chinese. Built on the MMDiT architecture, it achieves remarkable fidelity in integrating text seamlessly into images while preserving typographic details and layout coherence. The model excels not only in text rendering but also in a wide range of artistic styles, including...

1 Review

Downloads: 5 This Week

Last Update: 2026-02-10
8

PersonaPlex

PersonaPlex code

...PersonaPlex also supports persona and voice control, allowing developers to define the role and speaking style of the agent using text prompts and voice conditioning, making it suitable for applications like customized voice assistants, interactive character agents, or domain-specific conversational tools. Internally, it processes continuous audio streams in a hybrid input format so that speech understanding and generation occur jointly.

Downloads: 3 This Week

Last Update: 2026-03-02
9

DeepSeek VL

Towards Real-World Vision-Language Understanding

...The repository includes model weights (or pointers to them), evaluation metrics on standard vision + language benchmarks, and configuration or architecture files. It also supports inference tools for forwarding image + prompt through the model to produce text output. DeepSeek-VL is a predecessor to their newer VL2 model, and presumably shares core design philosophy but with earlier scaling, fewer enhancements, or capability tradeoffs.

Downloads: 12 This Week

Last Update: 2025-10-03
10

OpenAI-Compatible Edge-TTS API

Free, high-quality text-to-speech API endpoint to replace OpenAI

OpenAI-Compatible Edge-TTS API is a local, OpenAI-compatible text-to-speech API that uses edge-tts—Microsoft Edge’s online TTS service—as the backend. The project emulates the /v1/audio/speech endpoint used by OpenAI, so any client that can talk to the OpenAI TTS API can be redirected to this service with minimal changes. It exposes parameters for input text, voice selection, audio format, and playback speed, mirroring the OpenAI interface while mapping popular OpenAI voice names to equivalent Edge voices. ...

Downloads: 2 This Week

Last Update: 2025-11-28
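
The OpenAI-Compatible Edge-TTS API entry above describes a service that mirrors the /v1/audio/speech endpoint. A minimal sketch of what a client request could look like follows, using only the Python standard library. The field names (`model`, `input`, `voice`, `response_format`, `speed`) follow the OpenAI speech API that the project says it emulates. The base URL, port, API key, and the `build_speech_request` helper are hypothetical placeholders, not values taken from the project.

```python
import json
import urllib.request

# Hypothetical local address; the real host and port depend on the deployment.
BASE_URL = "http://localhost:5050"


def build_speech_request(text, voice="alloy", audio_format="mp3", speed=1.0,
                         base_url=BASE_URL, api_key="placeholder-key"):
    """Build (but do not send) a POST request for the /v1/audio/speech endpoint."""
    payload = {
        "model": "tts-1",  # OpenAI model name; a compatible server may ignore it
        "input": text,  # the text to synthesize
        "voice": voice,  # OpenAI voice name, mapped server-side to an Edge voice
        "response_format": audio_format,
        "speed": speed,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_speech_request("Hello from a local text-to-speech service.")
print(request.get_method(), request.full_url)
```

The request is only constructed here, so the sketch runs without a server. Against a running instance, passing it to `urllib.request.urlopen` and writing the response bytes to a file would be the natural next step.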
11

LTX-2.3

Official Python inference and LoRA trainer package

LTX-2.3 is an open-source multimodal artificial intelligence foundation model developed by Lightricks for generating synchronized video and audio from prompts or other inputs. Unlike most earlier video generation systems that only produced silent clips, LTX-2 combines video and audio generation in a unified architecture capable of producing coherent audiovisual scenes. The model uses a diffusion-transformer-based architecture designed to generate high-fidelity visual frames while...

Downloads: 138 This Week

Last Update: 2026-03-30
12

Qwen3

Qwen3 is the large language model series developed by Qwen team

...The latest updated version, Qwen3-235B-A22B-Instruct-2507, features significant improvements in instruction-following, reasoning, knowledge coverage, and long-context understanding up to 256K tokens. It delivers higher quality and more helpful text generation across multiple languages and domains, including mathematics, coding, science, and tool usage. Quantized versions are available, along with tools and pipelines for inference in quantized formats (e.g. GGUF). The series covers many languages in training and usage and is aligned with human preferences in open-ended tasks.

1 Review

Downloads: 31 This Week

Last Update: 2026-01-09
13

AUTOMATIC1111 Stable Diffusion web UI

Stable Diffusion web UI

AUTOMATIC1111's stable-diffusion-webui is a powerful, user-friendly web interface built on the Gradio library that allows users to easily interact with Stable Diffusion models for AI-powered image generation. Supporting both text-to-image (txt2img) and image-to-image (img2img) generation, this open-source UI offers a rich feature set including inpainting, outpainting, attention control, and multiple advanced upscaling options. With a flexible installation process across Windows, Linux, and...

1 Review

Downloads: 313 This Week

Last Update: 2025-06-02
14

OmniVoice

High-Quality Voice Cloning TTS for 600+ Languages

The OmniVoice project is a cutting-edge multilingual text-to-speech system designed to generate high-quality speech across more than 600 languages. Built on a diffusion language model-style architecture, it combines scalability with strong performance, enabling both natural-sounding voice synthesis and efficient inference speeds. One of its most notable capabilities is zero-shot voice cloning, allowing users to replicate a speaker’s voice using only a short reference audio clip. In addition,...

Downloads: 1 This Week

Last Update: 2 days ago
15

NLP

Open source NLP guide with models, methods, and real use cases

NLP is an open source introductory resource for natural language processing, presented as a continuously updated book hosted on GitHub. It explains how machines process and understand human language, combining theory with practical examples. It covers core NLP concepts such as text representation, feature extraction, and model evaluation, alongside hands-on implementations using tools like Word2Vec, TF-IDF, and FastText. It also introduces topic modeling with LDA, keyword extraction techniques, and document similarity methods. NLP extends into real-world applications, including sentiment analysis and text classification, helping readers connect concepts to use cases. ...

Downloads: 9 This Week

Last Update: 1 day ago
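
TF-IDF, one of the text-representation techniques the NLP entry above lists, can be illustrated with a small standard-library sketch. This is a common textbook variant (term frequency times `log(N / df)`), not code from the book itself. The `tf_idf` helper and the sample sentences are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    # Naive whitespace tokenization; real pipelines usually do more cleanup.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = ["the cat sat", "the dog sat", "the bird flew"]
scores = tf_idf(docs)
```

In this toy corpus, "the" appears in every document and receives a weight of zero, while "cat" appears in only one document and outweighs "sat", which appears in two.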
16

PyPDF

A pure-python PDF library capable of splitting, merging, cropping

pypdf is a pure Python library for working with PDF files, allowing developers to split, merge, rotate, encrypt, and extract content from PDFs. It’s an actively maintained fork of PyPDF2, improving performance, compatibility, and support for modern PDF standards. Suitable for both automation scripts and full-featured applications, pypdf handles PDFs without requiring external dependencies.

Downloads: 5 This Week

Last Update: 2 days ago
17

Agent Framework

Framework for building, orchestrating, and deploying AI agents

...It also includes components such as agent sessions for managing state, context providers for maintaining memory, and middleware for intercepting and extending agent behavior. Developers can integrate external tools and services so that agents can execute actions beyond text generation.

Downloads: 4 This Week

Last Update: 2 days ago
18

ACE-Step 1.5

The most powerful local music generation model

ACE-Step 1.5 is an advanced open-source foundation model for AI-driven music generation that pushes beyond traditional limitations in speed, musical coherence, and controllability by innovating in architecture and training design. It integrates cutting-edge generative techniques—such as diffusion-based synthesis combined with compressed autoencoders and lightweight transformer elements—to produce high-quality full-length music tracks with rapid inference times, capable of generating a...

Downloads: 94 This Week

Last Update: 2026-04-02
19

deepdoctection

A Repo For Document AI

deepdoctection is a document AI framework that applies deep learning techniques to analyze and extract structured data from scanned documents, PDFs, and images. It is a Python library that orchestrates document extraction and document layout analysis tasks using deep learning models. It does not implement models itself but lets you build pipelines using widely recognized libraries for object detection, OCR, and selected NLP tasks, and provides an integrated framework for...

Downloads: 3 This Week

Last Update: 2026-04-09
20

Sygil WebUI

Stable Diffusion web UI

...It also supports jumping between workflows, such as sending an output directly into Image2Image for variations or into an “Image Lab” style area for enhancement and upscaling. Post-processing and enhancement are a major emphasis: the interface can route images through different upscalers and face-enhancement tools, helping users turn raw generations into cleaner, higher-resolution results.

Downloads: 1 This Week

Last Update: 2026-02-03
21

TADA

Open Source Speech Language Model

TADA is an open-source speech-language modeling framework designed to unify spoken audio and text representations within a single generative architecture. The system focuses on aligning speech and text streams using a dual-alignment mechanism that synchronizes the acoustic signal with its textual representation. By modeling both modalities together, the framework allows developers to build systems capable of generating, understanding, and transforming speech and language simultaneously. This...

Downloads: 0 This Week

Last Update: 2026-03-24
22

Unsloth Studio

Unified web UI for training and running open models locally

Unsloth Studio is a web-based interface for running and training AI models locally with a unified and user-friendly experience. It allows users to work with a wide range of models for text, audio, vision, embeddings, and more without relying heavily on cloud infrastructure. Built on top of the Unsloth framework, it focuses on high-performance training with reduced VRAM usage and faster speeds compared to traditional methods. The platform supports fine-tuning, pretraining, and reinforcement...

Downloads: 16 This Week

Last Update: 2026-04-08
23

Argilla

The open-source data curation platform for LLMs

Argilla is a production-ready framework for building and improving datasets for NLP projects. Deploy your own Argilla Server on Spaces with a few clicks. Use embeddings to find the most similar records with the UI. This feature uses vector search combined with traditional search (keyword and filter based). Argilla is free, open-source, and 100% compatible with major NLP libraries (Hugging Face transformers, spaCy, Stanford Stanza, Flair, etc.). In fact, you can use and combine your preferred...

Downloads: 10 This Week

Last Update: 2025-03-10
24

ARIS

Lightweight Markdown-only skills for autonomous ML research

ARIS is an experimental automation framework that leverages AI coding agents to perform continuous research and development tasks autonomously, even without active user supervision. The system is designed to run iterative cycles of research, coding, testing, and refinement, effectively simulating a “sleep mode” where productive work continues in the background. It integrates with AI tools such as Claude Code to generate solutions, analyze results, and improve outputs over time. The project...

Downloads: 1 This Week

Last Update: 5 days ago
25

HyperTools

A Python toolbox for gaining geometric insights

...It is built on top of matplotlib (for plotting), seaborn (for plot styling), and scikit-learn (for data manipulation). Functions for plotting high-dimensional datasets in 2/3D. Static and animated plots. Simple API for customizing plot styles. Set of powerful data manipulation tools including hyperalignment, k-means clustering, normalizing and more. Support for lists of NumPy arrays, pandas DataFrames, text or (mixed) lists. Applying topic models and other text vectorization methods to text data. HyperTools is designed to facilitate dimensionality reduction-based visual explorations of high-dimensional data. ...

Downloads: 1 This Week

Last Update: 2026-01-29