
Search Results for "open source png text" - Page 3


Showing 1599 open source projects for "open source png text"


Active filter: Python

1

CogVideo

Text and image to video generation: CogVideoX and CogVideo

CogVideo is an open-source family of advanced video generation models that can create videos from text, images, or existing video inputs. Built on large-scale Transformer and diffusion architectures, it enables multimodal generation across text-to-video, image-to-video, and video continuation tasks. The latest CogVideoX models offer higher resolution outputs, longer video durations, and improved controllability through prompt engineering.

Downloads: 25 This Week

Last Update: 2025-10-04
2

ChatTTS

A generative speech model for daily dialogue

ChatTTS is an open-source conversational text-to-speech model optimized for dialogue, developed by 2Noise. Trained on 100,000+ hours of English and Chinese conversation data, it excels at generating expressive prosody—pauses, interjections, laughter—for more natural-sounding speech synthesis in assistant and chatbot applications.

Downloads: 2 This Week

Last Update: 2026-04-10
3

WhisperX

Automatic Speech Recognition with Word-level Timestamps

WhisperX is an advanced speech recognition system built on top of OpenAI’s Whisper model, designed to improve transcription accuracy and timing precision for long-form audio. It addresses key limitations of standard Whisper implementations by introducing voice activity detection and forced alignment techniques to produce word-level timestamps. The system enables batched inference, significantly increasing transcription speed while maintaining high accuracy. It is particularly effective for...

Downloads: 30 This Week

Last Update: 7 days ago
4

HeartMuLa

A Family of Open Sourced Music Foundation Models

HeartMuLa is the open-source library and reference implementation for the HeartMuLa family of music foundation models, designed to support both music generation and music-related understanding tasks in a cohesive stack. At the center is HeartMuLa, a music language model that generates music conditioned on inputs like lyrics and tags, with multilingual support that broadens the range of lyric-driven use cases.

Downloads: 11 This Week

Last Update: 2026-04-10
5

Nano PDF Editor

Edit PDF files with Nano Banana

Nano PDF Editor is a minimalist, portable PDF viewer and toolkit that focuses on simplicity, speed, and ease of integration for applications that need basic PDF rendering without heavy dependencies. It provides core functionality such as page navigation, zooming, text selection, and rendering directly to native graphics surfaces, making it suitable for lightweight PDF viewing scenarios on desktop or embedded platforms. Designed to be easily embedded into larger software projects, Nano-PDF...

Downloads: 17 This Week

Last Update: 2026-02-05
6

Jupytext

Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts

Have you always wished Jupyter notebooks were plain text documents? Wished you could edit them in your favorite IDE? And get clear and meaningful diffs when doing version control? Then, Jupytext may well be the tool you’re looking for. Only the notebook inputs (and optionally, the metadata) are included. Text notebooks are well suited for version control. You can also edit or refactor them in an IDE - the .py notebook above is a regular Python file. Text notebooks with a .py or .md extension...

Downloads: 2 This Week

Last Update: 2026-05-17
7

FLUX.2

Official inference repo for FLUX.2 models

FLUX.2 is a state-of-the-art open-weight image generation and editing model released by Black Forest Labs aimed at bridging the gap between research-grade capabilities and production-ready workflows. The model offers both text-to-image generation and powerful image editing, including editing of multiple reference images, with fidelity, consistency, and realism that push the limits of what open-source generative models have achieved.

Downloads: 33 This Week

Last Update: 2026-03-12
8

RealtimeSTT

A robust, efficient, low-latency speech-to-text library

RealtimeSTT is a Python-based realtime speech-to-text engine emphasizing low latency, wake-word detection, voice activity detection, and automatic speech segmentation. It provides asynchronous callbacks, nanosecond-precision timestamps, and CLI tools, suitable for building voice assistants, meeting transcribers, or live caption systems.

Downloads: 3 This Week

Last Update: 17 hours ago
9

FastAPI

FastAPI framework, high performance, easy to learn, fast to code

FastAPI is a modern, fast (high-performance) web framework for building APIs with Python 3.6+ based on standard Python type hints. Great editor support. Completion everywhere. Less time debugging. Designed to be easy to use and learn. Less time reading docs. Minimize code duplication. Multiple features from each parameter declaration. Fewer bugs. Get production-ready code. With automatic interactive documentation. Based on (and fully compatible with) the open standards for APIs: OpenAPI...

Downloads: 41 This Week

Last Update: 2026-05-23
10

Unredact

A simple tool for reading in poorly redacted documents

Unredact is a specialized tool that attempts to reconstruct redacted or obscured text in images, PDFs, or screenshots using a combination of image processing and generative AI inference to suggest plausible completions of blurred, black-boxed, or jumbled content. Unlike traditional optical character recognition (OCR), which only reads visible text, Unredact focuses on inferring missing content where redaction has been applied by analyzing surrounding context, font characteristics, and...

Downloads: 14 This Week

Last Update: 2026-02-03
11

doccano

Open source annotation tool for machine learning practitioners

doccano is an open-source text annotation tool for humans. It provides annotation features for text classification, sequence labeling and sequence-to-sequence tasks. So, you can create labeled data for sentiment analysis, named entity recognition, text summarization and so on. Just create a project, upload data and start annotating. You can build a dataset in hours.

Downloads: 0 This Week

Last Update: 2026-01-11
12

gTTS

Python library and CLI tool to interface with Google Translate

gTTS (Google Text-to-Speech) is a Python library and command-line tool that wraps the speech functionality of Google Translate. It lets you send text to the Google Translate TTS endpoint and receive spoken audio back as MP3 data, either written to a file, a file-like object, or standard output. The library is designed to handle long texts, using a speech-specific sentence tokenizer that keeps intonation and punctuation natural while splitting requests into acceptable chunks. It supports...

Downloads: 6 This Week

Last Update: 2025-11-28
13

RealtimeTTS

Converts text to speech in realtime

RealtimeTTS is a low-latency text-to-speech library built for real-time applications such as voice chat with LLMs, assistants, and interactive tools. It is designed around a streaming model: you can feed it text incrementally (for example, as an LLM responds) and get audio output almost immediately, which keeps end-to-end latency very low. The library is engine-agnostic and plugs into a wide range of cloud and local TTS systems, including OpenAI, ElevenLabs, Azure, Coqui, Piper, StyleTTS2,...

Downloads: 5 This Week

Last Update: 6 days ago
14

Label Studio

Label Studio is a multi-type data labeling and annotation tool

...Detect objects on images; bounding boxes, polygons, circular regions, and keypoints are supported. Partition an image into multiple segments. Use ML models to pre-label and optimize the process. Label Studio is an open-source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats. It can be used to prepare raw data or improve existing training data to get more accurate ML models. The frontend part of the Label Studio app lies in the frontend/ folder and is written in React JSX. ...

Downloads: 18 This Week

Last Update: 2026-03-13
15

Nexa SDK

Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML

Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), speech-to-text (ASR), and text-to-speech (TTS) capabilities. Additionally, it offers an OpenAI-compatible API server with JSON schema mode for function calling and streaming support, and a user-friendly Streamlit UI. Users can run Nexa SDK on any device with a Python environment, and GPU acceleration is supported, including CUDA, Metal, and...

Downloads: 7 This Week

Last Update: 2026-02-20
16

Label Sleuth

Open source no-code system for text annotation and building of text

An open-source no-code system for text annotation and building text classifiers. No AI knowledge needed. From task definition to working model in just a few hours! While domain experts label their data, Label Sleuth automatically trains appropriate machine learning models in the background. To avoid wasted labeling effort, Label Sleuth employs active learning techniques to guide the user on what they should label next.

Downloads: 0 This Week

Last Update: 2024-06-17
17

MiniMax-MCP

Official MiniMax Model Context Protocol (MCP) server

MiniMax-MCP is the official Model Context Protocol (MCP) server for accessing MiniMax’s multimodal generative APIs from MCP-compatible clients. It acts as a bridge between tools like Claude Desktop, Cursor, Windsurf, OpenAI Agents, and the MiniMax platform, exposing capabilities such as text-to-speech, voice cloning, image generation, text-to-image, video generation, image-to-video, text-to-video, and music generation. The server is written in Python and distributed under the MIT license,...

Downloads: 3 This Week

Last Update: 2026-05-21
18

AutoCut

Cut videos with a text editor

AutoCut is an innovative tool that lets users edit and cut videos using a text-centric workflow instead of a traditional video editor. AutoCut automatically generates subtitles or transcripts for uploaded videos, and users can simply edit the text file to select the segments of the video they want to keep. This approach transforms video editing into a textual editing task, greatly lowering the barrier to editing for users who find traditional video editors complex or unintuitive. AutoCut...

Downloads: 1 This Week

Last Update: 2026-02-06
19

Applio

A simple, high-quality voice conversion tool focused on ease of use

Applio is a high-quality voice conversion toolkit designed to make modern RVC/VITS-based voice cloning accessible to non-experts. It focuses strongly on ease of use: installation scripts for Windows, Linux, and macOS set up dependencies and then launch a browser-based Gradio interface. Within that interface, users can train and run voice conversion models for tasks like singing conversion, speech-to-speech transformation, and voice cloning. The project is structured to be flexible through...

Downloads: 104 This Week

Last Update: 2026-02-18
20

DeepSeek VL2

Mixture-of-Experts Vision-Language Models for Advanced Multimodal

DeepSeek-VL2 is DeepSeek’s vision + language multimodal model—essentially the next-gen successor to their first vision-language models. It combines image and text inputs into a unified embedding / reasoning space so that you can query with text and image jointly (e.g. “What’s going on in this scene?” or “Generate a caption appropriate to context”). The model supports both image understanding (vision tasks) and multimodal reasoning, and is likely used as a component in agent systems to...

Downloads: 9 This Week

Last Update: 2025-10-03
21

pyttsx3

Offline Text To Speech synthesis for python

pyttsx3 is an offline text-to-speech library for Python that wraps native speech engines instead of calling cloud APIs. It is designed to work entirely without an internet connection, making it suitable for local automation, kiosks, accessibility tools, and embedded applications. On Windows it uses SAPI5, on Linux it typically uses eSpeak or eSpeak-NG, and on macOS it can use NSSpeechSynthesizer or AVSpeechSynthesizer, giving it broad cross-platform compatibility. The library exposes a...

Downloads: 12 This Week

Last Update: 2025-11-28
22

Pixelorama

A free & open-source 2D sprite editor, made with the Godot Engine

Pixelorama is a free and open-source pixel art editor, proudly created with the Godot Engine, by Orama Interactive. Whether you want to make animated pixel art, game graphics, tiles, or any other kind of pixel art, Pixelorama has you covered with its variety of tools and features. Free to use for everyone, forever. A variety of different tools to help you draw, with the ability to map a different tool to each of the left and right mouse buttons. Are you an animator? Pixelorama has its own...

Downloads: 38 This Week

Last Update: 2026-04-29
23

Transformers

State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX

Hugging Face Transformers provides APIs and tools to easily download and train state-of-the-art pre-trained models. Using pre-trained models can reduce your compute costs and carbon footprint, and save you the time and resources required to train a model from scratch. These models support common tasks in different modalities. Text, for tasks like text classification, information extraction, question answering, summarization, translation, text generation, in over 100 languages. Images, for tasks...

Downloads: 3 This Week

Last Update: 2026-05-20
24

Memvid

Video-based AI memory library. Store millions of text chunks in MP4

Memvid encodes text chunks as QR codes within MP4 frames to build a portable “video memory” for AI systems. This innovative approach uses standard video containers and offers millisecond-level semantic search across large corpora with dramatically less storage than vector DBs. It's self-contained—no DB needed—and supports features like PDF indexing, chat integration, and cloud dashboards.

Downloads: 8 This Week

Last Update: 4 days ago
25

VideoCaptioner

AI-powered tool for generating, optimizing, and translating subtitles

VideoCaptioner is an open source AI-powered subtitle processing tool designed to simplify the workflow of creating subtitles for videos. It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps.

Downloads: 17 This Week

Last Update: 2026-05-24



© 2026 Slashdot Media. All Rights Reserved.
