Search Results for "text processing" - Page 6

Showing 346 open source projects for "text processing"

1

Eng2BN CSV Translator

Translate English to Bangla in CSV files, by row range.

Eng2BN CSV Translator is a user-friendly Python tool for translating English text to Bangla within CSV files. The application supports large datasets and lets users translate specific row ranges, which makes it well suited to batch processing.

Downloads: 0 This Week

Last Update: 2025-12-06
2

Free AI Watermark Remover - FreeRepair

AI-powered tool to quickly remove watermarks from images flawlessly

FreeRepair is a free, open-source, AI-powered watermark remover and image enhancer, modeled on IOPaint and built on Python's Django framework, with no sign-up required. It removes watermarks, logos, text, or clutter from images, and it can make blurry images clearer or larger. It needs no installation and no internet connection, works out of the box, and has no usage limits.

1 Review

Downloads: 38 This Week

Last Update: 2026-03-30
3

CSM (Conversational Speech Model)

A Conversational Speech Generation Model

The CSM (Conversational Speech Model) is a speech generation model developed by Sesame AI that creates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder to produce audio codes for realistic speech synthesis. The model has been fine-tuned for interactive voice demos and is hosted on platforms like Hugging Face for testing. CSM offers a flexible setup and is compatible with CUDA-enabled GPUs for efficient execution.

Downloads: 7 This Week

Last Update: 2025-03-19
4

AudioBC

Offline desktop app to convert EPUB to MP3 using Kokoro-82M neural TTS

...Key Features: Neural Quality TTS: Uses the compact yet powerful Kokoro-82M model for high-fidelity, expressive voice synthesis. Privacy-First & Offline: After a one-time initial model download, all processing happens on your CPU. Your books never leave your computer. Multi-Language Support: Curated voices for English (US & UK), Italian, French, Spanish, and Portuguese (BR). Smart Extraction: Automatically filters out non-narrative cont

Downloads: 0 This Week

Last Update: 2026-03-22
5

BWR Ai watermark remover

AI-powered tool to quickly remove watermarks from videos and photos

Blue Wave Remover is an advanced AI-driven video watermark removal software designed to effortlessly eliminate logos, text, timestamps, and watermarks from video content. Utilizing cutting-edge computer vision and generative AI algorithms, it accurately detects and removes both static and moving watermarks while preserving the original video's quality, colors, and clarity. The program supports popular video formats and offers batch processing for fast and efficient removal on multiple files. ...

1 Review

Downloads: 21 This Week

Last Update: 2026-05-10
6

LexiFinder

AI-powered semantic indexing: automating the creation of book indexes

...Given one or more source documents and a set of keywords, it extracts all nouns, compares them semantically to the keywords using a pretrained NLP model, and produces a structured, hierarchical index ready to be included in a book or manuscript. LexiFinder works in two ways: as a command-line tool for scripting, automation, and batch processing, and as a graphical application for a guided, point-and-click experience. Both interfaces share the same underlying engine and support the same features. Supported input formats are PDF, DOCX, and ODT. The index can be exported as plain text, JSON, CSV, or HTML.

Downloads: 1 This Week

Last Update: 2026-03-04
7

Advanced Trigonometry Calculator

Precision Trigonometry: Advanced Calculator for Complex Math

Advanced Trigonometry Calculator is equipped with a user-friendly interface that allows for easy input of problems and instant computation. Professionals such as engineers who need to perform advanced trigonometric calculations in their work will find this tool extremely useful. ATC Online Alpha: https://advantrigoncalc.sourceforge.io/atc/ More info by clicking below: https://advantrigoncalc.sourceforge.io/ Advanced Trigonometry Calculator was only and always only developed by...

2 Reviews

Downloads: 18 This Week

Last Update: 2026-01-15
8

File Sorter for Photographers

Organize files/images from a CSV or XLSX file.

A user-friendly application to efficiently sort all types of files from a source folder into a destination folder based on a list of filenames provided in an Excel or CSV file.

1 Review

Downloads: 0 This Week

Last Update: 2025-12-14
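
The list-driven sorting described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's own code: the function names `read_filenames` and `sort_files` are hypothetical, the sketch assumes a CSV file with one filename per row in the first column, it moves files rather than copying them, and it does not cover Excel input.

```python
import csv
import shutil
from pathlib import Path


def read_filenames(csv_path: Path, column: int = 0) -> set[str]:
    """Collect the non-empty values found in one column of a CSV file."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        return {
            row[column].strip()
            for row in csv.reader(handle)
            if len(row) > column and row[column].strip()
        }


def sort_files(source: Path, destination: Path, wanted: set[str]) -> list[str]:
    """Move every file in `source` whose name is in `wanted` into `destination`."""
    destination.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materializes the directory listing before any file is moved.
    for path in sorted(source.iterdir()):
        if path.is_file() and path.name in wanted:
            shutil.move(str(path), str(destination / path.name))
            moved.append(path.name)
    return moved
```

Names listed in the CSV that have no matching file in the source folder are simply skipped, so the returned list shows what was actually moved.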
9

Obsei

Obsei is a low-code, AI-powered automation tool

Obsei is an automated no-code/low-code AI-powered text observation and analysis framework, designed for extracting insights from unstructured text data such as social media, reviews, and logs.

Downloads: 0 This Week

Last Update: 2025-01-24
10

Wikipedia2Vec

A tool for learning vector representations of words and entities

Wikipedia2Vec is an embedding learning tool that creates word and entity vector representations from Wikipedia, enabling NLP models to leverage structured and contextual knowledge.

Downloads: 0 This Week

Last Update: 2025-01-24
11

Transformers4Rec

Transformers4Rec is a flexible and efficient library

Transformers4Rec is an advanced recommendation system library that leverages Transformer models for sequential and session-based recommendations. The library works as a bridge between natural language processing (NLP) and recommender systems (RecSys) by integrating with one of the most popular NLP frameworks, Hugging Face Transformers (HF). Transformers4Rec makes state-of-the-art transformer architectures available for RecSys researchers and industry practitioners. Traditional recommendation...

Downloads: 0 This Week

Last Update: 2025-01-24
12

YAYI

Repo for YaYi Chinese LLMs based on LlaMA2 & BLOOM

YAYI is an open-source large language model project developed to provide a multilingual conversational AI system capable of performing a wide variety of natural language processing tasks. The model is trained on diverse datasets covering multiple languages and domains so that it can support applications ranging from dialogue systems to text analysis and knowledge retrieval. The architecture is based on transformer-style language models optimized for conversational understanding and generation. In addition to producing coherent responses, the system is designed to handle tasks such as summarization, translation, question answering, and text classification. ...

Downloads: 0 This Week

Last Update: 2026-03-05
13

pdf combiner merger converter splitter

PDF Combiner is a user-friendly, GUI-based tool built in

PDF Combiner is a user-friendly, free, open-source GUI tool for combining, splitting, and annotating PDF files, converting PDF to Excel, PDF to Word, and images to PDF, and zipping and unzipping files. It supports inserting, deleting, and processing multiple files, and it lets you adjust the order of files before combining.

1 Review

Downloads: 2 This Week

Last Update: 2024-05-03
14

text-dedup

All-in-one text de-duplication

text-dedup is a Python library that enables efficient deduplication of large text corpora by using MinHash and other probabilistic techniques to detect near-duplicate content. This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage. It supports Jaccard similarity thresholding, parallel execution, and flexible...

Downloads: 0 This Week

Last Update: 2025-04-08
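
To make the MinHash idea mentioned above concrete, here is a minimal, standard-library sketch of near-duplicate detection with a Jaccard similarity threshold. It does not use text-dedup's API; the function names, the 3-word shingle size, the 128 hash functions, and the 0.7 threshold are all illustrative assumptions.

```python
import hashlib
import re


def shingles(text: str, n: int = 3) -> set[str]:
    """Return the set of lowercase word n-grams (shingles) in `text`."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < n:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def minhash_signature(items: set[str], num_perm: int = 128) -> list[int]:
    """For each of `num_perm` salted hash functions, keep the minimum hash over `items`."""
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "little")  # a distinct salt acts as a distinct hash function
        signature.append(min(
            (int.from_bytes(
                hashlib.blake2b(s.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "little")
             for s in items),
            default=0,  # empty documents get an all-zero signature
        ))
    return signature


def estimated_jaccard(sig_a: list[int], sig_b: list[int]) -> float:
    """The fraction of matching signature positions estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def near_duplicates(docs: list[str], threshold: float = 0.7) -> list[tuple[int, int]]:
    """Return index pairs whose estimated Jaccard similarity reaches `threshold`."""
    sigs = [minhash_signature(shingles(d)) for d in docs]
    return [
        (i, j)
        for i in range(len(docs))
        for j in range(i + 1, len(docs))
        if estimated_jaccard(sigs[i], sigs[j]) >= threshold
    ]
```

The all-pairs comparison here is quadratic in the number of documents, so it only suits small collections. Large-scale deduplication typically adds locality-sensitive hashing over the signatures so that only likely matches are compared.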
15

GPT-2 Output Dataset

Dataset of GPT-2 outputs for research in detection, biases, and more

The GPT-2 Output Dataset is a large collection of model-generated text, released by OpenAI alongside the GPT-2 research paper to study the behaviors and limitations of large language models. It contains 250,000 samples of GPT-2 outputs, generated with different sampling strategies such as top-k truncation, to highlight the diversity and quality of model completions. The dataset also includes corresponding human-written text for comparison, enabling researchers to explore methods for...

Downloads: 0 This Week

Last Update: 5 days ago
16

ddgr

DuckDuckGo from the terminal

...The tool also supports options like opening a selected result in a web browser, piping results into other tools, and restricting searches to specific formats such as text-only or JSON for further processing. Because it avoids third-party tracking and ads built into many browser search experiences, ddgr appeals to users seeking greater control over data and a faster, distraction-free search flow.

Downloads: 0 This Week

Last Update: 2026-01-26
17

towhee

Framework that is dedicated to making neural data processing

...Towhee includes a pythonic method-chaining API for describing custom data processing pipelines. We also support schemas, making processing unstructured data as easy as handling tabular data.

Downloads: 0 This Week

Last Update: 2023-12-05
18

MahaKurawa.My.ID URL Extractor

MahaKurawa.My.ID URL Extractor is a simple tool to extract unique URLs

MahaKurawa.My.ID URL Extractor is a simple tool that instantly extracts unique URLs from any text content. It is useful when you would rather not identify and copy-paste URLs from your content one by one.

Downloads: 0 This Week

Last Update: 2024-05-01
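
Extracting unique URLs from free text, as described above, can be sketched with a regular expression. This is an illustration of the general technique and not the project's implementation: the pattern and the `extract_unique_urls` name are assumptions, only `http` and `https` links are matched, and trailing sentence punctuation is trimmed.

```python
import re

# Assumed pattern: a scheme followed by characters that rarely end a URL in prose.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_unique_urls(text: str) -> list[str]:
    """Return each URL found in `text` once, in order of first appearance."""
    seen = set()
    urls = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;:!?")  # drop punctuation that belongs to the sentence
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Keeping a list alongside the set preserves the order in which URLs first appear, which a plain set would lose.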
19

UniEM

Unified embedding model

UniEM is a unified embedding model designed to create high-quality text embeddings for various natural language processing tasks.

Downloads: 0 This Week

Last Update: 2025-01-30
20

Medusa

Framework for Accelerating LLM Generation with Multiple Decoding Heads

Medusa is a framework aimed at accelerating the generation capabilities of Large Language Models (LLMs) by employing multiple decoding heads. This approach allows for parallel processing during text generation, significantly enhancing throughput and reducing response times. Medusa is designed to be simple to implement and integrates with existing LLM infrastructures, making it a practical solution for scaling LLM applications.

Downloads: 0 This Week

Last Update: 2025-03-19
21

funNLP

Resources, corpora, and tools for Chinese natural language processing

FunNLP is a large, curated collection of resources, corpora, and tools for Chinese natural language processing (NLP). It aggregates datasets, lexicons, wordlists, sentiment dictionaries, knowledge graphs, and pretrained model references, serving as a one-stop resource hub for Chinese NLP practitioners. The repository is organized into categories such as sentiment analysis, text classification, named entity recognition, knowledge graphs, and various lexicons (e.g. sensitive words, emotion dictionaries, stopwords). ...

Downloads: 0 This Week

Last Update: 2025-10-01
22

Promptify

Use GPT or other prompt-based models to get structured output

Promptify is an open-source Python library designed to simplify prompt engineering and the development of natural language processing pipelines using large language models. The project provides tools that help developers generate structured prompts for different NLP tasks and apply them across multiple generative AI systems. Instead of manually crafting prompts for each task, Promptify introduces a unified architecture that combines prompt templates, language model interfaces, and processing pipelines into a single framework. ...

Downloads: 0 This Week

Last Update: 2026-03-15
23

Prime QA

State-of-the-art Multilingual Question Answering research

PrimeQA is a public open source repository that enables researchers and developers to train state-of-the-art models for question answering (QA). By using PrimeQA, a researcher can replicate the experiments outlined in a paper published in the latest NLP conference while also enjoying the capability to download pre-trained models (from an online repository) and run them on their own custom data. PrimeQA is built on top of the Transformers toolkit and uses datasets and models that are directly...

Downloads: 0 This Week

Last Update: 2023-08-21
24

auto-subtitle

Automatically generate and overlay subtitles for any video

auto-subtitle is a Python-based command-line tool that automatically generates and overlays subtitles on video files using AI-driven speech recognition. It combines FFmpeg with OpenAI’s Whisper model to transcribe spoken audio into text and synchronize it with video playback. The tool processes video input, extracts audio, and produces subtitle files that can be either exported separately or burned directly into the final video output. It supports multiple transcription models with varying...

Downloads: 0 This Week

Last Update: 2026-04-24
25

textacy

NLP, before and after spaCy

textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. With the fundamentals (tokenization, part-of-speech tagging, dependency parsing, etc.) delegated to another library, textacy focuses primarily on the tasks that come before and follow after.

Downloads: 0 This Week

Last Update: 2025-01-22