Join/Login
Business Software
Open Source Software
For Vendors
Blog
About
More

For Vendors Help Create Join Login

Business Software

Open Source Software

SourceForge Podcast

Resources

Articles
Case Studies
Blog

Menu

Help
Create
Join
Login

Home
Open Source Software
Search Results

Search Results for "python text parser" - Page 4

x

Sort By:

Relevance

Clear All Filters

OS

Linux 1,640
Windows 1,393
Mac 1,270
More...
BSD 748
ChromeOS 546
Desktop Operating Systems 48
Mobile Operating Systems 26
Server Operating Systems 13
Game Consoles 3
Embedded Operating Systems 1

Category

Artificial Intelligence 472
Software Development 401
Text Editors 333
Internet 143
Business 124
Scientific/Engineering 112
System 111
Formats and Protocols 100
Multimedia 100
Communications 71
Games 64
Education 59
Desktop Environment 51
Database 29
Security 29
Terminals 22
Printing 17
Productivity 10
Social sciences 6
Religion and Philosophy 3
Blockchain 2
Mobile 2

License

OSI-Approved Open Source 1,436
Creative Commons Attribution License 25
Public Domain 24
Other License 12
More...
GNU Free Documentation License 4

Translations

Programming Language

Status

Production/Stable 321
Beta 282
Alpha 152
Pre-Alpha 88
More...
Planning 47
Mature 40
Inactive 18

Showing 1640 open source projects for "python text parser"

View related business solutions

Linux Clear Filters & Widen Search

Our Free Plans just got better! | Auth0
With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.

Try free now
Gen AI apps are built with MongoDB Atlas
Build gen AI apps with an all-in-one modern database: MongoDB Atlas

MongoDB Atlas provides built-in vector search and a flexible document model so developers can build, scale, and run gen AI apps without stitching together multiple databases. From LLM integration to semantic search, Atlas simplifies your AI architecture—and it’s free to get started.

Start Free
1

MTEB

MTEB: Massive Text Embedding Benchmark

Text embeddings are commonly evaluated on a small set of datasets from a single task not covering their possible applications to other tasks. It is unclear whether state-of-the-art embeddings on semantic textual similarity (STS) can be equally well applied to other tasks like clustering or reranking. This makes progress in the field difficult to track, as various models are constantly being proposed without proper evaluation. To solve this problem, we introduce the Massive Text Embedding...

Downloads: 4 This Week

Last Update: 4 days ago
See Project
2

Mozc

Mozc - a Japanese Input Method Editor designed for multi-platform

Mozc is an open source Japanese Input Method Editor (IME) developed by Google, designed to provide Japanese text input across multiple operating systems including Android, macOS, Windows, GNU/Linux, and Chromium OS. The project originated as a subset of Google Japanese Input, released publicly under the BSD 3-Clause license for community use and development. Mozc offers core IME functionality such as text conversion, prediction, and dictionary-based input, enabling users to efficiently type...

Downloads: 6 This Week

Last Update: 1 day ago
See Project
3

Powerline

Statusline plugin for vim with prompts for several other applications

Powerline is a statusline plugin for vim, and provides statuslines and prompts for several other applications, including zsh, bash, tmux, IPython, Awesome, i3 and Qtile. Powerline was completely rewritten in Python to get rid of as much vimscript as possible. This has allowed much better extensibility, leaner and better config files, and a structured, object-oriented codebase with no mandatory third-party dependencies other than a Python interpreter. Using Python has allowed unit testing of...

Downloads: 0 This Week

Last Update: 2024-08-29
See Project
4

Concordia

Crowdsourcing platform for full text transcription and tagging

Concordia is a platform for crowdsourcing transcription and tagging of text in digitized images. It was developed by the Library of Congress so that volunteers of all backgrounds could transcribe and tag digitized images of manuscripts and typed materials from the Library’s collections that could not otherwise be done by optical character recognition.

Downloads: 1 This Week

Last Update: 2025-09-09
See Project
Cloud-based help desk software with ServoDesk
Full access to Enterprise features. No credit card required.

What if You Could Automate 90% of Your Repetitive Tasks in Under 30 Days? At ServoDesk, we help businesses like yours automate operations with AI, allowing you to cut service times in half and increase productivity by 25% - without hiring more staff.

Try ServoDesk for free
5

AudioCraft

Audiocraft is a library for audio processing and generation

AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. ...

Downloads: 0 This Week

Last Update: 2025-10-13
See Project
6

Lingua-Py

The most accurate natural language detection library for Python

Its task is simple: It tells you which language some text is written in. This is very useful as a preprocessing step for linguistic data in natural language processing applications such as text classification and spell checking. Other use cases, for instance, might include routing e-mails to the right geographically located customer service department, based on the e-mails' languages. Language detection is often done as part of large machine learning frameworks or natural language processing...

Downloads: 1 This Week

Last Update: 2025-05-27
See Project
7

HunyuanWorld 1.0

Generating Immersive, Explorable, and Interactive 3D Worlds

HunyuanWorld-1.0 is an open-source, simulation-capable 3D world generation model developed by Tencent Hunyuan that creates immersive, explorable, and interactive 3D environments from text or image inputs. It combines the strengths of video-based diversity and 3D-based geometric consistency through a novel framework using panoramic world proxies and semantically layered 3D mesh representations. This approach enables 360° immersive experiences, seamless mesh export for graphics pipelines, and...

Downloads: 17 This Week

Last Update: 2025-10-22
See Project
8

kokoro-onnx

TTS with kokoro and onnx runtime

kokoro-onnx is a text-to-speech toolkit that wraps the Kokoro neural TTS model in an easy-to-use ONNX Runtime interface, so you can generate speech from Python with minimal setup. It focuses on running efficiently on commodity hardware, including macOS with Apple Silicon, while still delivering near real-time performance for many use cases. The project ships prebuilt model files and a simple example script, so you can go from installation to producing an audio.wav file in just a few steps. ...

Downloads: 3 This Week

Last Update: 4 days ago
See Project
9

Hazm

Persian NLP Toolkit

Hazm is a natural language processing (NLP) library for Persian text, offering various tools for text preprocessing, tokenization, part-of-speech tagging, and more.

Downloads: 0 This Week

Last Update: 2025-01-24
See Project
Keep company data safe with Chrome Enterprise
Protect your business with AI policies and data loss prevention in the browser

Make AI work your way with Chrome Enterprise. Block unapproved sites and set custom data controls that align with your company's policies.

Download Chrome
10
$1258baefa1ba45684ba74f4fc70f7d7dbd4af9b2aa997254aa88696b73b49100?&w=120$

Texify

Math OCR model that outputs LaTeX and markdown

Texify is an OCR model that converts images or pdfs containing math into markdown and LaTeX that can be rendered by MathJax ($$ and $ are delimiters). It can run on CPU, GPU, or MPS.

Downloads: 5 This Week

Last Update: 2024-10-31
See Project
11

Pixoo

A library to help you make the most out of your Pixoo 64

Pixoo is a Python-based library for controlling Divoom Pixoo LED displays using Bluetooth Low Energy (BLE). It allows users to send images, animations, or text to Pixoo devices, enabling creative integrations like desktop widgets, real-time data displays, or custom artwork.

Downloads: 2 This Week

Last Update: 2025-05-28
See Project
12

Free VHDL Parser with Java, Python and T

IEEE VHDL-93 LRM supported parser implemented in Java, APIs Python/Tcl

This parser has been developed for those who wants to develop his/her own tool around VHDL RTL. Only synthesizable subset of VHDL is supported and it may not work for machine/tool generated VHDL files. This parser has been developed in Java in order to make it platform independent. It reads RTL and populates its internal object model. There are APIs to extract the design information from the database, APIs to elaborate the design along with expression evaluation capabilities. This tool has...

Downloads: 0 This Week

Last Update: 2025-06-25
See Project
13

EmotiVoice

Multi-Voice and Prompt-Controlled TTS Engine

...EmotiVoice provides multiple ways to interact with it, including a web interface, a Docker image, an HTTP API (including an OpenAI-compatible TTS API), and Python scripts for batch synthesis. It also supports voice cloning with your own data, backed by recipes for popular datasets like DataBaker and LJSpeech, so you can train or adapt voices to custom personas.

Downloads: 5 This Week

Last Update: 2 days ago
See Project
14

TextDistance

Compute distance between sequences

Python library for comparing the distance between two or more sequences by many algorithms. For main algorithms, text distance try to call known external libraries (fastest first) if available (installed in your system) and possible (this implementation can compare this type of sequences). Install text distance with extras for this feature.

Downloads: 0 This Week

Last Update: 2024-07-16
See Project
15

PackageDev

Tools to ease the creation of snippets, syntax definitions, etc.

PackageDev provides syntax highlighting and other helpful utility for Sublime Text resource files. Resource files are ways of configuring the Sublime Text text editor to various extends, including but not limited to: custom syntax definitions, context menus (and the main menu), and key bindings. Thus, this package is ideal for package developers, but even normal users of Sublime Text who want to configure it to their liking should find it very useful.

Downloads: 2 This Week

Last Update: 2025-05-24
See Project
16

Stanza

Stanford NLP Python library for many human languages

Stanza is a collection of accurate and efficient tools for the linguistic analysis of many human languages. Starting from raw text to syntactic analysis and entity recognition, Stanza brings state-of-the-art NLP models to languages of your choosing. Stanza is a Python natural language analysis package. It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts of speech and morphological features, to give a syntactic structure dependency parse, and to recognize named entities. ...

Downloads: 2 This Week

Last Update: 2025-10-05
See Project
17

Parlant

The behavior guidance framework for customer-facing LLM agents

Parlant is a lightweight speech-to-text and text-to-speech framework designed for real-time AI-driven voice applications.

Downloads: 0 This Week

Last Update: 2025-11-18
See Project
18

Phenaki - Pytorch

Implementation of Phenaki Video, which uses Mask GIT

Implementation of Phenaki Video, which uses Mask GIT to produce text-guided videos of up to 2 minutes in length, in Pytorch. It will also combine another technique involving a token critic for potentially even better generations. A new paper suggests that instead of relying on the predicted probabilities of each token as a measure of confidence, one can train an extra critic to decide what to iteratively mask during sampling. This repository will also endeavor to allow the researcher to...

Downloads: 1 This Week

Last Update: 2024-07-29
See Project
19

OpenAI-Compatible Edge-TTS API

Free, high-quality text-to-speech API endpoint to replace OpenAI

OpenAI-Compatible Edge-TTS API is a local, OpenAI-compatible text-to-speech API that uses edge-tts—Microsoft Edge’s online TTS service—as the backend. The project emulates the /v1/audio/speech endpoint used by OpenAI, so any client that can talk to the OpenAI TTS API can be redirected to this service with minimal changes. It exposes parameters for input text, voice selection, audio format, and playback speed, mirroring the OpenAI interface while mapping popular OpenAI voice names to...

Downloads: 3 This Week

Last Update: 4 days ago
See Project
20

Label Sleuth

Open source no-code system for text annotation and building of text

An open-source no-code system for text annotation and building text classifiers. No AI knowledge needed. From task definition to working model in just a few hours! While domain experts label their data, Label Sleuth automatically trains in the background-appropriate machine learning models. To avoid wasted labeling effort, Label Sleuth employs active learning techniques to guide the user in what they should be labeled next. Domain experts can quickly start labeling their data through an...

Downloads: 0 This Week

Last Update: 2024-06-17
See Project
21

PaperQA2

High accuracy RAG for answering questions from scientific documents

PaperQA2 is a package for doing high-accuracy retrieval augmented generation (RAG) on PDFs or text files, with a focus on the scientific literature. See our recent 2024 paper to see examples of PaperQA2's superhuman performance in scientific tasks like question answering, summarization, and contradiction detection. In this example we take a folder of research paper PDFs, magically get their metadata - including citation counts and a retraction check, then parse and cache PDFs into a...

Downloads: 2 This Week

Last Update: 2025-08-26
See Project
22

ChatTTS webUI & API

A simple native web interface that uses ChatTTS to synthesize text

ChatTTS-ui is a local web interface and API wrapper around the ChatTTS speech synthesis system, designed to make advanced TTS models easy to use from a browser. It runs a small backend server (Python + Torch + ffmpeg) and exposes a simple webpage where you can type text, adjust parameters, and generate audio. The project supports Chinese, English, and mixed text with digits and control symbols, making it suitable for bilingual content and numerically heavy text like announcements or prompts. From version 0.96 onward, ffmpeg installation is required for deployment, and previous CSV/PT voice tables are no longer valid, so users instead work with updated “voice value” parameters. ...

Downloads: 1 This Week

Last Update: 4 days ago
See Project
23

StyleTTS 2

Towards Human-Level Text-to-Speech through Style Diffusion

StyleTTS2 is a state-of-the-art text-to-speech system that aims for human-level naturalness by combining style diffusion, adversarial training, and large speech language models. It extends the original StyleTTS idea by introducing a style diffusion model that can sample rich, realistic speaking styles conditioned on reference speech, allowing highly expressive and diverse prosody. The architecture uses a two-stage training process and leverages an auxiliary speech language model to guide...

Downloads: 3 This Week

Last Update: 4 days ago
See Project
24

HunyuanImage-3.0

A Powerful Native Multimodal Model for Image Generation

HunyuanImage-3.0 is a powerful, native multimodal text-to-image generation model released by Tencent’s Hunyuan team. It unifies multimodal understanding and generation in a single autoregressive framework, combining text and image modalities seamlessly rather than relying on separate image-only diffusion components. It uses a Mixture-of-Experts (MoE) architecture with many expert subnetworks to scale efficiently, deploying only a subset of experts per token, which allows large parameter...

1 Review

Downloads: 3 This Week

Last Update: 2025-10-31
See Project
25

Rasa

Open source machine learning framework to automate text conversations

...Rasa uses Poetry for packaging and dependency management. If you want to build it from the source, you have to install Poetry first. By default, Poetry will try to use the currently activated Python version to create the virtual environment for the current project automatically.

Downloads: 19 This Week

Last Update: 2025-01-14
See Project

Previous
1
2
3
You're on page 4
5
6
7
8
Next

Related Searches

arabic text to speech

optical character recognition delphi

morphological analysis

question paper generator

rasa

socks5 server

virtual machine

cluster management software

apache

Related Categories

Artificial Intelligence

Software Development

Text Editors

Internet

Business

SourceForge

Create a Project
Open Source Software
Business Software
Top Downloaded Projects

Company

About
Team
SourceForge Headquarters
1320 Columbia Street Suite 310
San Diego, CA 92101
+1 (858) 422-6466

Resources

Support
Site Documentation
Site Status
SourceForge Reviews

© 2025 Slashdot Media. All Rights Reserved.

Terms Privacy Opt Out Advertise

×

Thanks for helping keep SourceForge clean.

X

You seem to have CSS turned off. Please don't fill out this field.

You seem to have CSS turned off. Please don't fill out this field.

Briefly describe the problem (required):

Upload screenshot of ad (required):

Select a file, or drag & drop file here.

✔

✘

Screenshot instructions:

Click URL instructions:
Right-click on the ad, choose "Copy Link", then paste here →
(This may not be possible with some types of ads)

More information about our ad policies

Ad destination/click URL: