Join/Login
Business Software
Open Source Software
For Vendors
Blog
About
More

For Vendors Help Create Join Login

Business Software

Open Source Software

SourceForge Podcast

Resources

Articles
Case Studies
Blog

Menu

Help
Create
Join
Login

Home
Open Source Software
Search Results

Search Results for "python text parser" - Page 21

x

Sort By:

Relevance

Clear All Filters

OS

Linux 1,517
Windows 1,461
Mac 1,210
More...
BSD 777
ChromeOS 616
Desktop Operating Systems 42
Mobile Operating Systems 35
Server Operating Systems 11
Game Consoles 2
Embedded Operating Systems 1

Category

Artificial Intelligence 667
Software Development 310
Text Editors 306
Internet 133
Multimedia 130
Business 128
Scientific/Engineering 112
System 111
Games 76
Formats and Protocols 72
Communications 70
Education 65
Desktop Environment 52
Security 42
Database 26
Terminals 25
Productivity 20
Printing 16
Social sciences 8
Religion and Philosophy 6
Blockchain 3
Mobile 2

License

OSI-Approved Open Source 1,579
Creative Commons Attribution License 27
Public Domain 25
Other License 14
More...
GNU Free Documentation License 4
Open Source Hardware 1

Translations

Programming Language

Status

Production/Stable 316
Beta 275
Alpha 166
Pre-Alpha 94
More...
Planning 49
Mature 33
Inactive 20

Showing 1756 open source projects for "python text parser"

View related business solutions

Python Clear Filters & Widen Search

Try Google Cloud Risk-Free With $300 in Credit
No hidden charges. No surprise bills. Cancel anytime.

Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.

Start Free
Gemini 3 and 200+ AI Models on One Platform
Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

Build, govern, and optimize agents and models with Gemini Enterprise Agent Platform.

Start Free
1

Conversational Health Agents (CHA)

A Personalized LLM-powered Agent Frameworks

CHA, or Conversational Health Agents, is an open-source framework designed to build intelligent healthcare assistants powered by large language models and external data sources. The system enables developers to create personalized AI agents that can interact with users through natural language while performing multi-step reasoning and task execution. It integrates orchestration capabilities that allow the agent to gather information from APIs, knowledge bases, and external services in order...

Downloads: 0 This Week

Last Update: 2026-03-17
See Project
2

seq2seq-couplet

Play couplet with seq2seq model

seq2seq-couplet is a deep learning application that generates Chinese couplet responses using a sequence-to-sequence model built with TensorFlow. Its purpose is not general machine translation, but a specialized text generation task in which the model produces a matching second line for a given first line in the style of traditional couplets. The repository includes the code needed to train the model, configure file paths and hyperparameters, and evaluate progress through loss and BLEU score...

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
3

FlexLLMGen

Running large language models on a single GPU

FlexLLMGen is an open-source inference engine designed to run large language models efficiently on limited hardware resources such as a single GPU. The system focuses on high-throughput generation workloads where large batches of text must be processed quickly, such as large-scale data extraction or document analysis tasks. Instead of requiring expensive multi-GPU systems, the framework uses techniques such as memory offloading, compression, and optimized batching to run large models on...

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
4

Youtu-GraphRAG

Vertically Unified Agents for Graph Retrieval-Augmented Reasoning

Youtu-GraphRAG is a research framework developed by Tencent for performing complex reasoning using graph-based retrieval-augmented generation. The system combines knowledge graphs, retrieval mechanisms, and agent-based reasoning into a unified architecture designed to handle knowledge-intensive tasks. Instead of relying solely on text retrieval, the framework organizes information into structured graph schemas that represent entities, relationships, and attributes. These structures allow the...

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
Streamline Azure Security with Palo Alto Networks VM-Series
Centrally manage physical and virtualized firewalls with Panorama

Improve your security posture and reduce incident response time. Use the VM-Series to natively analyze Azure traffic and dynamically drive policy updates based on workload changes.

Learn more
5

LongBench

LongBench v2 and LongBench (ACL 25'&24')

LongBench is a comprehensive benchmark designed to evaluate the ability of large language models to understand and reason over very long textual contexts. Traditional language model benchmarks typically evaluate tasks involving relatively short inputs, which does not reflect many real-world applications such as analyzing large documents or entire code repositories. LongBench addresses this gap by providing datasets that require models to process and reason over long sequences of text across...

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
6

TAME LLM

Traditional Mandarin LLMs for Taiwan

TAME LLM is an open-source initiative focused on building and releasing large language models optimized for Traditional Mandarin and the linguistic context of Taiwan. The project includes models such as Llama-3-Taiwan-70B, which are fine-tuned versions of large transformer architectures trained on extensive corpora containing both Traditional Mandarin and English text. These models are designed to support applications such as conversational AI, knowledge retrieval, and domain-specific...

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
7

cloud_enum

Multi-cloud OSINT tool for discovering public cloud resources

cloud_enum is an open source reconnaissance and OSINT tool designed to discover publicly accessible cloud resources across major cloud providers. It focuses on enumerating assets in Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform using keyword-based discovery techniques. It works by taking user-provided keywords and generating variations through mutation wordlists, then testing these combinations against common cloud service naming patterns. cloud_enum performs both...

Downloads: 0 This Week

Last Update: 6 days ago
See Project
8

LLM Colosseum

Benchmark LLMs by fighting in Street Fighter 3

LLM-Colosseum is an experimental benchmarking framework designed to evaluate the capabilities of large language models through gameplay interactions rather than traditional text-based benchmarks. The system places language models inside the environment of the classic video game Street Fighter III, where they must interpret the game state and decide which actions to perform during combat. This setup creates a dynamic environment that tests reasoning, situational awareness, and decision-making...

Downloads: 0 This Week

Last Update: 2026-03-07
See Project
9

Lagent

A lightweight framework for building LLM-based agents

Lagent is a lightweight open-source framework designed to help developers build autonomous agents powered by large language models. The framework provides tools and abstractions that allow language models to interact with external tools, execute tasks, and perform multi-step reasoning processes. Instead of using LLMs only for text generation, Lagent enables developers to transform models into agents capable of performing actions such as retrieving data, executing code, or interacting with...

Downloads: 0 This Week

Last Update: 2026-05-13
See Project
Our Free Plans just got better! | Auth0
With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.

Try free now
10

Skywork-R1V4

Skywork-R1V is an advanced multimodal AI model series

Skywork-R1V is an open-source multimodal reasoning model designed to extend the capabilities of large language models into vision-language tasks that require complex logical reasoning. The project introduces a model architecture that transfers the reasoning abilities of advanced text-based models into visual domains so the system can interpret images and perform multi-step reasoning about them. Instead of retraining both language and vision models from scratch, the framework uses a...

Downloads: 0 This Week

Last Update: 2026-03-05
See Project
11

AgentBench

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

AgentBench is an open-source benchmark designed to evaluate the capabilities of large language models when used as autonomous agents. Unlike traditional language model benchmarks that focus on static text tasks, AgentBench measures how models perform in interactive environments that require planning, reasoning, and decision-making. The benchmark includes multiple environments that simulate realistic scenarios such as web interaction, database querying, and problem solving tasks. These...

Downloads: 0 This Week

Last Update: 2026-03-05
See Project
12

Sparrow

Structured data extraction and instruction calling with ML, LLM

Sparrow is an open-source platform designed to extract structured information from documents, images, and other unstructured data sources using machine learning and large language models. The system focuses on transforming complex documents such as invoices, receipts, forms, and scanned pages into structured formats like JSON that can be processed by downstream applications. It combines several components, including OCR pipelines, vision-language models, and LLM-based reasoning modules to...

Downloads: 0 This Week

Last Update: 16 hours ago
See Project
13

Dash Data Agent

Self-learning data agent that grounds its answers in layers of content

Dash is a self-learning data agent built by the Agno AI community that generates grounded answers to English queries over structured data by synthesizing SQL and reasoning based on six layers of context, improving automatically with each run. It sidesteps common limitations of simple text-to-SQL agents by incorporating multiple context layers — including schema structure, human annotations, known query patterns, institutional knowledge from docs, machine-discovered error patterns, and live...

Downloads: 0 This Week

Last Update: 2026-04-08
See Project
14

ticket

Fast, powerful, git-native ticket tracking in a single bash script

ticket is a lightweight, git-native ticket management tool implemented as a single Bash script that brings powerful issue tracking directly into your Git workflows without requiring a database or complex setup. It stores each ticket as a Markdown file with YAML frontmatter, making them human-readable and easy to version control alongside your code, while also allowing IDEs to jump straight to ticket definitions. The CLI provides common subcommands to create, list, edit, close, and manage...

Downloads: 0 This Week

Last Update: 2026-02-03
See Project
15

Chinese-LLaMA-Alpaca-3

Chinese Llama-3 LLMs) developed from Meta Llama 3

Chinese-LLaMA-Alpaca-3 is an open-source project that provides Mandarin-focused large language models based on Meta’s LLaMA-3 architecture, with both foundational and instruction-tuned variants to support high-quality Chinese natural language understanding and generation. It extends the original LLaMA models with expanded Chinese vocabularies and additional pretraining on Chinese corpora to improve semantic encoding and decoding specifically for Chinese text. Alongside the base models, the...

Downloads: 0 This Week

Last Update: 2026-01-15
See Project
16

nanochat

The best ChatGPT that $100 can buy

nanochat is a from-scratch, end-to-end “mini ChatGPT” that shows the entire path from raw text to a chatty web app in one small, dependency-lean codebase. The repository stitches together every stage of the lifecycle: tokenizer training, pretraining a Transformer on a large web corpus, mid-training on dialogue and multiple-choice tasks, supervised fine-tuning, optional reinforcement learning for alignment, and finally efficient inference with caching. Its north star is approachability and...

Downloads: 0 This Week

Last Update: 2026-05-05
See Project
17

Mozc Devices

Circuit diagrams and firmware source code for Gboard DIY keyboards

mozc-devices is an open source collection of circuit diagrams, firmware, and technical documentation for a series of experimental and often humorous Gboard and Google Japanese Input hardware keyboards, many of which were originally released as April Fools’ projects by Google Japan. Each subproject in the repository corresponds to a unique input device prototype, including versions such as the Drum Set, Morse Code, Patapata, Magic Hand, Piropiro, Physical Flick, Puchi Puchi, Nazoru, Mageru,...

Downloads: 0 This Week

Last Update: 2 days ago
See Project
18

ArXiv MCP Server

A Model Context Protocol server for searching and analyzing arXiv

arxiv-mcp-server bridges AI assistants and the arXiv repository through a clean MCP interface, enabling search, metadata retrieval, and content access without bespoke scraping. With simple tools like “search” and “fetch,” an agent can find papers, pull abstracts, and download PDFs for downstream summarization or analysis. The project includes packaging and CI to publish to PyPI, plus tests and linting for reliability. Issue threads show feature requests such as extracting embedded LaTeX and...

Downloads: 0 This Week

Last Update: 2026-04-26
See Project
19

ML Ferret

Refer and Ground Anything Anywhere at Any Granularity

Ferret is Apple’s end-to-end multimodal large language model designed specifically for flexible referring and grounding: it can understand references of any granularity (boxes, points, free-form regions) and then ground open-vocabulary descriptions back onto the image. The core idea is a hybrid region representation that mixes discrete coordinates with continuous visual features, so the model can fluidly handle “any-form” referring while maintaining precise spatial localization. The repo...

Downloads: 0 This Week

Last Update: 2025-10-08
See Project
20

fairseq2

FAIR Sequence Modeling Toolkit 2

fairseq2 is a modern, modular sequence modeling framework developed by Meta AI Research as a complete redesign of the original fairseq library. Built from the ground up for scalability, composability, and research flexibility, fairseq2 supports a broad range of language, speech, and multimodal content generation tasks, including instruction fine-tuning, reinforcement learning from human feedback (RLHF), and large-scale multilingual modeling. Unlike the original fairseq—which evolved into a...

Downloads: 0 This Week

Last Update: 2026-03-26
See Project
21

Multimodal

TorchMultimodal is a PyTorch library

This project, also known as TorchMultimodal, is a PyTorch library for building, training, and experimenting with multimodal, multi-task models at scale. The library provides modular building blocks such as encoders, fusion modules, loss functions, and transformations that support combining modalities (vision, text, audio, etc.) in unified architectures. It includes a collection of ready model classes—like ALBEF, CLIP, BLIP-2, COCA, FLAVA, MDETR, and Omnivore—that serve as reference...

Downloads: 0 This Week

Last Update: 2026-01-12
See Project
22

MetaCLIP

ICLR2024 Spotlight: curation/training code, metadata, distribution

MetaCLIP is a research codebase that extends the CLIP framework into a meta-learning / continual learning regime, aiming to adapt CLIP-style models to new tasks or domains efficiently. The goal is to preserve CLIP’s strong zero-shot transfer capability while enabling fast adaptation to domain shifts or novel class sets with minimal data and without catastrophic forgetting. The repository provides training logic, adaptation strategies (e.g. prompt tuning, adapter modules), and evaluation...

Downloads: 0 This Week

Last Update: 2025-10-07
See Project
23

DreamCraft3D

Official implementation of DreamCraft3D

DreamCraft3D is DeepSeek’s generative 3D modeling framework / model family that likely extends their earlier 3D efforts (e.g. Shap-E or Point-E style models) with more capability, control, or expression. The name suggests a “dream crafting” metaphor—users probably supply textual or image prompts and generate 3D assets (point clouds, meshes, scenes). The repository includes model code, inference scripts, sample prompts, and possibly dataset preparation pipelines. It may integrate rendering or...

Downloads: 0 This Week

Last Update: 2025-10-03
See Project
24

Gretel Synthetics

Synthetic data generators for structured and unstructured text

Unlock unlimited possibilities with synthetic data. Share, create, and augment data with cutting-edge generative AI. Generate unlimited data in minutes with synthetic data delivered as-a-service. Synthesize data that are as good or better than your original dataset, and maintain relationships and statistical insights. Customize privacy settings so that data is always safe while remaining useful for downstream workflows. Ensure data accuracy and privacy confidently with expert-grade reports....

Downloads: 0 This Week

Last Update: 2025-03-17
See Project
25

Video Diffusion - Pytorch

Implementation of Video Diffusion Models

Implementation of Video Diffusion Models, Jonathan Ho's new paper extending DDPMs to Video Generation - in Pytorch. Implementation of Video Diffusion Models, Jonathan Ho's new paper extending DDPMs to Video Generation - in Pytorch. It uses a special space-time factored U-net, extending generation from 2D images to 3D videos. 14k for difficult moving mnist (converging much faster and better than NUWA) - wip. Any new developments for text-to-video synthesis will be centralized at...

Downloads: 0 This Week

Last Update: 2024-05-03
See Project

Previous
17
18
19
20
You're on page 21
22
23
24
25
Next

Related Searches

firmware

magic

Related Categories

Artificial Intelligence

Software Development

Text Editors

Internet

Multimedia

SourceForge

Create a Project
Open Source Software
Business Software
Top Downloaded Projects

Company

About
Team
SourceForge Headquarters
1320 Columbia Street Suite 310
San Diego, CA 92101
+1 (858) 422-6466

Resources

Support
Site Documentation
Site Status
SourceForge Reviews

© 2026 Slashdot Media. All Rights Reserved.

Terms Privacy Opt Out Advertise