Showing 1834 open source projects for "python text parser"

  • 1
    Text Generation Web UI

    A gradio web UI for running Large Language Models like LLaMA

    ... efficient text streaming. Parameter presets, 8-bit mode. Layers splitting across GPU(s), CPU, and disk. CPU mode, FlexGen, DeepSpeed ZeRO-3, API with streaming and without streaming. LLaMA model, including 4-bit GPTQ. RWKV model, LoRA (loading and training), Softprompts, and extensions.
    Downloads: 92 This Week
  • 2
    py-pdf-parser

    A Python tool to help extract information from structured PDFs

    py-pdf-parser is a Python tool designed to help extract information from structured PDFs. It provides a simple interface to define parsing rules and extract data from PDF documents.
    Downloads: 5 This Week
  • 3
    text-dedup

    All-in-one text de-duplication

    text-dedup is a Python library that enables efficient deduplication of large text corpora by using MinHash and other probabilistic techniques to detect near-duplicate content. This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage. It supports Jaccard similarity thresholding, parallel execution, and flexible...
    Downloads: 1 This Week
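The MinHash and Jaccard thresholding mentioned in the text-dedup description can be illustrated with a minimal standard-library sketch. This is not text-dedup's own API: the function names (`shingles`, `minhash_signature`, `estimate_jaccard`), the 3-word shingle size, and the 128 seeded hash functions are all assumptions chosen for illustration.

```python
import hashlib
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (hypothetical helper)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def minhash_signature(items, num_perm=128):
    """Keep the minimum of each seeded hash over the set.

    Seeded BLAKE2b hashes stand in for random permutations here;
    that choice is an assumption of this sketch.
    """
    if not items:
        raise ValueError("cannot build a signature for an empty set")
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "big")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for s in items
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature slots approximates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity of two sets, for comparison."""
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    doc_a = "the quick brown fox jumps over the lazy dog near the river bank"
    doc_b = "the quick brown fox jumps over the lazy dog near the river bend"
    set_a, set_b = shingles(doc_a), shingles(doc_b)
    estimate = estimate_jaccard(minhash_signature(set_a), minhash_signature(set_b))
    # A pair would be flagged as near-duplicate when the estimate exceeds a chosen threshold.
    print(f"estimated={estimate:.2f} exact={exact_jaccard(set_a, set_b):.2f}")
```

Signatures have a fixed length regardless of document size, so comparing two documents costs the same however long they are. A real deduplication pipeline would typically add an index such as locality-sensitive hashing so that not every pair has to be compared.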
  • 4
    tika-python

    Python binding to the Apache Tika™ REST services

    A Python port of the Apache Tika library that makes Tika available using the Tika REST Server. This makes Apache Tika available as a Python library that is easy to install via Setuptools or pip. To use this library, you need to have Java 7+ installed on your system, as tika-python starts up the Tika REST server in the background. To get this working in a disconnected environment, download a tika server file (both tika-server.jar and tika-server.jar.md5, which can be found here) and set...
    Downloads: 1 This Week
  • 5
    Python Progressbar

    Progressbar 2 - A progress bar for Python 2 and Python 3

    A text progress bar is typically used to display the progress of a long-running operation, providing a visual cue that processing is underway. The progressbar is based on the old Python progressbar package that was published on the now-defunct Google Code. Since that project was completely abandoned by its developer and the developer did not respond to my email, I decided to fork the package. This package is still backward compatible with the original progressbar package so you can safely use...
    Downloads: 0 This Week
  • 6
    MCP Text Editor

    Provides line-oriented text file editing capabilities

    The MCP Text Editor Server provides line-oriented text file editing capabilities through a standardized API, optimized for integration with Large Language Models (LLMs). It enables efficient partial file access, minimizing token usage while ensuring safe concurrent editing.
    Downloads: 0 This Week
  • 7
    Text Generation Inference

    Large Language Model Text Generation Inference

    Text Generation Inference is a high-performance inference server for text generation models, optimized for Hugging Face's Transformers. It is designed to serve large language models efficiently with optimizations for performance and scalability.
    Downloads: 0 This Week
  • 8
    Recognizers-Text

    Recognition and resolution of numbers, units, date/time, etc.

    Recognizers-Text is a multilingual text recognition library that extracts structured information such as dates, numbers, and currency values from unstructured text.
    Downloads: 0 This Week
  • 9
    python-bibtexparser v2

    Bibtex parser for Python 3

    Welcome to python-bibtexparser, a parser for .bib files with a long history and wide adoption. Bibtexparser is available in two versions: v1 and v2. For new projects, we recommend using v2, which, in the long run, will provide an overall more robust and faster experience. For now, however, note that v2 is an early beta and does not contain all features of v1.
    Downloads: 0 This Week
  • 10
    Python Client For NLP Cloud

    NLP Cloud serves high performance pre-trained or custom models for NER

    NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, dialogue summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, blog post generation, source code generation, question answering, automatic speech recognition, machine translation, language detection, semantic search, semantic...
    Downloads: 0 This Week
  • 11
    Bot Framework SDK for Python

    Build and connect intelligent bots that interact naturally

    This repository contains code for the Python version of the Microsoft Bot Framework SDK, which is part of the Microsoft Bot Framework - a comprehensive framework for building enterprise-grade conversational AI experiences. This SDK enables developers to model conversation and build sophisticated bot applications using Python. SDKs for JavaScript and .NET are also available. The Microsoft Bot Framework provides what you need to build and connect intelligent bots that interact naturally wherever...
    Downloads: 2 This Week
  • 12
    Certbot

    Get free HTTPS certificates forever from Let's Encrypt

    Certbot is a fully-featured, easy-to-use, extensible client for the Let's Encrypt CA. It fetches a digital certificate from Let’s Encrypt, an open certificate authority launched by the EFF, Mozilla, and others. This certificate then lets browsers verify the identity of web servers and ensures secure communication over the Web. Obtaining and maintaining a certificate is usually such a hassle, but with Certbot and Let’s Encrypt it becomes automated and hassle-free. With just a few simple...
    Downloads: 230 This Week
  • 13
    Fooocus

    Focus on prompting and generating

    Fooocus is an open-source image generation software that simplifies the process of creating images from text prompts. Built on Gradio and leveraging Stable Diffusion XL, Fooocus eliminates the need for manual parameter tweaking, allowing users to focus solely on crafting prompts. It offers a user-friendly interface with minimal setup, making advanced image synthesis accessible to a broader audience.
    Downloads: 113 This Week
  • 14
    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a...
    Downloads: 85 This Week
  • 15
    Termux application

    Terminal emulator application for Android OS extendible

    Termux is an Android terminal application and Linux environment. At first start, a small base system is downloaded; desired packages can then be installed using the apt package manager known from the Debian and Ubuntu Linux distributions. Access the built-in help by long-pressing anywhere on the terminal and selecting the Help menu option to learn more. Allows the app to view information about network connections such as which networks exist and are connected. Allows the app to create network...
    Downloads: 106 This Week
  • 16
    CadQuery

    A Python parametric CAD scripting framework based on OCCT

    CadQuery is an intuitive, easy-to-use Python library for building parametric 3D CAD models. It has several goals. Build models with scripts that are as close as possible to how you’d describe the object to a human, using a standard, already established programming language. Create parametric models that can be very easily customized by end users. Output high-quality CAD formats like STEP and AMF in addition to traditional STL. Provide a non-proprietary, plain text model format that can...
    Downloads: 82 This Week
  • 17
    Neovim

    Hyperextensible Vim-based text editor

    Neovim is a hyperextensible text editor based on Vim. It seeks to maximize usability and extensibility, simplify maintenance and encourage contributions.
    Downloads: 63 This Week
  • 18
    Video-subtitle-extractor

    A GUI tool for extracting hard-coded subtitles (hardsub) from videos

    Extracts hard-coded subtitles from videos and generates srt files. There is no need to apply for a third-party API, and text recognition can be implemented locally. A deep learning-based video subtitle extraction framework, including subtitle region detection and subtitle content extraction. A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files. Uses local OCR recognition, with no need to set up and call any API or to access online OCR services such as Baidu...
    Downloads: 66 This Week
  • 19
    OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files

    OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.
    Downloads: 66 This Week
  • 20
    JupyterLab

    JupyterLab computational environment

    JupyterLab is the next-generation web-based user interface for Project Jupyter. Try it on Binder. JupyterLab follows the Jupyter Community Guides. JupyterLab enables you to work with documents and activities such as Jupyter notebooks, text editors, terminals, and custom components in a flexible, integrated, and extensible manner. You can arrange multiple documents and activities side by side in the work area using tabs and splitters. Documents and activities integrate with each other, enabling...
    Downloads: 67 This Week
  • 21
    sherpa-onnx

    Speech-to-text, text-to-speech, and speaker recognition

    Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter.
    Downloads: 33 This Week
  • 22
    AUTOMATIC1111 Stable Diffusion web UI
    AUTOMATIC1111's stable-diffusion-webui is a powerful, user-friendly web interface built on the Gradio library that allows users to easily interact with Stable Diffusion models for AI-powered image generation. Supporting both text-to-image (txt2img) and image-to-image (img2img) generation, this open-source UI offers a rich feature set including inpainting, outpainting, attention control, and multiple advanced upscaling options. With a flexible installation process across Windows, Linux...
    Downloads: 45 This Week
  • 23
    Open-Sora

    Open-Sora: Democratizing Efficient Video Production for All

    Open-Sora is an open-source initiative aimed at democratizing high-quality video production. It offers a user-friendly platform that simplifies the complexities of video generation, making advanced video techniques accessible to everyone. The project embraces open-source principles, fostering creativity and innovation in content creation. Open-Sora provides tools, models, and resources to create high-quality videos, aiming to lower the entry barrier for video production and support diverse...
    Downloads: 34 This Week
  • 24
    FastAPI

    FastAPI framework, high performance, easy to learn, fast to code

    FastAPI is a modern, fast (high-performance) web framework for building APIs with Python 3.6+ based on standard Python type hints. Great editor support. Completion everywhere. Less time debugging. Designed to be easy to use and learn. Less time reading docs. Minimize code duplication. Multiple features from each parameter declaration. Fewer bugs. Get production-ready code. With automatic interactive documentation. Based on (and fully compatible with) the open standards for APIs: OpenAPI...
    Downloads: 25 This Week
  • 25
    Coqui TTS

    A deep learning toolkit for Text-to-Speech, battle-tested in research

    TTS is a library for advanced Text-to-Speech generation. It is built on the latest research and was designed to achieve the best trade-off among ease-of-training, speed and quality. TTS comes with pre-trained models and tools for measuring dataset quality, and is already used in 20+ languages for products and research projects. High-performance Deep Learning models for Text2Speech tasks. Text2Spec models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech). Speaker Encoder to compute speaker embeddings...
    Downloads: 29 This Week