Showing 8 open source projects for "text extractor"

View related business solutions
  • $300 Free Credits for Your Google Cloud Projects Icon
    $300 Free Credits for Your Google Cloud Projects

    Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.

    Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.
    Start Free Trial
  • Ship Agents Faster Icon
    Ship Agents Faster

    Transform your applications and workflows into powerful agentic systems at global scale.

    Gemini Enterprise Agent Platform lets you rapidly build, scale, govern and optimize production-ready agents grounded in your organization's data. The platform enables developers to build custom or pre-built agents for virtually any use case. New customers get $300 in free credits.
    Get Started Free
  • 1
    Video-subtitle-extractor

    Video-subtitle-extractor

    A GUI tool for extracting hard-coded subtitle (hardsub) from videos

    Video hard subtitle extraction, generate srt file. There is no need to apply for a third-party API, and text recognition can be implemented locally. A deep learning-based video subtitle extraction framework, including subtitle region detection and subtitle content extraction. A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files. Use local OCR recognition, no need to set up and call any API, and do not need to access online OCR services such as Baidu...
    Downloads: 45 This Week
    Last Update:
    See Project
  • 2
    Trafilatura

    Trafilatura

    Python & command-line tool to gather text on the Web

    Trafilatura is a Python package and command-line tool designed to gather text on the Web. It includes discovery, extraction and text-processing components. Its main applications are web crawling, downloads, scraping, and extraction of main texts, metadata and comments. It aims at staying handy and modular: no database is required, the output can be converted to various commonly used formats. Going from raw HTML to essential parts can alleviate many problems related to text quality, first by avoiding the noise caused by recurring elements (headers, footers, links/blogroll etc.) and second by including information such as author and date in order to make sense of the data. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    StyleTTS 2

    StyleTTS 2

    Towards Human-Level Text-to-Speech through Style Diffusion

    ...StyleTTS2 supports both single-speaker and multi-speaker configurations, with the ability to sample or transfer styles from reference audio, making it powerful for expressive TTS and character voices. The repository includes training scripts, configuration files, and pre-trained auxiliary modules such as a text aligner, pitch extractor, and PL-BERT-based linguistic encoder.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    MahaKurawa.My.ID URL Extractor

    MahaKurawa.My.ID URL Extractor

    MahaKurawa.My.ID URL Extractor is Simple Tool to extract unique URL

    MahaKurawa.My.ID URL Extractor is Simple Tool to extract unique URL from any text content in instant. It's useful when you lazy enough to identify and copy-paste URL from your content one by one by yourself.
    Downloads: 0 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 5
    DrQA

    DrQA

    Reading Wikipedia to Answer Open-Domain Questions

    DrQA is an open-domain question answering system that reads large text corpora—famously Wikipedia—to answer natural language questions with extractive spans. It follows a two-stage pipeline: a fast document retriever first narrows down candidate articles, and a neural machine reader then predicts the exact answer span from those passages. The retriever relies on classic IR features (like TF-IDF and n-gram statistics) to remain lightweight and scalable to millions of documents. The reader is...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Lioness (Languages Interop Framework)
    Framework for making Windows applications that are one .exe file in AutoHotKey_L,C++,C#, VB.NET,Java,Groovy,Common Lisp,Nemerle,Ruby,Python,PHP,Lua,Tcl,Perl,Jint,S#,WSH VBScript,HTML/JavaScript/CSS,COM, PowerShell without compiling . For .NET 4.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Java exception extractor. This utility will parse all files (either plain text or bzipped) and tries to search for various exceptions. It then tries to match exceptions against grouping rules (regexps). It is also able to group unrecognised exceptions.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Tools for extracting and transforming XML-like mark-up, embedded in source code comments, into proper external entities or well-formed XML files. Can be used for JavaDoc-like "literate programming", or embedding other build-related or CM metadata.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
Auth0 Logo