Showing 18 open source projects for "text batch processing tools"

  • 1

    newspaper4k

    Python library for scraping and analyzing online news articles easily

    ...Newspaper4k also includes natural language processing capabilities that can generate summaries and identify keywords from extracted article text. Newspaper4k supports both single-article extraction and full news site processing, allowing users to build sources representing entire publications and iterate through their articles. It maintains compatibility with the original project so that existing code written for newspaper3k can continue working with minimal changes.
    Downloads: 0 This Week
  • 2

    Feign

    Make writing Java http clients easier

    ...Inspired by the earlier projects Retrofit, JAXRS-2.0 and WebSocket, Feign was designed to reduce the complexity often involved in binding Denominator uniformly to HTTP APIs, regardless of their ReSTfulness. Feign works by processing annotations into a templatized request, to which arguments are applied in a straightforward manner before output. While it may only support text-based APIs, it simplifies system aspects dramatically and makes it much easier to unit test your conversions. Feign makes use of great tools like Jersey and CXF for writing Java clients for ReST or SOAP services. ...
    Downloads: 3 This Week
  • 3

    Browserless

    The headless Chrome/Chromium driver on top of Puppeteer

    Browserless is an open-source headless browser automation library and service built on top of Puppeteer that simplifies the process of running and scaling Chromium-based browser tasks in production environments. It provides a high-level API for interacting with headless Chrome, allowing developers to perform operations such as generating PDFs, capturing screenshots, extracting text or HTML, and automating web navigation. The project is designed to act as a production-ready abstraction layer...
    Downloads: 1 This Week
  • 4

    BlogHelper

    Helps user write blog content and entries

    A tray assistant that helps users in China write local articles and publish them to mainstream blog platforms (Zhihu, Jianshu, Blog Park, CSDN, SegmentFault, Nuggets, Open Source China) with one click, and upload clipboard pictures to image-hosting services (Sina, GitHub, Picture Shell, Tencent Cloud, Alibaba Cloud, Youpai Cloud, Qiniu Cloud). A small assistant that has no interface of its own and lives only in the system tray, to help more people write better! One-click publishing of local articles to Zhihu,...
    Downloads: 0 This Week
  • 5

    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest...
    Downloads: 200 This Week
  • 6
    Command-line/Ant-task/embeddable text file preprocessor. Macros, flow control, expressions. Recursive directory processing. Extensible in Java to display data from any data source (such as a database). Can generate complete homepages (trees of HTML files, images, etc.).
    Downloads: 108 This Week
  • 7

    SEO & SEM - Marketing Text Writer

    Open Source SEO & SEM Text Creation Tools for free Article Writer

    Open Source Tool for Search Engine Optimization (SEO & SEM) used for automatic content processing. These SEO Content Generators and Article Writers based on Text...
    Downloads: 0 This Week
  • 8
    This project aims to build a suite of Natural Language Processing tools. Modules will include corpus indexing and access tools, a part-of-speech tagger, tokenisers, text classification software, etc.
    Downloads: 0 This Week
  • 9
    Plugins for Firefox and Google Chrome that automate use of the „Typograf“ service hosted at http://www.artlebedev.ru/tools/typograf/. The plugin takes text from any text area in Firefox and processes it according to typographic rules (e.g. inserts typ
    Downloads: 0 This Week
  • 10
    XVCL is a general-purpose language for configuring variants in all sorts of textual documents (including programs). It is based on frame technology. The XVCL processor automates the customization process to produce a system from a specification of variants.
    Downloads: 0 This Week
  • 11
    hypKNOWsys aims at developing a Java-based workbench for knowledge discovery and knowledge management. Currently, hypKNOWsys has released two intermediate tools: DIAsDEM Workbench (text mining for semantic tagging) and WUMprep (Web mining pre-processing)
    Downloads: 0 This Week
  • 12
    The objective of the OpenBerg Project is to develop Open-Source, Open-Standards-based, Multi-Platform tools for eBook authors, editors and users. We are currently working on OpenBerg Lector, an e-Book reader, and OpenBerg Rector, an e-Book compiler.
    Downloads: 0 This Week
  • 13
    Collection of tools for inputting, reading, processing, and typesetting the Taiwanese language. Includes SCIM and quail input methods, a Firefox dictionary plugin, plus scripts for LaTeX and HTML generation.
    Downloads: 0 This Week
  • 14
    (Almost) everything a scholar in the Humanities needs (polytonic Greek fonts, stylistic and metrical analysis tools, search engines for TLG and PHI) gathered on a single Linux Live CD, ready to use anywhere, at home or at university, without installation.
    Downloads: 0 This Week
  • 15
    pdfplayground is a collection of tools written in Python to read and write PDF files.
    Downloads: 0 This Week
  • 16
    tgen generates a website from a collection of input files of various types, using a set of registered HTML autogenerators. Cvs-Brancher allows scheduling of web deployments. vwebedit provides web-based editing of CVS repositories.
    Downloads: 0 This Week
  • 17
    Project to create a unified FAQ XML format, together with the software needed to convert it to various formats, such as multiple forms of HTML, TeX, PDF, text files, etc. Useful for most "FAQ keepers" on various forums and discussion lists.
    Downloads: 0 This Week
  • 18
    The DocConversion project provides a distributed document conversion solution with a well-defined API, which makes use of existing conversion tools and/or a centralized conversion server. This is part of the PRONIR research at http://www.pronir.nl
    Downloads: 0 This Week