Showing 16 open source projects for "batch text processing"

  • 1
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 8 This Week
  • 2
    cpDetector is a proxy for codepage detection of documents. It delegates to multiple instances that try to detect the codepage by different techniques. A command-line executable is shipped that lets you sort documents by codepage.
    Downloads: 6 This Week
  • 3
    SEO & SEM - Marketing Text Writer

    Open Source SEO & SEM Text Creation Tools for free Article Writer

    Open source tool for search engine optimization (SEO & SEM), used for automatic content processing. These SEO content generators and article writers are based on Text Writer: https://www.artikelschreiber.com/en/ https://www.unaique.net/en/ https://www.unaique.com/ https://www.artikelschreiben.com/ https://www.buzzerstar.com/ https://googleduplicatecontentsolver.sourceforge.io/ https://inkassos.github.io/inkasso/ https://www.artikelschreiber.com/opensource/ https://www.sebastianenger.com/ https://www.artikelschreiber.com/marketing/review/ https://muckrack.com/markus-muller https://linktr.ee/textgenerator Code contains: Perl source code, language databases, and more.
    Downloads: 0 This Week
  • 4

    SE Auditor

    Free SEO audit software.

    SE Auditor is a program for analyzing web pages for search engines. You can use it to view statistical data about your website in order to improve its position in web search results. SE Auditor is aimed at SEO professionals, website designers, developers, website testers, and owners. It lets you check the meta description, keywords, sitemap, the number of links and keyword consistency, the text/HTML ratio and many more ranking /...
    Downloads: 0 This Week
  • 5
    This project aims to build a suite of Natural Language Processing tools. Modules will include corpus indexing and access tools, a part-of-speech tagger, tokenisers, text classification software, etc.
    Downloads: 0 This Week
  • 6
    Provides a robust and efficient implementation of n-gram based classifiers for Java. N-gram algorithms have been shown to be surprisingly good at tasks like guessing the language or encoding of an arbitrary text file, and there are many more applications.
    Downloads: 0 This Week
  • 7
    PDFBox is a Java PDF Library. This project will allow access to all of the components in a PDF document. More PDF manipulation features will be added as the project matures. This ships with a utility to take a PDF document and output a text file.
    Downloads: 6 This Week
  • 8
    (Almost) everything a scholar in the humanities needs (polytonic Greek fonts, stylistic and metrical analysis tools, search engines on TLG and PHI) gathered on a single Linux Live CD, ready to use anywhere, at home or at university, without installation.
    Downloads: 0 This Week
  • 9
    Prototype for a framework and user interface for combining various structured search and document clustering techniques.
    Downloads: 0 This Week
  • 10
    Fast local file search using Lucene, HTMLParser, and Highlighter. Now supports Chinese.
    Downloads: 0 This Week
  • 11
    The "Universal Content Evaluation and Categorisation Software" is a program for analysing a website’s, or more generally, a text’s content. The text is arranged in dozens of categories, permitting more efficient web searches and information processing.
    Downloads: 0 This Week
  • 12
    The DocConversion project provides a distributed document conversion solution with a well-defined API that makes use of existing conversion tools and/or a centralized conversion server. This is part of the PRONIR research at http://www.pronir.nl
    Downloads: 0 This Week
  • 13
    This code supplies miniature pedagogical Java implementations of information retrieval, spidering, and text-processing software. It was initially developed for an introductory course on Intelligent Information Retrieval and Web Search at UT Austin.
    Downloads: 1 This Week
  • 14
    BTR Wizard quickly replaces multiple occurrences of text over multiple files. This unique program scans folders for files matching filter criteria, then searches those files for any occurrences of a text string and replaces them all. This is an ideal tool fo
    Downloads: 0 This Week
  • 15
    A highlighter for XML documents, written in Java. Uses regular expressions to search a set of DOM nodes, and transparently handles highlighting matches that span multiple elements. Highlight events are passed to a user-defined highlighter for processing.
    Downloads: 0 This Week
  • 16
    A WebCrawler for Natural Language Processing. This WebCrawler searches for monolingual (in a specified language) and bilingual, parallel text.
    Downloads: 0 This Week